Questions tagged [experiments]

An experiment is a procedure carried out to support, refute, or validate a hypothesis. Experiments provide insight into cause-and-effect by demonstrating what outcome occurs when a particular factor is manipulated.

An experiment usually tests a hypothesis, which is an expectation about how a particular process or phenomenon works. However, an experiment may also aim to answer a "what-if" question, without a specific expectation about what the experiment will reveal, or to confirm prior results. If an experiment is carefully conducted, the results usually either support or disprove the hypothesis. According to some philosophies of science, an experiment can never "prove" a hypothesis; it can only add support. On the other hand, an experiment that provides a counterexample can disprove a theory or hypothesis, although a theory can always be salvaged by appropriate ad hoc modifications at the expense of simplicity. An experiment must also control for possible confounding factors, that is, any factors that would mar the accuracy or repeatability of the experiment or the ability to interpret the results. Confounding is commonly eliminated through scientific controls and/or, in randomized experiments, through random assignment.
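
As a rough illustration of the random assignment mentioned above, here is a minimal Python sketch. The population, the `age` covariate, and the `randomly_assign` helper are all hypothetical and exist only for this example. The idea it tries to show is that when units are shuffled before being split, a covariate that could act as a confounder tends to end up with a similar distribution in both groups.

```python
import random
import statistics


def randomly_assign(units, seed=0):
    """Shuffle the units and split them into two halves (treatment, control)."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical population: "age" stands in for any potential confounder.
population_rng = random.Random(42)
population = [{"id": i, "age": population_rng.gauss(40, 10)} for i in range(10_000)]

treatment, control = randomly_assign(population, seed=1)

# With random assignment, the covariate means should be close in both groups.
mean_treatment = statistics.mean(u["age"] for u in treatment)
mean_control = statistics.mean(u["age"] for u in control)
print(f"mean age, treatment: {mean_treatment:.2f}")
print(f"mean age, control:   {mean_control:.2f}")
```

Comparing covariate means like this is only a sanity check on one observed variable; the appeal of randomization is that it also balances unobserved factors in expectation.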

39 questions
12
votes
2 answers

Book keeping of experiment runs and results

I am a hands-on researcher and I like testing out viable solutions, so I tend to run a lot of experiments. For example, if I am calculating a similarity score between documents, I might want to try out many measures. In fact, for each measure I…
10
votes
4 answers

How to debug data analysis?

I've come across the following problem, which I reckon is rather typical. I have some large data, say, a few million rows. I run some non-trivial analysis on it, e.g. an SQL query consisting of several sub-queries. I get some result, stating, for…
8
votes
2 answers

A/B testing: How to calculate p-value on post test segments?

My question on A/B testing is about doing post-test segmentation analysis. For example: I run an A/B test on my website to track bounce rate. On the treatment group, I put a video to explain my company. On the control group I put just plain…
jxn
5
votes
1 answer

experimental design in R project

I want to know of any repositories that contain a complete experimental design in R covering basic tests and analyses. I want to take a top-to-bottom approach to learn, step by step through a real project, how that works. Do you know of any places to find…
user18602524
5
votes
3 answers

Can we use difference-in-differences with a biased A/B test?

We noticed we had a biased sample in our A/B test and were wondering if difference-in-differences would help us make valid conclusions about the data, or if there was another way to proceed. We ran a new experiment on our site, where we offered 50%…
Huey
5
votes
2 answers

AB testing: When AA testing doesn't work

After 6 months of AB testing on our CRM tool (Oracle Responsys, but this could be true with any tool), the test exhibited some weird results, so we decided to pause everything and do some good old AA testing. AA testing consists of dividing…
WNG
5
votes
3 answers

What are the methods to ensure that the population split for A/B test is random?

Before launching an A/B test, what are the methods to ensure that the population split into control and target groups is random for a particular label, say, purchase rate?
vick
4
votes
2 answers

Recommendations for storing time series data

As part of my thesis I've done some experiments that have resulted in a reasonable amount of time-series data (motion-capture + eye movements). I have a way of storing and organizing all of this data, but it's made me wonder whether there are best…
lmjohns3
4
votes
1 answer

Timing sequence in MapReduce

I'm running a test on a MapReduce algorithm in different environments, like Hadoop and MongoDB, and using different types of data. What are the different methods or techniques to find out the execution time of a query? If I'm inserting a huge amount…
syed
3
votes
2 answers

How do I learn experimental methodology? When is it relevant?

I just graduated in Computer Science, with a very theoretical background but without any kind of Data Science or Artificial Intelligence experience, and I am working on my own to discover those two fields. More precisely, I try to work on a toy…
3
votes
1 answer

Attributing causality to single quasi-independent variable

Apologies if this isn't the correct place to ask - I'm not sure if this fits best with Stats or Data Science. I'm using analytics to help marketers identify which attributes of their users correspond to successful conversions (such as someone buying a…
3
votes
1 answer

Estimating Variance Reduction Resultant from Additional Data

I couldn't quite think of how best to title this, so recommendations are welcome. Same goes for the tags (I don't have the reputation to use the tags that I thought were appropriate). The question is this: "Suppose you have N pairs of observations,…
Lt. Surge
3
votes
1 answer

Taxonomy of train-test split approaches

I am looking for something as close as possible to an exhaustive taxonomy of train-test split approaches. For example, the 3 main splits that come to mind are: A non-time based problem - would lead you to a random, maybe stratified train-test split. A…
GooJ
2
votes
1 answer

Feature selection for two separate datasets

Currently, I'm doing research with experimental data. The data comes from two experiments with two slightly different tasks, but with the same setup in a VR environment. Both experiments were done with different populations but with the same two groups…
2
votes
1 answer

Structuring experiment/training data with months in mind

We're using a whole year's data to predict a certain target variable. The model pipeline is: data - OneHot encoding the categorical variables - MinMaxScaler - PCA (to choose a subset of 2000 components out of the 15k) - MLPRegressor. When we're doing a…
lte__