3

I want to check whether two probability distributions (experimental and theoretical) are the same. The distributions are not normal, so I decided to use the Kolmogorov-Smirnov (KS) test. I used the MATLAB function kstest2 and got a p-value of 1! This means I can't reject the null hypothesis that the two distributions are the same. I have two main concerns:

  1. Does this mean I can accept the null hypothesis? I'm confused by the statement 'fail to reject the null hypothesis'.
  2. What is the p-value for the hypothesis that the distributions are the same? Can I calculate it as $1-p$? I'm interested in testing whether my theory is correct and want to give a p-value for that.

https://se.mathworks.com/help/stats/kstest2.html
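(The MATLAB call isn't shown in the question. For reference, here is an analogous sketch in Python, where scipy's ks_2samp plays the role of MATLAB's kstest2; this is an illustration, not the asker's actual code. One common way to obtain the suspicious p = 1 is to compare a sample with itself.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(size=200)   # "experimental" sample
b = rng.normal(size=200)   # second sample from the same distribution

# Two-sample KS test: p is typically spread over (0, 1], not stuck at 1.
stat, p = stats.ks_2samp(a, b)
print(stat, p)

# Comparing a sample with itself gives KS statistic exactly 0,
# i.e. the data are as consistent with the null as they can possibly be,
# and the p-value is exactly 1.
stat_same, p_same = stats.ks_2samp(a, a)
print(stat_same, p_same)   # statistic 0.0, p-value 1.0
```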

StubbornAtom
cat's eye
  • Why do you use kstest2, which is for two-sample problems? You want the one-sample form h = kstest(x,Name,Value). – kimchi lover Aug 21 '20 at 13:47
  • I have two samples, and I want to test whether they come from the same distribution. I'm not interested in testing whether they are standard normal. kstest2 is more suitable for me. – cat's eye Aug 21 '20 at 13:50
  • "I want to check if two probability distributions (experimental and theoretical) are same" means you have a one-sample problem. Your misreading of the textbook is the explanation of your absurd $p=1$ result; your musing that $p$ means $1-p$ is proof that you don't really know how to use this tool. – kimchi lover Aug 21 '20 at 14:00
  • I don't really understand how to use the tool; isn't that obvious, since I asked a question here? I have two distributions with different means and standard deviations. I want to test whether they are similar. Why is it not a two-sample test? It would be helpful if you elaborated on your comment or provided useful links, as I'm looking for help and not trying to show how knowledgeable I am. – cat's eye Aug 21 '20 at 14:12
  • In clearing up (i) whether it's one or two samples & (ii) why you got a p-value of 1, it may be worth including the code you used in a code block. (Feel free to replace a dataset if it's too large or confidential.) – J.G. Aug 21 '20 at 15:13

3 Answers

2

Let's review some basics on which you may be confused.

In a p-value test we have a hypothesis, called the null hypothesis, under which probabilities are computable; we then use a p-value to quantify how well the hypothesis "fits" some data. The p-value is the probability, conditional on the null hypothesis, that the data would be at least as surprising, relative to the expectations of that hypothesis, as it in fact was. (In saying that, I'm glossing over the difference between one- and two-tailed tests; in a one-tailed test, the p-value is the probability that the data would be at least this surprising in the direction in which it is surprising.)

In this example, the null hypothesis is that the distributions are the same, so $p$ is already the p-value for that hypothesis. The only event that we know has probability $1-p$ is that the data would be less "surprising", again conditional on the null hypothesis. We certainly can't do another test in which the role of null hypothesis switches to the opposite of what it was before; "the distributions differ" doesn't allow us to calculate p-values.

I think that answers your second question. As for the first, the reason we talk about "failing to reject" the null hypothesis is that you can't prove it; you can only disprove it, or be impressed that it survived the effort. As for what you can do in this example, I suggest you double-check a p-value of 1. Such a p-value means the data is as consistent with the distributions being the same as it could possibly be. With data drawn from a continuous distribution, this is suspicious.
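Why a p-value of exactly 1 is suspicious can be seen by simulation (a sketch in Python with scipy rather than MATLAB): when the null hypothesis is true and the data are continuous, p-values are spread roughly uniformly over (0, 1], so a value essentially equal to 1 is about as rare as one essentially equal to 0.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pvals = []
for _ in range(1000):
    # Both samples genuinely come from the same continuous distribution,
    # so the null hypothesis is true in every repetition.
    a = rng.normal(size=50)
    b = rng.normal(size=50)
    pvals.append(stats.ks_2samp(a, b).pvalue)
pvals = np.array(pvals)

# Under the null, p-values are roughly uniform on (0, 1]:
# p >= 0.99 should occur only a few percent of the time, not every time.
print(pvals.mean(), np.mean(pvals >= 0.99))
```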

J.G.
2

If I'm understanding your question correctly, you have one distribution given by a formula (theoretical) and another given by data (experimental). Only one of your distributions comes from a sample, the experimental one, so you should be using the $\textit{one-sample}$ K-S test. This test is designed for exactly what you have in mind (i.e., determining whether the underlying distribution of your experimental data is the theoretical one you have).

The two-sample test is for determining whether two experimental distributions share the same underlying theoretical distribution.
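In scipy terms (an assumed Python analogue of MATLAB's kstest and kstest2, used here only for illustration), the distinction looks like this:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
experimental = rng.normal(loc=0.0, scale=1.0, size=500)

# One-sample KS: compare a sample against a *theoretical* CDF.
# This matches "experimental data vs. a distribution given by a formula".
stat1, p1 = stats.kstest(experimental, stats.norm.cdf)

# Two-sample KS: compare two *samples*, for when the theoretical
# distribution is unknown and you only have data from both sources.
other = rng.normal(size=500)
stat2, p2 = stats.ks_2samp(experimental, other)
print(p1, p2)
```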

Now on to $p$-values. I don't like the whole "null-hypothesis" language, as I think it's overly confusing. The thing to get used to in statistics is that there is no absolute notion of true vs. false when it comes to experimental data. It's all about degrees of confidence.

So take, for example, the case of flipping a fair coin. The theoretical distribution is a discrete distribution with heads and tails each having probability $\frac{1}{2}$. If I were to flip a coin and get 100 heads in a row, what would that mean? Does it mean that my coin isn't fair?

No, it only means that it is very unlikely to be fair. I suggest you try to work out the K-S test (one sample) for this example; it is very illuminating.
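Since the K-S test is really designed for continuous distributions, the numerical point for the coin is easiest to see with a direct binomial calculation (a Python/scipy sketch, used here in place of the K-S machinery):

```python
from scipy import stats

# Probability of 100 heads in 100 flips of a fair coin.
prob = 0.5 ** 100
print(prob)   # about 7.9e-31

# Two-sided binomial test of "the coin is fair" given 100 heads out of 100.
# The p-value is astronomically small: a fair coin almost never does this,
# so we would reject fairness at any reasonable threshold.
p = stats.binomtest(100, n=100, p=0.5).pvalue
print(p)
```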

Finally, if I told you to make a decision on whether the coin was fair in this case, you would probably say no. This is where the $p$-value comes in: it quantifies how surprising the data are under the assumed distribution, and you compare it against a threshold (the significance level) that says how much confidence you need before drawing conclusions from the data. There is no preferred value set in stone; it depends on the application.

  • What is the difference in the data when it comes from experiment versus theory? Both are matrices with two columns: the variable value X and its corresponding probability value P(X). – cat's eye Aug 24 '20 at 08:43
0

You haven't told us much, in particular what sample size you used. If you make a quick, superficial search for something and don't find it, that doesn't prove it's not there. Likewise, if a very small sample is used and your test fails to reject the null hypothesis, that doesn't mean the null hypothesis is true; it just means you haven't looked very hard for whatever evidence against it may be there.
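A small simulation (a Python/scipy sketch, since the original MATLAB code isn't shown) makes this concrete: even when the null hypothesis is genuinely false, a tiny sample usually fails to reject it, while a large one rejects it easily.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)

# The null hypothesis is genuinely false here: the data come from
# N(0.3, 1), but we test them against the standard normal N(0, 1).
small = rng.normal(loc=0.3, size=20)
large = rng.normal(loc=0.3, size=5000)

# One-sample KS test against the standard normal CDF.
p_small = stats.kstest(small, stats.norm.cdf).pvalue
p_large = stats.kstest(large, stats.norm.cdf).pvalue
print(p_small, p_large)
# With n = 20 the shift usually goes undetected (large p-value);
# with n = 5000 it is detected with overwhelming evidence (tiny p-value).
```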