9

This is a question that I've always wondered about in statistics, but never had the guts to ask the professor. The professor would say that if the p-value is less than or equal to the level of significance (denoted by $\alpha$), we reject the null hypothesis because the test statistic falls in the rejection region. When I first learned this, I did not understand why we were comparing the p-values to the alpha values. After all, the alpha values were brought in arbitrarily. What is the reason for comparing them to the alpha values, and where do the alpha values of $0.05$ and $0.10$ come from? Why does the statement $p_\text{value} \leq \alpha$ allow you to reject $H_0$?

StubbornAtom
  • 17,932

7 Answers

5

Here's the idea: you have a hypothesis you want to test about a given population. How do you test it? You take data from a random sample, and then you determine how likely it is that a population satisfying the assumed hypothesis, with an assumed distribution, would produce such data. You decide: if this data has a probability of less than, say, $5\%$ of coming from this population, then you reject at this level; the complementary $95\%$ is your confidence level. How do you decide how likely it is for the data to come from a given population? You use an assumed distribution for the data, together with any parameters of the population that you may know.

A concrete example: you want to test the claim that the average adult male weight is $170$ lbs. You know that adult weight is normally distributed with standard deviation, say, $10$ pounds. You say: I will accept this hypothesis if the sample data I get comes from this population with probability at least $95\%$. How do you decide how likely the sample data is? You use the fact that the data is normally distributed with (population) standard deviation $10$, and you assume the mean is $170$. How do you determine how likely it is for the sample data to come from this population? You compute its $z$-value (since this is a normally distributed variable), and a table then gives you the probability.

So, say the average of the random sample of adult male weights is $188$ lbs. Do you accept the claim that the population mean is $170$? Well, the decision comes down to: how likely (how probable) is it that a normally distributed variable with mean $170$ and standard deviation $10$ would produce a sample value of $188$ lbs? Since you have the necessary values for the distribution, you can test how likely this value of $188$ is in a population $N(170, 10)$ by finding its $z$-value. If this $z$-value exceeds the critical value, then the value you obtained is less likely than you are willing to accept, and you reject. Otherwise, you accept.
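Here is a minimal sketch of this example in Python (my own addition, not part of the original answer), taking the setup above literally: no sample size is stated, so the population standard deviation is used directly, and a two-sided test at the conventional $5\%$ level is assumed. It shows that comparing the $z$-value to the critical value and comparing the $p$-value to $\alpha$ are one and the same decision:

```python
from statistics import NormalDist

# Setup taken literally from the example above: H0 says mu = 170,
# the population sd is 10, and the observed sample average is 188.
# (No sample size is given, so the population sd is used directly.)
mu0, sigma, observed = 170, 10, 188

z = (observed - mu0) / sigma                   # z = 1.8
p = 2 * (1 - NormalDist().cdf(abs(z)))         # two-sided p ~ 0.0719

alpha = 0.05
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # ~1.96 at the 5% level

print(f"z = {z:.2f}, critical z = {z_crit:.2f}, p = {p:.4f}")
print("reject H0" if p <= alpha else "fail to reject H0")
```

With these numbers, $|z| = 1.8 < 1.96$ and $p \approx 0.072 > 0.05$, so the two comparisons agree: we fail to reject at the $5\%$ level.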

user99680
  • 6,836
  • 16
  • 25
3

Think of it this way: in hypothesis testing, the question we always want to answer is "Is this phenomenon that I've measured really there, or is the data suggesting it just a coincidence?" Of course it's never possible to completely rule out coincidence; the best you can ever hope for is to say "This is probably not a coincidence, because the chance of something like this happening just by chance is less than ______."

When you choose a significance level, you're filling in the blank space in the previous sentence. You are deciding just how unlikely a coincidence needs to be before you are willing to decide that there is really something going on.

mweiss
  • 24,547
3

You can reject whatever you want. Sometimes you will be wrong to do so, and at other times you will be wrong when you fail to reject.

But if your aim is to keep Type I errors (rejecting the null hypothesis when it is true) below a certain proportion of the time, then you need something like an $\alpha$. Given that approach, if you also want to minimise Type II errors (failing to reject the null hypothesis when it is false), then you should reject when you have extreme values of the test statistic, as shown by the $p$-value, which are suggestive of the alternative hypothesis.
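To see the Type I error point concretely, here is a small simulation sketch (my own construction, under an assumed setting where $H_0: \mu = 0$ is true with known $\sigma = 1$): if we reject whenever $p \leq \alpha$, we end up wrongly rejecting in roughly an $\alpha$ fraction of experiments.

```python
from statistics import NormalDist
import random

# Simulate many experiments in which H0 (mu = 0, sigma = 1) is TRUE
# and apply the rule "reject when p <= alpha". The fraction of
# (mistaken) rejections should come out close to alpha.
random.seed(1)
alpha, n, trials = 0.05, 30, 20_000

rejections = 0
for _ in range(trials):
    sample = [random.gauss(0, 1) for _ in range(n)]
    mean = sum(sample) / n
    z = mean / (1 / n ** 0.5)                # z = mean / (sigma/sqrt(n))
    p = 2 * (1 - NormalDist().cdf(abs(z)))   # two-sided p-value
    rejections += p <= alpha

print(rejections / trials)  # ~0.05: the Type I error rate matches alpha
```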

As you say, $0.05$ is an arbitrary number. It comes from R. A. Fisher, who initially thought that two standard deviations was a reasonable approach, then noted that for a two-sided test with a normal distribution this gives $\alpha \approx 0.0455$, and decided to round it to $0.05$.
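You can verify that figure directly; this snippet (mine, not part of the answer) computes the two-sided tail probability beyond two standard deviations of a normal distribution:

```python
from statistics import NormalDist

# P(|Z| > 2) for a standard normal Z: the two-sided tail beyond 2 sd.
alpha_two_sd = 2 * (1 - NormalDist().cdf(2))
print(round(alpha_two_sd, 4))  # 0.0455, which Fisher rounded to 0.05
```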

Henry
  • 169,616
2

A small $p$-value means: assuming $H_0$ is true, it would be extremely hard to obtain the observed result from our sample. That leaves two possibilities: (1) the null hypothesis $H_0$ is false, or (2) our sample was not drawn from the null population. Either way, we reject $H_0$. And we quantify "small" with the significance level $\alpha$. That's why, when $p \leq \alpha$, we reject $H_0$.

1

This is a great question. The bottom line is that the $0.05$ and $0.10$ cutoffs are arbitrary, determined by custom and generally accepted practice. They're simply benchmarks that indicate when we can have confidence in rejecting the null hypothesis that the coefficient or test statistic is actually zero, i.e., in concluding that there most probably IS an effect or difference, or whatever you're testing for. They are arbitrary in the sense that the cutoffs don't have any natural meaning: treating a p-value of $0.049$ as "significant" while a p-value of $0.051$ is "insignificant" is silly. There's no magical transformation that happens at the $0.05$ mark.

One of the top journals in my field has banished asterisks that designate significance and just reports standard errors.

Gina
  • 11
0

The $p$-value that we calculate corresponds to the alternative hypothesis: the model that we build is the one catering to the alternative hypothesis. A cutoff of $0.05$ means we're okay with this (alternative-hypothesis) model being wrong $5\%$ of the time.

Therefore, each time the $p$-value exceeds $0.05$, we deem the model unfit and accept the null hypothesis (i.e., the assumed claim about the population).

And each time the $p$-value falls below $0.05$, we say: "okay, this model works $95\%$ or more of the time, so we reject the assumed claim and give full merit to the alternative hypothesis we just made."

Hope this helps!

-1

Let's consider two significance levels ($\alpha = 0.05$ and $\alpha = 0.01$) for $p = 0.03$.

$p = 0.03$: there is a $3\%$ chance of observing such an effect solely due to sampling error, when there is actually none.

$\alpha = 0.05$: we are willing to be up to $5\%$ confident that any observed effect is due to random sampling (or $95\%$ confident that it is not!).

Since there is only a $3\%$ chance of observing the effect due to sampling error, we cannot be even $5\%$ confident that the observed effect is due to random sampling, so we cannot support $H_0$ at this level. Therefore, we reject $H_0$ at the $5\%$ level.

Following the same logic, at $\alpha = 0.01$ we can be $1\%$ confident that the observed effect is random; since $3\% > 1\%$, the effect can still be attributed to random sampling and $H_0$ held true. Therefore, we do not reject $H_0$ at the $1\%$ level.
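Mechanically, the decision rule applied in both cases is just the comparison $p \leq \alpha$; a minimal sketch:

```python
# The bare decision rule used above: reject H0 exactly when p <= alpha.
p = 0.03
for alpha in (0.05, 0.01):
    decision = "reject H0" if p <= alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
# alpha = 0.05: reject H0
# alpha = 0.01: do not reject H0
```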

I hope this gives the reader a better understanding of why $H_0$ is rejected when $p \leq \alpha$.

I had the same question, and this post motivated me to write a short piece about it; the example above is taken from there.

Shoresh
  • 99