18

Forgive my ignorance, I am brand new to Poisson and statistics in general. $$ \bbox[5px,background:black]{\color{white}{\begin{array}{l} \text{Poisson Distribution}\\ P(X=k)=\frac{\lambda^ke^{-\lambda}}{k!}\\ k\text{ is the given number of event occurrences}\\ \lambda\text{ is the average rate of event occurrences} \end{array}}} $$

My statistics class has this formula for figuring out the probability of a less-common event happening given the average rate of occurrence.

The example is "An intersection has, on average, 15.5 accidents per week. Using the Poisson Distribution formula, determine the probability of the intersection having only 1 accident in a week."

Ok, easy, just plug and chug and you get your answer.
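For reference, the plug-in step can be sketched in a few lines of Python. This is only a minimal illustration of the formula quoted above; the function name `poisson_pmf` is made up for this sketch, and it assumes the weekly count really is Poisson with $\lambda = 15.5$.

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Poisson(lam), straight from the formula above."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 15.5  # average accidents per week, taken from the example
p_one = poisson_pmf(1, lam)  # probability of exactly 1 accident in a week

print(f"P(X = 1) = {p_one:.3e}")
```

The result is tiny but not zero, which is exactly the behaviour the question below is asking about.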

Here is my actual question: How could a formula possibly answer this? What if the average was 15.5 but always ranges from 10 to 20 per week? What if 100 years went by and every week there was somewhere between 10 and 20 accidents, giving us an average of 15.5 but NEVER having just a single accident? Or what if the average was 15.5 because it is always either 15 or 16 per week? The point being, there may be a near-zero probability of 1 or 14 (or any value of 'k') happening, yet the formula just guesses its little heart out having no clue that it's basically a 0% chance (and yes, I know that probabilities are all based on information we have and don't have, I get that, just a way for us to make a best guess using the information we have.)

This is the simplest example I can come up with to explain why I am confused about how this could work. It seems like to get anything even slightly meaningful you would have to have at least some information regarding the max and min or something to give you at least some sort of range, right?

Thank you in advance! I am decent at math but I am not a mathematician or anything like that. Just a CS student with a lot more math to learn haha. But I am very curious, and this is bothering me.

robjohn
  • 353,833
Ryan
  • 291
  • 4
    Have you studied binomial distribution before this? Did that make sense to you? – Nothing special Sep 02 '23 at 20:30
  • Only a small amount of study. It was the lecture like a day or two ago but this school does not go very far in depth on topics. What small amount we studied, though, seemed very clear to me. – Ryan Sep 02 '23 at 20:32
  • 8
    You are correct: if all you know is the average, you can't know that it is a Poisson process. There are many other possibilities. Poisson is very good analytically and, if the event is truly rare, subtleties in the statistics may not matter much. – lulu Sep 02 '23 at 20:33
  • A formula alone predicts nothing. You need to know its parameters. In fact: Poisson describes rare events and it is not easy to find the correct $\lambda$ in a real life situation. – Kurt G. Sep 02 '23 at 20:34
  • As a practical matter, one ought to have real data. Given that, you can test to see if your model fits the observed results. – lulu Sep 02 '23 at 20:34
  • 5
    The Poisson distribution (or process) makes very specific assumptions (independence of arrivals in disjoint time intervals, negligible chance of more than $1$ arrival in sufficiently small time interval, etc). Outside of those assumptions, the distribution may not be as good of a model, just like how the exponential distribution would be a poor model for anything that can be positive and negative. Whether a distribution is a good model for data is a statistical question that requires estimation, hypothesis testing. First we must get familiar with distributions and the chances they give. – Nap D. Lover Sep 02 '23 at 20:37
  • @NapD.Lover and lulu, Ok that makes sense (bit above my level but I think I get it). So is my main problem here more due to the oversimplification of the problem that the school gave us? They literally just give the average accidents and desired accidents. Would a more specific problem with more information make this less of an issue? What I am getting out of these answers is that, with the limited data given in the question, we have no clue whether or not a Poisson distribution even applies here. – Ryan Sep 02 '23 at 20:42
  • Oddly, more information tends to make things worse. Real world data is messy and noisy and seldom falls neatly in line with any of the analytically pleasant distributions you are studying. The theoretical distributions are much nicer and easier to work with. For your class, you should be told what assumptions to make in which contexts. Even so, be sure to clearly state your assumptions. – lulu Sep 02 '23 at 20:50
  • @lulu I suppose I should be glad they gave me such a simple problem. But, it did little to help me understand the material. The lesson says nothing about any assumptions (which is what bothered me to begin with haha). So, for the example question I gave, how would you communicate the assumptions? What assumptions do we make here? With the limited info in the question, '1 accident' could be a monthly occurrence, or something that happens once every hundred million years, right? No limits? How can we word assumptions to limit this range to something reasonable? Or am I overthinking this lol. – Ryan Sep 02 '23 at 20:56
  • 8
    I'd say, the "assumption" is that you are to model low frequency events with a Poisson distribution. Granted, they should be clearer about the fact that this is an assumption, not some kind of theorem or physical inevitability. – lulu Sep 02 '23 at 20:58
  • 7
    +1 for a good question. I don't understand the two close votes. – Ethan Bolker Sep 02 '23 at 21:03
  • 1
    People are voting against my question lol? Well, I am learning a lot from your answers, and I think I am close to understanding the concept now, so if this platform is here to help people learn and understand mathematical concepts then I would say it was a success. – Ryan Sep 02 '23 at 21:09
  • 1
    Who the hell keeps using this intersection that averages more than 2 accidents a day. – Cliff AB Sep 03 '23 at 18:12
  • 1
    I converted the image to $\LaTeX$. I hope you don't mind. – robjohn Sep 03 '23 at 22:51
  • @Ryan fwiw the probability that you get 1,2, or 3 accidents is essentially 0, ie close to 'never' . See here for a visualization of that pdf https://homepage.divms.uiowa.edu/~mbognar/applets/pois.html – Georg M. Goerg Sep 04 '23 at 13:51
  • You are right that the actual accident distribution may not correspond to the Poisson distribution. But that is what you are instructed to use; presumably, those of higher paygrade have determined that Poisson is a good fit to the data. It's not for you to question. – richard1941 Sep 07 '23 at 01:23

2 Answers

19

Good question! We need to make a modeling assumption here, namely that accidents occur independently of each other; that is, we assume that in some small interval of time $\Delta t$ (say an hour, for the purposes of this problem) there's some small probability $p$ that an accident will happen, and whether an accident occurs in any given small interval is independent of any other.

With this assumption (which is very strong), the number of accidents that occur in a large interval of time is a binomial distribution. Specifically, if we fix the average number $\lambda$ of accidents and divide the week into $N$ small time intervals where the probability that an accident occurs is $\frac{\lambda}{N}$ (this is necessary for the average to be $\lambda$), then the distribution of accidents is roughly $\text{Bin}\left( N, \frac{\lambda}{N} \right)$.
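The setup in this paragraph can be simulated directly. The following is a rough sketch, not a claim about real traffic: it assumes $N = 168$ hourly intervals, at most one accident per interval, and full independence between intervals. The function name `simulate_week`, the seed, and the number of simulated weeks are arbitrary choices made for illustration.

```python
import random

def simulate_week(lam: float, n_intervals: int, rng: random.Random) -> int:
    """Count accidents in one week: each interval independently has
    an accident with probability lam / n_intervals."""
    p = lam / n_intervals
    return sum(rng.random() < p for _ in range(n_intervals))

rng = random.Random(0)   # fixed seed so the run is repeatable
lam = 15.5               # average accidents per week, from the example
n_intervals = 168        # hours in a week
n_weeks = 2000           # number of simulated weeks (arbitrary)

counts = [simulate_week(lam, n_intervals, rng) for _ in range(n_weeks)]
mean_count = sum(counts) / n_weeks

print(f"mean = {mean_count:.2f}, min = {min(counts)}, max = {max(counts)}")
```

Under independence nothing pins the weekly count to a band like 10 to 20: the simulated counts average close to 15.5 but spread out on both sides of it.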

Now the interesting mathematical fact is that as $N \to \infty$ this binomial distribution converges (quite rapidly) to a Poisson distribution $\text{Pois}(\lambda)$; this is the simplest special case of the Poisson limit theorem. In terms of the probability that there will be $k$ accidents this says that

$$\lim_{N \to \infty} {N \choose k} \left( \frac{\lambda}{N} \right)^k \left( 1 - \frac{\lambda}{N} \right)^{N-k} = \frac{\lambda^k}{k!} e^{-\lambda}$$

which you can check by writing ${N \choose k} = \frac{N(N-1) \dots (N-k+1)}{k!}$ and using the fact that $\lim_{N \to \infty} \left( 1 + \frac{x}{N} \right)^N = e^x$. But this limit calculation doesn't really show you how fast the convergence is; you can check that things are already quite close for, say, $N = 168$ (the number of hours in a week).
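The closeness at $N = 168$ can be checked numerically. Here is a small sketch that compares the $\text{Bin}\left(N, \frac{\lambda}{N}\right)$ probabilities with the $\text{Pois}(\lambda)$ probabilities for $k = 0, \dots, 40$; the larger values of $N$ (minutes and seconds in a week) and the cutoff at $k = 40$ are choices made here for illustration only.

```python
import math

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

lam = 15.5
gaps = {}
# 168 hours, 10,080 minutes, 604,800 seconds in a week
for n in (168, 10_080, 604_800):
    # largest pointwise difference between the two pmfs over k = 0..40
    gaps[n] = max(
        abs(binom_pmf(k, n, lam / n) - poisson_pmf(k, lam))
        for k in range(41)
    )
    print(f"N = {n:>7}: max |Bin - Pois| = {gaps[n]:.2e}")
```

The largest gap shrinks as $N$ grows, which is the convergence the limit describes.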

Qiaochu Yuan
  • 468,795
  • 1
    Ok, thank you! Also a bit above my level. Haven't taken calc yet (my next class), but I do understand the basic idea of limits and convergence to a limit. Binomial distribution made sense to me yesterday (the VERY basic level we went into, unfortunately), so I can understand what you mean about dividing our average lambda down into binomial sections to get the probability of each individual accident happening during a given time. I think this is getting me in the correct direction, but I will need to ponder all these answers a bit more until it clicks. – Ryan Sep 02 '23 at 20:47
  • 3
    @Ryan: it's worth taking your time, this is a lot to wrap your head around. In any case to more directly answer your question "why couldn't there always be between $10$ and $20$ accidents," the answer is that this isn't compatible with assuming independence. If you have $20$ accidents what's to stop you from having $21$ accidents? (Here we need to assume that enough time has passed that the accident has been dealt with, so the intersection is just normal afterwards, and there are new cars coming in all the time. So the short time interval $\Delta t$ can't be too short here.) – Qiaochu Yuan Sep 02 '23 at 20:58
  • 6
    Ok, I think this gave me a small, but new insight; me saying "it's ALWAYS 10 to 20, but NEVER 1" is messing with the independence, specifically because each individual occurrence (or trial) MUST have the same probability (at the individual level), so just me saying "10 to 20" doesn't make sense, because that would imply that there is some outside force that is stopping it from being 1-9 or 20-30, for example... Something was forcing 10 to 20 to have higher odds of happening, destroying the independence requirement for it to be a binomial distribution. Am I on the right track here? – Ryan Sep 02 '23 at 21:05
  • 4
    Yep, that's the idea. – Qiaochu Yuan Sep 02 '23 at 22:03
  • 1
    Thank you for explaining that! Everyone's words were helpful, but that last comment you made really made me realize what I was missing. – Ryan Sep 02 '23 at 22:07
  • @QiaochuYuan great explanation. I'm wondering, though, why you say that assuming independence here is a "very strong" assumption? In general, maybe, but for this particular problem of traffic accidents at an intersection that seems like a very reasonable/realistic assumption. (The only thing I can think of now is that drivers might get distracted by an accident and have one themselves ... Not sure how much more often that really happens compared to independence.) Did you have other concrete reasons in mind for why they would be dependent? – Georg M. Goerg Sep 04 '23 at 13:40
  • @GeorgM.Goerg Independence is a strong assumption precisely for the fact that it's likely not true for real situation. For example, assuming that traffic accidents happen independently on an intersection in a real world can't really be true because: 1) A big accident can induce more accidents to happen simultaneously, or, on the other hand 2) An accident would cause the traffic around the area to jam, hence reduce the possibility of another accident happening for a short period of time (until the road is cleared). (to be continued) – BigbearZzz Sep 04 '23 at 19:16
  • (continued) However, over a really long period of time, say a year or many years, it's somewhat reasonable to assume that traffic accidents happen independently, provided the rate of one happening is sufficiently low (certainly not 15 accidents a day...) since an accident on Monday is likely not to affect the occurrence of another accident on Friday. A good, real-world phenomenon that fits extremely well with the assumption of independence is radioactive decay of isotopes since a single gram of matter consists of something like $10^{24}$ nuclei that don't interact, more or less. – BigbearZzz Sep 04 '23 at 19:21
  • @BigbearZzz that's the example I pointed out above that slightly (not strongly!) violates the independence assumption on a unit of time ~ hour (yes, agreed that in the very moment -- seconds -- that the accident happens, it is not independent). It does not take years for traffic accidents to become (practically) independent of each other. A couple of hours seems realistic/reasonable for any auto-regressive dependency to be washed out and become iid again. We don't have to look to radioactive decay of isotopes to encounter (practically) realistic independent events in everyday life :) – Georg M. Goerg Sep 04 '23 at 23:24
  • @BigbearZzz (cont') def on a weekly time scale any auto-regressive patterns of traffic accidents on a second/minute/hourly basis can be safely ignored when observing and making inference on weekly count data (all dependencies within the week are subsumed into the count X), when looking at it as an aggregate inference problem of any random (typical) week, rather than a time series forecasting problem (where of course seasonality/weather makes a difference week by week -- but then it's usually assumed to be conditionally independent) – Georg M. Goerg Sep 04 '23 at 23:41
2

@Qiaochu Yuan's answer is the correct answer to how we can conclude so much from a single parameter plus the assumption that the data follow a Poisson distribution. But I'd also like to elaborate on your comment:

How could a formula possibly answer this? What if the average was 15.5 but always ranges from 10 to 20 per week?

The simple answer is that if the counts always fell between 10 and 20, they would not follow a Poisson distribution, by definition: for any finite value of $\lambda$, a Poisson distribution assigns positive probability to every non-negative integer, including 0 and 1.

More generally, a Poisson distribution is a very rigid assumption (its variance must equal its mean) that often doesn't model real-world phenomena very well. In practice, more flexible distributions such as the negative binomial, which allow the variance to differ from the mean, are often used instead.
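One quick way to see the rigidity is to compare a sample's variance with its mean, since a Poisson distribution has the two equal. The sketch below uses made-up data for the "always 15 or 16" scenario from the question (the `counts` list is hypothetical, not real accident data) and computes the variance-to-mean ratio, which would be close to 1 for Poisson counts.

```python
import math
from statistics import mean, pvariance

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for X ~ Pois(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Hypothetical weekly counts: a year in which every week has 15 or 16 accidents
counts = [15, 16] * 26

m = mean(counts)        # sample mean
v = pvariance(counts)   # population variance of the sample
dispersion = v / m      # about 1 for Poisson; far below 1 here

# Probability a Pois(m) model puts on weeks that are NOT 15 or 16
p_outside = 1 - poisson_pmf(15, m) - poisson_pmf(16, m)

print(f"mean = {m}, variance = {v}, variance/mean = {dispersion:.3f}")
print(f"Pois({m}) mass outside {{15, 16}} = {p_outside:.2f}")
```

Data like this would be a poor fit for a Poisson model even though the mean matches. A ratio well above 1 signals the opposite mismatch, where the variance exceeds the mean and a negative binomial is a common alternative.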

Cliff AB
  • 239
  • 3
    I agree with this answer, basically the "reason" is "it's convenient to make wildly unrealistic assumptions because the calculations become a lot easier and more definite that way". Regarding the more flexible models discussed in this answer, a keyword to use in searches is "overdispersion" -- there's even a Wikipedia article about it: https://en.wikipedia.org/wiki/Overdispersion Estimation for such models actually turns out to be a surprisingly thorny problem, about which one can go on for a long time in detail. Anyway your intuition is correct that a Poisson model probably is unrealistic. – Chill2Macht Sep 03 '23 at 20:16
  • Radioactive decay is a Poisson distribution. One atom does not have any effect in the decay of another (except in nuclear weapons). We use it extensively in reliability modeling and fault coverage estimation for self test functions of military electronics. – richard1941 Sep 07 '23 at 01:32
  • @richard1941 yep there are some examples that follow a Poisson. Another classic is monthly counts of military fatalities from horse kicks in the early 1900s. But real world count data being well fit by a Poisson distribution is the exception, not the rule. – Cliff AB Sep 07 '23 at 01:42
  • Actually, any binomial distribution with low p is almost Poisson. Like the number of people playing in the local Nevada casinos who are smiling. (p<0.01) – richard1941 Sep 08 '23 at 20:41
  • @richard1941 Your example of smiling would not follow a Poisson model. The reason for this is that people usually smile in groups (i.e. craps dealer rolls a good roll, someone tells a good joke, etc.). This correlation between events would increase the variance compared to a true binomial and thus cause over dispersion. But more to the point, empirically count data rarely follows a true Poisson distribution. – Cliff AB Dec 23 '23 at 02:53