1

I'm learning the Poisson distribution and am trying to "backward" calculate lambda if given a random value (k) and cumulative density function value (p). My k value is rather large, over 200, and I'm not even sure if using the poisson is a good idea since it's just looking like a normal distribution at this point. Either way, I'm still interested in trying to figure out if it's possible to calculate lambda given k and p.

I tried to figure it out on my own but I have no clue where to start or how to accomplish this. Ultimately, I'd like to use scipy.stat poisson.

For example, if I have a value of k=400 and p=0.9, I would like to find the value of lambda which would end up being approximately 375.

from scipy.stats import poisson
k=400
poisson(mu=375).cdf(k)

If anybody knows how to solve this and can help me that would be much appreciated.

zipline86
  • 399
  • 1
  • 5
  • 13

1 Answers1

2

There is no direct way to do this. But the cdf for a fixed number k is a monotonic function of mu. This means that we can use binary search to approximate the parameter mu.

Given a lower bound mu_0, an upper bound mu_1, a tolerance eps, and k and p from your question, the following code approximates the desired parameter:

while mu_1 - mu_0 > eps:
    mu = (mu_1 + mu_0) / 2
if poisson.cdf(k, mu=mu) > p:
    mu_0 = mu
else:
    mu_1 = mu

mu = (mu_0 + mu_1) / 2

Finding initial bounds

It is also possibile to find starting bounds by going up and down until such bounds are found. This would lead to the following function:

def approx_mu(k, p, mu_0=None, eps=0.1):
    # Initial value
    if mu_0 is None:
        mu_0 = k * p
# Find upper bound for mu:
while poisson.cdf(k, mu=mu_0) > p:
    mu_0 = mu_0 * 2

# Find lower bound for mu:
while poisson.cdf(k, mu=mu_0) < p:
    mu_0 = mu_0 / 2

mu_1 = mu_0 * 2

# Binary search
while mu_1 - mu_0 > eps:
    # New boundary
    mu = (mu_1 + mu_0) / 2

    if poisson.cdf(k, mu=mu) > p:
        mu_0 = mu
    else:
        mu_1 = mu

return (mu_0 + mu_1) / 2

How fast is it?
import time
t0 = time.time()
for _ in range(1000):
    approx_mu(k=5000, p=0.9, eps=0.1)
t1 = time.time()
print(f"{t1-t0:.2f} ms")

leads to 1.40 ms on my machine.

Broele
  • 1,947
  • 1
  • 9
  • 16