We have $n$ machines that form a system. The system works if $k$ or more of the $n$ machines are running. The machines are hosted in pods and are distributed as evenly as possible across $l$ pods. For example, if there are $3$ pods and $7$ machines, one pod will have $3$ machines and the other two will have $2$ each.
If a pod is down, all machines hosted on it go down as well. The probability that any given machine is running at a given time (provided its pod is up) is $p$, and the probability that any given pod is running is $q$. The machines and pods function independently of each other.
What is the probability that the system is running at any given time (that is, that $k$ or more of the machines are functional)? Let's call this $P_{l,n,k}$.
I have a solution that involves simply listing out all the possibilities but it's highly inefficient for large $l$ and not very pretty either.
I also tried creating a recurrence relation for $P_{l,n,k}$ but haven't gotten very far.
One observation: some pods will have $\left[\frac{n}{l}\right]+1$ machines (where $[x]$ is the greatest integer less than or equal to $x$) and some will have $\left[\frac{n}{l}\right]$. Let's call the former "heroes" and the latter "joes". Perhaps doing some combinatorics over heroes and joes will provide an elegant, efficient solution?
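To make the heroes-and-joes idea concrete, here is a minimal sketch of one possible approach, not necessarily the elegant one I'm hoping for. It assumes that the number of running machines in a pod with $m$ machines is $0$ with probability $1-q$ and binomial$(m,p)$ with probability $q$, and that the system total is the sum of $l$ independent such counts, so the distributions can be convolved pod by pod. The function names (`pod_distribution`, `convolve`, `system_reliability`) are made up for illustration.

```python
from math import comb

def pod_distribution(m, p, q):
    """Probabilities of 0..m running machines in one pod hosting m machines."""
    # Pod up (probability q): running count is Binomial(m, p).
    dist = [q * comb(m, j) * p**j * (1 - p)**(m - j) for j in range(m + 1)]
    # Pod down (probability 1 - q): zero machines running.
    dist[0] += 1 - q
    return dist

def convolve(a, b):
    """Distribution of the sum of two independent counts."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def system_reliability(l, n, k, p, q):
    """P(at least k of n machines run), machines spread evenly over l pods."""
    base, heroes = divmod(n, l)  # 'heroes' pods get base + 1 machines
    total = [1.0]                # distribution of an empty sum
    for i in range(l):
        m = base + 1 if i < heroes else base
        total = convolve(total, pod_distribution(m, p, q))
    return sum(total[k:])
```

Under these assumptions the work grows roughly like $n^2$ rather than exponentially in $l$, since each of the $l$ convolutions touches at most $n+1$ entries times one pod's size.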
Some special cases:
If $l\geq n$, each machine sits on its own pod, so it becomes a $k$ of $n$ system with component reliability $pq$. The probability in this case becomes:
$$\sum_{j=k}^{n} {n \choose j}(pq)^j(1-pq)^{n-j}$$
If $l=1$, we need the pod to be up. Beyond that, it's a $k$ of $n$ system with component reliability $p$. So the probability in this case becomes:
$$q\sum_{j=k}^{n} {n \choose j}p^j(1-p)^{n-j}$$
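As a sanity check, the two special cases can be evaluated numerically. This is only a sketch: the helper names (`k_of_n`, `one_machine_per_pod`, `single_pod`) are hypothetical, and the binomial coefficient is indexed by the summation variable $j$.

```python
from math import comb

def k_of_n(n, k, r):
    """P(at least k of n independent components work), each with reliability r."""
    return sum(comb(n, j) * r**j * (1 - r)**(n - j) for j in range(k, n + 1))

def one_machine_per_pod(n, k, p, q):
    """Case l >= n: each machine needs itself and its own pod, reliability p*q."""
    return k_of_n(n, k, p * q)

def single_pod(n, k, p, q):
    """Case l = 1: the pod must be up, then k of n machines with reliability p."""
    return q * k_of_n(n, k, p)
```

Any general formula for $P_{l,n,k}$ should reduce to these two values at $l\geq n$ and $l=1$.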