2

We have $n$ machines that form a system. The system works if $k$ or more of the $n$ machines are running. The machines are hosted in pods. They are evenly distributed across $l$ of the pods (best effort). For example, if there are $3$ pods and $7$ machines, one pod will have $3$ machines and the other two will have $2$ each.

If a pod is down, all machines hosted on it will go down as well. The probability that any machine is running at a given time is $p$ and the probability that any of the pods will be running is $q$. The machines and pods function independently of each other.

What is the probability that the system will be running at any given time (or that $k$ or more of the machines are functional)? Let's call this $P_{l,n,k}$.


I have a solution that involves simply listing out all the possibilities but it's highly inefficient for large $l$ and not very pretty either.

I also tried creating a recurrence relation for $P_{l,n,k}$ but haven't gotten very far.


One observation: some pods will have $\left[\frac{n}{l}\right]+1$ machines (where $[x]$ is the greatest integer less than or equal to $x$) and some will have $\left[\frac{n}{l}\right]$. Let's call the former "heroes" and the latter "joes". Perhaps doing some combinatorics over heroes and joes will provide an elegant, efficient solution?


Some special cases:

If $l\geq n$, it becomes a $k$ of $n$ system with reliability: $pq$. The probability in this case becomes:

$$\sum_{j=k}^{n} {n \choose k}(pq)^j{(1-pq)^{n-j}}$$

If $l=1$, we need the pod to be up. And beyond that, it's a $k$ of $n$ system with component reliability, $p$. So the probability in this case becomes:

$$q\sum_{j=k}^{n} {n \choose k}(p)^j{(1-p)^{n-j}}$$

Rohit Pandey
  • 7,547

2 Answers2

1

I'm afraid there is no really elegant solution, but here we go nonetheless. Let $n,l\in \mathbb{N}$ and let $n_i\in \{\lfloor \frac nl \rfloor, \lfloor \frac nl \rfloor + 1\}$ be the number of machines in pod $i$ for $i = 1,\ldots l$. Now let $P_i$ be the random variable that denotes the number of working machines in pod $i$ (where I mean that a machine works if neither it nor its' pod have failed). By your independence assumptions, the $P_i$ are independent and the quantity of interest is $\sum_{i=1}^l P_i$.

Now let us first look at a single pod. Denote by $p_{m,k}$ the probability that a pod containing $m$ machines has exactly $k$ working machines. Then $$p_{m,k} = \begin{cases} q \binom{m}{k} p^k (1-p)^{m-k},& k > 0,\\ 1-q + q(1-p)^m,& k = 0,\end{cases},\quad m\in \mathbb{N}_0,\, k\in \{0,\ldots,m\}.$$ Then we have $\mathbb{P}\left(P_i = k\right) = p_{n_i, k}$. Now by independence of the $P_i$ we conclude $$\begin{align*} \mathbb{P}\left(\text{$k$ machines work in total}\right) &= \mathbb{P}\left(\sum_{i=1}^l P_i = k\right)\\ &= \sum_{\substack{k_1+\ldots +k_l = k\\ 0\le k_i \le n_i}}\,\,\mathbb{P}\left(P_1 = k_1,\ldots P_l = k_l\right)\\ &= \sum_{\substack{k_1+\ldots +k_l = k\\ 0\le k_i \le n_i}} \,\,\prod_{i=1}^l p_{n_i, k_i} \end{align*}$$ To get $P_{n,l,k}$, you would have to sum this expression from $k$ to $n$. This might not be very helpful and maybe it's precisely what you found yourself, but I am unsure whether, without resorting to trivial special cases such as $l\ge n$ or $p,q\in \{0,1\}$, this expression can be simplified much further, simply due to the unhandy nature of the sum.


One more derivation one could do, however, would be to ask about the probability that at least one machine works, or, equivalently, that none does. We obtain $$\begin{align*} P_0 := \mathbb{P}\left(\text{no machine works}\right) &= \prod_{i=1}^l p_{n_i, 0}\\ &= \prod_{i=1}^l \left(1-q + q(1-p)^{n_i}\right)\\ &\stackrel{(\star)}{=} \left(1-q + q(1-p)^{n/l}\right)^l, \end{align*}$$ where the last equality holds only if $l$ divides $n$ (for large $n$, this should not make much of a difference though). We can now give a lower bound on this (which becomes increasingly exact as $n/l \to \infty$) by $$P_0 \ge (1-q)^l \left[ 1 + l \frac{q}{1-q} (1-p)^{n/l}\right].$$

0

I came up with an efficient solution. If the machines are distributed among $l$ pods as evenly as possible, the pods will be divided into two sets. The pods in the first set will each have one more machine assigned to them than the ones in the second set. Let's call the first set of pods "hero pods" and the second set, "joe pods". It's easy to see that the number of machines assigned to each hero pod will be:

$$h = n-l \left\lfloor \frac{n}{l} \right\rfloor$$ and the number of machines assigned to each joe pod will be: $$j = l \left\lceil \frac{n}{l} \right\rceil-n$$

Next, we can simply condition on all possibilities of the number of hero and joe pods that are working. Within each of those conditions, we're left with a simple k-of-n system. So, the probability that the overall system is UP becomes:

$$P = \sum\limits_{\eta=0}^{h}\sum\limits_{\nu=0}^{l} P(H=\eta)P(J=\nu) P(X_{\eta \left\lceil \frac{n}{l} \right\rceil + \nu \left\lfloor \frac{n}{l} \right\rfloor,p}\geq k)$$

Here, $X_{n,p}$ is a Binomial random variable with parameters $n$ and $p$, $H$ is the random variable representing the number of hero pods that work (binomial with parameters $h$ and $q$) and $J$ represents the number of joe pods (binomial with parameters $j$ and $q$).

Rohit Pandey
  • 7,547