I am drawing samples from i.i.d. multinomial random vectors $X^i$:

$X^i = (X^i_1, ..., X^i_m) \sim Multinomial(n, p_1,...,p_m)$

Denote the vector: $X^* = \sum_i X^i$, i.e. a sample from $X^*$ is the sum of i.i.d. multinomial samples.

What does the PMF/CDF of $X^*$ look like? Is it possible to express it in closed form?

An example of what I would like to compute with that is as follows: Denote the event $A = X^*_2 \ge 1 \land X^*_3 \ge 1 \land ... \land X^*_m\ge1$.

  1. What is $P(A)$ for a given $n$?
  2. What is the smallest $n$ such that $P(A) > \alpha$?

If that makes it any easier, it would also be interesting to know the answer to a special case when $p_2 = p_3 = ... = p_m$.

I suspect that $X^*$ is itself multinomial, but I am not sure how to prove that.

  • A sum of iid multinomials is again multinomial, here with parameters $kn,p_1,\dots,p_m$, where $k$ denotes the number of terms. This is just like a sum of iid binomials. See here for that. – drhab Feb 11 '20 at 13:41

1 Answer

Start with a sequence of iid random vectors $(B_r)_r$ where $B_r=(B_{r,1},\dots,B_{r,m})$ has exactly one coordinate equal to $1$ and all others equal to $0$, with $P(B_{r,j}=1)=p_j$ for every $r\in\mathbb N$ and every $j\in\{1,\dots,m\}$.

Here the $p_j$ are assumed to be positive and to satisfy $\sum_{j=1}^mp_j=1$.

Then for every non-empty finite $R\subseteq\mathbb N$ the random vector $X_R:=\sum_{r\in R}B_r$ has multinomial distribution with parameters $|R|,p_1,\dots,p_m$.
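The claim can be checked numerically for a small case. The following is only a sketch, not part of the proof: it assumes each $B_r$ is a one-hot vector (exactly one coordinate equal to $1$), enumerates every possible sequence of outcomes, and compares the resulting distribution of the sum with the multinomial PMF. The function names `multinomial_pmf` and `sum_of_indicators_pmf` and the example probabilities are illustrative choices.

```python
from collections import defaultdict
from itertools import product
from math import factorial, prod


def multinomial_pmf(counts, probs):
    """PMF of Multinomial(sum(counts), probs) evaluated at `counts`."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)  # stays an integer at every step
    return coef * prod(p ** c for p, c in zip(probs, counts))


def sum_of_indicators_pmf(size, probs):
    """Exact PMF of a sum of `size` iid one-hot vectors, by enumeration."""
    m = len(probs)
    pmf = defaultdict(float)
    # Each tuple in the product lists which category every B_r landed in.
    for categories in product(range(m), repeat=size):
        counts = [0] * m
        weight = 1.0
        for j in categories:
            counts[j] += 1
            weight *= probs[j]
        pmf[tuple(counts)] += weight
    return dict(pmf)


probs = [0.5, 0.3, 0.2]  # example values, assumed for illustration
pmf = sum_of_indicators_pmf(4, probs)
max_gap = max(abs(v - multinomial_pmf(c, probs)) for c, v in pmf.items())
print(max_gap)  # should be at the level of floating-point rounding
```

Enumeration costs $m^{|R|}$ steps, so this is only practical for very small $|R|$.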

Moreover, if $S$ is another non-empty finite subset of $\mathbb N$ and $R\cap S=\varnothing$, then $X_R$ and $X_S$ are independent, because they are functions of disjoint families of the independent vectors $B_r$.

So taking the pairwise disjoint sets $R_i=\{ni-n+1,\dots,ni\}$ for $i=1,\dots,k$, each of size $n$, we get $k$ iid random vectors $X_{R_i}$, each with multinomial distribution and parameters $n,p_1,\dots,p_m$.

Here $X_{R_i}$ can be identified with the random vector that you denoted by $X^i$.

Taking $R=\bigcup_{i=1}^kR_i=\{1,2,\dots,kn\}$ we get a random vector $X_R$ that is also multinomially distributed, this time with parameters $kn,p_1,\dots,p_m$.

Since $X_R=\sum_{i=1}^kX_{R_i}$, this proves that the sum of iid random vectors with multinomial distribution is again multinomial.
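As an illustration of how this result can be used for the event $A$ from the question, here is a rough sketch. It is not part of the argument above. It assumes $X^*$ is multinomial with $N=kn$ trials and applies inclusion-exclusion over the sets $S\subseteq\{2,\dots,m\}$ of categories forced to be empty, which gives $P(A)=\sum_{S}(-1)^{|S|}\bigl(1-\sum_{j\in S}p_j\bigr)^{N}$. The function names, the parameter `k`, and the search bound `limit` are hypothetical.

```python
from itertools import combinations


def prob_all_hit(total_trials, probs):
    """P(X*_2 >= 1, ..., X*_m >= 1) for X* ~ Multinomial(total_trials, probs).

    Inclusion-exclusion: a set S of categories among 2..m is empty
    with probability (1 - sum of p_j over S) ** total_trials.
    """
    tail = probs[1:]  # p_2, ..., p_m; category 1 is unconstrained
    total = 0.0
    for size in range(len(tail) + 1):
        for subset in combinations(tail, size):
            total += (-1) ** size * (1.0 - sum(subset)) ** total_trials
    return total


def smallest_n(alpha, probs, k=1, limit=10_000):
    """Smallest n <= limit with P(A) > alpha when k samples are summed.

    A linear search suffices because P(A) is non-decreasing in n.
    Returns None if no n up to `limit` (an assumed bound) works.
    """
    for n in range(1, limit + 1):
        if prob_all_hit(k * n, probs) > alpha:
            return n
    return None


# Example values, assumed for illustration only.
print(prob_all_hit(2, [0.5, 0.25, 0.25]))
print(smallest_n(0.9, [0.5, 0.5]))
```

The subset sum has $2^{m-1}$ terms, so it is only practical for moderate $m$. In the special case $p_2=\dots=p_m=p$ the terms depend only on $|S|$, and the sum collapses to $\sum_{s=0}^{m-1}(-1)^s\binom{m-1}{s}(1-sp)^{N}$.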

drhab