
The normalizing constant of the Dirichlet distribution implies that $$ \int_{\Delta_k} x_1^{n_1}\dots x_k^{n_k} dx_1 \dots dx_k = \frac{n_1! \dots n_k!}{(n_1+\dots+n_k+k-1)!} $$ where $$ \Delta_k = \{(x_1, \dots, x_k) \in [0,1]^k: x_1+\dots+x_k=1\} $$ for $n_1,\dots,n_k \in \mathbb N$. Is there a combinatorial proof for this?
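
As a quick sanity check (an illustration, not part of the question), the $k=2$ case of the identity, $\int_0^1 x^n(1-x)^m\,dx = \frac{n!\,m!}{(n+m+1)!}$, can be verified with exact rational arithmetic by expanding $(1-x)^m$ binomially and integrating term by term:

```python
from fractions import Fraction
from math import comb, factorial

def beta_integral(n, m):
    """Exactly compute ∫_0^1 x^n (1-x)^m dx by expanding (1-x)^m binomially."""
    return sum(Fraction(comb(m, j) * (-1) ** j, n + j + 1) for j in range(m + 1))

n, m = 3, 2
lhs = beta_integral(n, m)
rhs = Fraction(factorial(n) * factorial(m), factorial(n + m + 1))
print(lhs, lhs == rhs)  # → 1/60 True
```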

Mike Earnest
  • 84,902

1 Answer


I can give what I would call a probabilistic proof of your equality. Half of my proof is combinatorial, so I think you might consider this satisfying.


Let $U_1,\dots,U_{k-1}$ be i.i.d. $\text{Unif}[0,1]$ random variables. These can be used to simulate the Dirichlet distribution as follows. The $k-1$ random points divide the unit interval $[0,1]$ into $k$ pieces. If we let $(X_1,\dots,X_k)$ be the vector of the lengths of these pieces, numbered from left to right, then $(X_1,\dots,X_k)$ follows a Dirichlet distribution. To be precise, if we let $U_{(i)}$ denote the $i^\text{th}$ smallest value of the list $[U_1,\dots,U_{k-1}]$, with the convention that $U_{(0)}=0$ and $U_{(k)}=1$, then $X_i=U_{(i)}-U_{(i-1)}$ for each $i\in \{1,\dots,k\}$.
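
This uniform-spacings construction can be sketched directly (a minimal standard-library helper; the function name is mine):

```python
import random

def dirichlet_via_spacings(k):
    """Cut [0,1] at k-1 i.i.d. Unif[0,1] points; the k gap lengths
    (X_1, ..., X_k), read left to right, follow the flat Dirichlet
    distribution on the simplex."""
    cuts = sorted(random.random() for _ in range(k - 1))
    pts = [0.0] + cuts + [1.0]  # the conventions U_(0) = 0 and U_(k) = 1
    return [pts[i + 1] - pts[i] for i in range(k)]

x = dirichlet_via_spacings(4)
print(len(x), min(x) >= 0, abs(sum(x) - 1) < 1e-12)  # → 4 True True
```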

Now, on the same probability space as $U_1,\dots,U_{k-1}$, let $$ V_{i,j},\qquad 1\le i\le k,\quad 1\le j\le n_i $$ be a collection of $n_1+\dots+n_k$ i.i.d. $\text{Unif}[0,1]$ random variables, independent of $U_1,\dots,U_{k-1}$. Furthermore, let $E$ be the following event:

$E$ occurs if and only if, for each $i\in \{1,\dots,k\}$ and each $j\in \{1,\dots,n_i\}$, we have $$U_{(i-1)}\le V_{i,j}\le U_{(i)}.$$ In other words, when $[0,1]$ is broken into pieces by $U_1,\dots,U_{k-1}$, $E$ is the event that each $V_{i,j}$ lands in the $i^\text{th}$ piece from the left.

We will compute $P(E)$ in two ways, and thereby deduce your equation.

  1. First, we compute $P(E)$ by reasoning combinatorially. There are $(n_1+\dots+n_k+k-1)!$ possible relative orderings of the random variables $\{U_i\}_{i=1}^{k-1}\cup \{V_{i,j}\}_{1\le i\le k,1\le j\le n_i}$, all of which are equally likely since the variables are i.i.d. and continuous. Of these orderings, precisely $n_1!\cdots n_k!\,(k-1)!$ make $E$ occur. This is because an ordering on which $E$ occurs is uniquely specified by choosing one of the $n_i!$ relative orderings of $\{V_{i,j}\}_{j=1}^{n_i}$ for each $i$, together with one of the $(k-1)!$ relative orderings of $U_1,\dots,U_{k-1}$. We conclude that $$ P(E)=\frac{n_1!\cdots n_k!\,(k-1)!}{(n_1+\dots+n_k+k-1)!}\tag1 $$

  2. On the other hand, letting $\mu_k$ be the Dirichlet distribution, we can apply the law of total probability to compute $$ P(E)=\int_{\Delta_k} P(E\mid X_1=x_1,\dots,X_k=x_k)\,d\mu_k(x_1,\dots,x_k)\tag2 $$ In other words, for a fixed realization of $U_1,\dots,U_{k-1}$ producing intervals with lengths $(x_1,\dots,x_k)$, we ask for the probability that $E$ occurs, and then we average over all possible realizations of $U_1,\dots,U_{k-1}$. I claim that $$ P(E\mid X_1=x_1,\dots,X_k=x_k)=x_1^{n_1}\cdots x_k^{n_k}\tag3 $$ This is because we are conditioning on the $i^\text{th}$ interval from the left having length $x_i$, and we need all of the variables $\{V_{i,j}\}_{j=1}^{n_i}$ to lie in this interval. The probability that a uniform r.v. lands in an interval of length $x_i$ is just $x_i$, and the probability that this happens for $n_i$ independent uniforms is $x_i^{n_i}$.
    As you noted, the normalization constant for the Dirichlet distribution is $(k-1)!$. Therefore, $d\mu_k=(k-1)!\,dx_1\dots dx_k$. Combining this with $(2)$ and $(3)$ proves $$ P(E)=\int_{\Delta_k} x_1^{n_1}\cdots x_k^{n_k}\,(k-1)!\,dx_1\dots dx_k\tag4 $$

Finally, combining $(1)$ and $(4)$ proves your equation.
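
The two computations of $P(E)$ can also be cross-checked numerically: simulate the event $E$ by dropping the $U_i$ and $V_{i,j}$ at random, and compare the empirical hit rate with the closed form in $(1)$ (a Monte Carlo sketch under the setup above; function names are mine):

```python
import random
from math import factorial, prod

def p_event_exact(ns):
    """P(E) per equation (1): n_1! ... n_k! (k-1)! / (n_1 + ... + n_k + k - 1)!."""
    k = len(ns)
    return prod(factorial(n) for n in ns) * factorial(k - 1) / factorial(sum(ns) + k - 1)

def p_event_mc(ns, trials=100_000):
    """Estimate P(E): cut [0,1] at k-1 uniform points, then check that each
    V_{i,j} (a fresh uniform) lands in the i-th piece from the left."""
    k = len(ns)
    hits = 0
    for _ in range(trials):
        cuts = [0.0] + sorted(random.random() for _ in range(k - 1)) + [1.0]
        if all(cuts[i] <= random.random() <= cuts[i + 1]
               for i in range(k) for _ in range(ns[i])):
            hits += 1
    return hits / trials

random.seed(0)
ns = (1, 1)  # exact value: 1! · 1! · 1! / 3! = 1/6
print(abs(p_event_mc(ns) - p_event_exact(ns)) < 0.01)  # → True
```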
