Let $x_i=\mathbb E[X^i]$ and $y_i=\mathbb E[Y^i]$, with the convention $x_0=y_0=1$. For independent $X$ and $Y$ we have
$$\mathbb E[(X+Y)^i]=\sum_{h=0}^i\binom ih x_{i-h}y_h$$
and define the Hurwitz product $*:\mathbb R^n\times\mathbb R^n\to\mathbb R^n$ by
$$x*y=\left(\sum_{h=0}^i\binom ih x_{i-h}y_h\right)_{i\in [n]}$$
so that the additivity condition becomes $p_n(x*y)=p_n(x)+p_n(y)$ and the homogeneity condition becomes $p_n(cx_1,\dots,c^nx_n)=c^np_n(x_1,\dots,x_n)$, both of which must hold for all vectors $x,y$ whose entries are the moments of a random variable. The existence theorem for the (Hamburger) moment problem says that there exists such a random variable for the vector $(m_i)_i$ if and only if the Hankel matrices $H_n$ with entries $(H_n)_{ij}=m_{i+j}$, $0\leq i,j\leq n$ (where $m_0=1$), are positive semi-definite. The moments of an exponential distribution with parameter $\lambda=1$ are $m_i=i!$, and the associated Hankel matrices are positive definite (the distribution has a density, so $\sum_{i,j}c_ic_jm_{i+j}=\mathbb E[(\sum_i c_iX^i)^2]>0$ for $c\neq 0$), hence have strictly positive determinant. By continuity of the determinant there is a ball around $v=(1!, \dots, n!)$ in which all of these determinants remain positive, so by Sylvester's criterion the associated Hankel matrices are positive definite, and there exists a random variable with these moments. Therefore the polynomials $p_n(x*y)$ and $p_n(x)+p_n(y)$ coincide on an open ball $B_\epsilon ((v,v))\subset\mathbb R^{2n}$, so they are equal.
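As a sanity check (not part of the proof), the sketch below verifies with exact rational arithmetic that the Hankel matrices built from the exponential moments $m_i=i!$ have strictly positive leading principal minors; the function names are my own.

```python
from math import factorial
from fractions import Fraction

def hankel_dets(m):
    """Determinants det(H_0), ..., det(H_n) of the Hankel matrices
    (H_s)_{ij} = m[i+j], 0 <= i,j <= s, via exact Gaussian elimination."""
    n = (len(m) - 1) // 2
    dets = []
    for s in range(n + 1):
        H = [[Fraction(m[i + j]) for j in range(s + 1)] for i in range(s + 1)]
        det = Fraction(1)
        for c in range(s + 1):  # eliminate column c, tracking the determinant
            p = next((r for r in range(c, s + 1) if H[r][c] != 0), None)
            if p is None:
                det = Fraction(0)
                break
            if p != c:
                H[c], H[p] = H[p], H[c]
                det = -det
            det *= H[c][c]
            for r in range(c + 1, s + 1):
                f = H[r][c] / H[c][c]
                for j in range(c, s + 1):
                    H[r][j] -= f * H[c][j]
        dets.append(det)
    return dets

n = 4
moments = [factorial(i) for i in range(2 * n + 1)]  # m_0 = 1, m_1 = 1, m_2 = 2, ...
assert all(d > 0 for d in hankel_dets(moments))     # Sylvester: positive definite
```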
Let $\lambda=p_n(0,\dots,0,1)$; I'll prove that the conditions above imply $p_n=\lambda\overline B_n$, where $\overline B_n$ is the logarithmic Bell polynomial, defined by the relationship of exponential generating functions
$$\log\sum_{n\geq 0}\frac{x_n z^n}{n!}=\sum_{n\geq 1}\frac{\overline B_n(x_1,\dots,x_n)z^n}{n!},\qquad x_0=1.$$
These polynomials give the cumulants as a function of the moments, but they're rarely given names in the literature, and they satisfy
$$\overline B_n(x_1,\dots,x_n)=\sum_{k=1}^n (-1)^{k-1}(k-1)!B_{n,k}(x_1, \dots, x_{n-k+1}).$$
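To make the definition concrete, here is a small check (my own sketch, not from the text) that the generating-function definition of $\overline B_n$ agrees with the alternating sum over partial Bell polynomials, which sympy exposes as `bell(n, k, symbols)`.

```python
import sympy as sp

N = 4
z = sp.symbols('z')
x = sp.symbols('x1:5')  # x1, x2, x3, x4

# EGF definition: log(sum_{n>=0} x_n z^n / n!) with the convention x_0 = 1
egf = sp.log(1 + sum(x[k - 1] * z**k / sp.factorial(k) for k in range(1, N + 1)))
series = sp.expand(sp.series(egf, z, 0, N + 1).removeO())
Bbar_egf = [sp.expand(sp.factorial(m) * series.coeff(z, m)) for m in range(1, N + 1)]

# Alternating-sum formula via the partial Bell polynomials B_{n,k}
def Bbar(m):
    return sp.expand(sum((-1)**(k - 1) * sp.factorial(k - 1)
                         * sp.bell(m, k, x[:m - k + 1]) for k in range(1, m + 1)))

assert Bbar_egf == [Bbar(m) for m in range(1, N + 1)]
assert Bbar(2) == x[1] - x[0]**2   # the variance: E[X^2] - E[X]^2
```

The second assertion recovers Thiele's observation: the $n=2$ logarithmic Bell polynomial is exactly the variance written in terms of the moments.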
The operation $*$ gives $\mathbb R^n$ the structure of an abelian Lie group, and it is isomorphic to the usual Euclidean group $(\mathbb R^n,+)$ via
\begin{align*}
\varphi:\mathbb R^n&\to \mathbb R^n\\
(x_1, \dots,x_n)&\mapsto\left(h![t^h]\prod_{k=1}^n (1+t^k)^{x_k}\right)_{h\in [n]}.
\end{align*}
I.e. $\varphi(x+y)=\varphi(x)*\varphi(y)$. The composition $\overline B_n\circ\varphi$ is linear: setting $y=\varphi(x)$, we can compute
\begin{align*}
\sum_h \frac{y_h}{h!}z^h&=\prod_{k}(1+z^k)^{x_k}+O(z^{n+1})\\
\log\sum_h \frac{y_h}{h!}z^h&=\sum_k x_k\log(1+z^k)+O(z^{n+1})\\
\sum_h \frac{\overline B_h(y)}{h!}z^h&=\sum_k x_k\log(1+z^k)+O(z^{n+1})\\
\overline B_n(y)=\overline B_n(\varphi(x))&=n![z^n]\sum_k x_k\log(1+z^k)\\
&=n!\sum_{d|n}x_d[z^n]\log(1+z^d)\\
&=(n-1)!\sum_{d|n}(-1)^{n/d+1}d x_d.
\end{align*}
Call this linear functional $L=\overline B_n\circ\varphi$.
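Both of these claims (that $\varphi$ turns $+$ into $*$, and the divisor-sum formula for $L$) can be spot-checked with sympy; the helper names below are mine.

```python
import sympy as sp

N = 4
t, z = sp.symbols('t z')

def phi(x):
    """phi(x)_h = h! [t^h] prod_k (1 + t^k)^{x_k}, for h = 1..N."""
    f = sp.Mul(*[(1 + t**k)**sp.Rational(x[k - 1]) for k in range(1, N + 1)])
    s = sp.expand(sp.series(f, t, 0, N + 1).removeO())
    return [sp.factorial(h) * s.coeff(t, h) for h in range(1, N + 1)]

def star(a, b):
    """Hurwitz product, with the convention a_0 = b_0 = 1."""
    a, b = [1] + list(a), [1] + list(b)
    return [sp.expand(sum(sp.binomial(i, h) * a[i - h] * b[h]
                          for h in range(i + 1))) for i in range(1, N + 1)]

x = [sp.Rational(1, 2), 2, sp.Rational(-1, 3), 1]
y = [3, sp.Rational(1, 5), 0, sp.Rational(2, 7)]

# phi is a homomorphism: phi(x + y) = phi(x) * phi(y)
lhs = phi([xi + yi for xi, yi in zip(x, y)])
rhs = star(phi(x), phi(y))
assert all(sp.simplify(l - r) == 0 for l, r in zip(lhs, rhs))

def Bbar_n(v, n):
    """n-th logarithmic Bell polynomial evaluated at v, via the log-EGF."""
    egf = sp.log(1 + sum(v[k - 1] * z**k / sp.factorial(k) for k in range(1, n + 1)))
    return sp.factorial(n) * sp.expand(sp.series(egf, z, 0, n + 1).removeO()).coeff(z, n)

# L(x) = (N-1)! sum_{d | N} (-1)^{N/d + 1} d x_d  equals  Bbar_N(phi(x))
L = sp.factorial(N - 1) * sum((-1)**(N // d + 1) * d * x[d - 1]
                              for d in range(1, N + 1) if N % d == 0)
assert sp.simplify(Bbar_n(phi(x), N) - L) == 0
```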
By hypothesis, we have that
$$p_n\circ\varphi(x)+p_n\circ\varphi(y)=p_n(\varphi(x)*\varphi(y))
=p_n\circ\varphi(x+y).$$
So $p_n\circ\varphi$ is a continuous function which preserves addition, hence it is linear (continuous additive maps $\mathbb R^n\to\mathbb R$ are exactly the linear ones, by Cauchy's functional equation). We can find out what it is by computing its differential at the origin. By the homogeneity condition, $p_n$ is a linear combination of the monomials $x_1^{i_1}\cdots x_n^{i_n}$ with $i_1+2i_2+\dots+ni_n=n$. Only one of these, $x_n$, has degree one, so by the normalization $p_n(x)=\lambda x_n+O(|x|^2)$ near the origin, which implies $(dp_n)_0=[0,\dots,0,\lambda]$. To compute the differential (Jacobian) of $\varphi$ we consider the generating function that defines it, and compute
\begin{align*}
\frac{\partial }{\partial x_i}\prod_{k=1}^n (1+t^k)^{x_k}&=\log(1+t^i)\prod_{k=1}^n (1+t^k)^{x_k}\\
\frac{\partial }{\partial x_i}\bigg\rvert_{x=0}\prod_{k=1}^n (1+t^k)^{x_k}&=\log(1+t^i).
\end{align*}
When we multiply the row vector $(dp_n)_0=[0,\dots,0,\lambda]$ by the Jacobian, only its last row survives, so we compute
\begin{align*}
\frac{d\varphi_n}{dx_i}\bigg\rvert_{x=0}&=n![t^n]\frac{\partial }{\partial x_i}\bigg\rvert_{x=0}\prod_{k=1}^n(1+t^k)^{x_k}\\
&=n![t^n]\log(1+t^i)\\
&=\begin{cases}
n!\frac{(-1)^{n/i+1}}{n/i} & i|n\\
0 & i\nmid n
\end{cases}
\end{align*}
which implies, by the chain rule, that $(d(p_n\circ\varphi))_0=(dp_n)_0\circ (d\varphi)_0=\lambda L$.
Thus $p_n\circ\varphi=\lambda L=\lambda\overline B_n\circ\varphi$, and since $\varphi$ is a bijection we can apply $\varphi^{-1}$ on the right and obtain $p_n=\lambda\overline B_n$.
We can avoid using the result that $\varphi$ is a bijection by noticing that $(d\varphi)_0$ is a triangular matrix whose diagonal entries are nonzero, so it is invertible. By the inverse function theorem $\varphi$ is invertible in a neighborhood of the origin, and since $\varphi(0)=0$, the polynomial $p_n-\lambda\overline B_n$ vanishes on a small ball centered at the origin, so it is zero.
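The triangularity claim is easy to confirm symbolically: the Jacobian entry is $J_{hi}=h![t^h]\log(1+t^i)$, which vanishes for $i>h$ (the series of $\log(1+t^i)$ starts at $t^i$) and equals $i!$ on the diagonal. A sketch, again with my own names:

```python
import sympy as sp

N = 6
t = sp.symbols('t')

def J(h, i):
    """(h, i) entry of (d phi)_0, namely h! [t^h] log(1 + t^i)."""
    s = sp.expand(sp.series(sp.log(1 + t**i), t, 0, h + 1).removeO())
    return sp.factorial(h) * s.coeff(t, h)

# Lower triangular with nonzero diagonal, hence invertible
assert all(J(h, i) == 0 for h in range(1, N + 1) for i in range(h + 1, N + 1))
assert all(J(i, i) == sp.factorial(i) for i in range(1, N + 1))

# The last row matches the closed form used in the computation of lambda*L
for i in range(1, N + 1):
    closed = sp.factorial(N) * (-1)**(N // i + 1) / (N // i) if N % i == 0 else 0
    assert J(N, i) == closed
```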
In ``Twelve problems in probability no one likes to bring up'', Rota credits the solution of this problem to Thiele:
Sometime in the last [19th] century, the Danish statistician Thiele observed that the variance of a random variable, namely $\text{Var}(X)=\mathbb E[X^2]-\mathbb E[X]^2$, possesses notable
properties, to wit:
- It is invariant under translation: $\text{Var}(X+c)=\text{Var}(X)$ for any constant $c$.
- If $X$ and $Y$ are independent random variables, then $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$.
- $\text{Var}(X)$ is a polynomial in the moments of the random variable $X$.
He then proceeded to determine all nonlinear functionals of a random variable which have the same properties.
I wasn't able to find the original reference from Thiele.