
Suppose we have a sample $X_1, X_2, \ldots, X_n \sim F$, where the distribution $F$ is unknown. Let $T_n = g(X_1, X_2, \ldots, X_n) = \bar{X}_n^2$, $\mu = \mathbb{E}[X_1]$, and define the following: $$\alpha_k = \int \left | x - \mu \right | ^k dF(x) \ \ \ \ \text{and} \ \ \ \ \hat{\alpha}_k = \frac{1}{n}\sum_{i=1}^{n}\left | X_i - \bar{X}_n \right |^k.$$ We can see that $\hat{\alpha}_k$ is the plug-in estimator for $\alpha_k$.

Now suppose we draw $B$ bootstrap samples, each of the form $X_1^*, X_2^*, \ldots, X_n^*$, and compute $T_n^* = \bar{X}_n^{*2}$ for each one. I want to show that the variance of $T_n^*$, that is, the variance of our bootstrap estimate, is $$v_{\text{bootstrap}}(T_n^*) = \frac{4\bar{X}_n^2 \hat{\alpha}_2}{n} + \frac{4\bar{X}_n \hat{\alpha}_3}{n^2}+\frac{\hat{\alpha}_4}{n^3}.$$

In general, the variance of a bootstrap estimator $S_n^*$ computed from $B$ bootstrap samples is $$v_{\text{bootstrap}} = \frac{1}{B}\sum_{b=1}^{B}\left ( S_{n,b}^* - \frac{1}{B}\sum_{r=1}^{B}S_{n,r}^* \right )^2,$$ where $S_{n,b}^*$ is the statistic computed from the $b^{\text{th}}$ bootstrap sample. If I use this definition and apply it to the original problem, I obtain something like

\begin{equation} \begin{split} v_{\text{bootstrap}} &= \frac{1}{B}\sum_{b=1}^{B}\left ( \bar{X}_{n,b}^{*2} - \frac{1}{B}\sum_{r=1}^{B}\bar{X}_{n,r}^{*2} \right )^2 \\ &= \frac{1}{B}\sum_{b=1}^{B}\left [ \bar{X}_{n,b}^{*4} - 2\bar{X}_{n,b}^{*2}\,\frac{1}{B}\sum_{r=1}^{B}\bar{X}_{n,r}^{*2} + \left ( \frac{1}{B}\sum_{r=1}^{B}\bar{X}_{n,r}^{*2} \right )^2 \right ] .\\ \end{split} \end{equation}

From here, I don't see anything I can do to get a nicer form. So although this should simplify to $v_{\text{bootstrap}}(T_n^*)$, I am thinking this is probably not the best approach.

Another approach that I thought of was conditioning. We have $$\mathrm{Var}[\bar{X}_n^{*2}] = \mathbb{E}[\mathrm{Var}[\bar{X}_n^{*2} \mid X_1, X_2, \ldots , X_n]] +\mathrm{Var}[\mathbb{E}[\bar{X}_n^{*2} \mid X_1, X_2, \ldots , X_n]].$$ This seems more computable. Conditional on the data, each bootstrap draw $X^*$ has the following distribution.

\begin{array}{|c|c|} \hline x & \mathbb{P} (X^* = x )\\ \hline X_1 & 1/n \\ \hline X_2 & 1/n \\ \hline \vdots & \vdots \\ \hline X_n & 1/n \\ \hline \end{array}

From this, the expected value and variance of $\bar{X}_n^*$ come easily, but the squared term in $\bar{X}_n^{*2}$ is what is tripping me up.
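For very small samples, the target expression can at least be sanity-checked by brute force. Below is a minimal Python sketch, not a proof: it enumerates all $n^n$ equally likely resamples from the table above to get the exact conditional variance of $\bar{X}_n^{*2}$ given the data (what the $B$-sample formula approximates for large $B$), and evaluates the conjectured closed form next to it. The function names `exact_bootstrap_variance` and `conjectured_variance` and the sample values are made up for illustration.

```python
from itertools import product


def exact_bootstrap_variance(xs):
    """Exact variance of (resample mean)^2 over all n^n equally likely resamples.

    Only feasible for tiny n, since the number of resamples grows as n^n.
    """
    n = len(xs)
    # Each ordered resample (with replacement) has probability 1 / n^n.
    values = [(sum(resample) / n) ** 2 for resample in product(xs, repeat=n)]
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def conjectured_variance(xs):
    """Closed form from the question, with alpha_hat_k using absolute deviations."""
    n = len(xs)
    xbar = sum(xs) / n

    def alpha_hat(k):
        # Plug-in estimator: (1/n) * sum |X_i - xbar|^k
        return sum(abs(x - xbar) ** k for x in xs) / n

    return (
        4 * xbar**2 * alpha_hat(2) / n
        + 4 * xbar * alpha_hat(3) / n**2
        + alpha_hat(4) / n**3
    )


# Hypothetical sample, chosen only to keep the enumeration small (4^4 = 256 resamples).
sample = [1.0, 2.0, 4.0, 7.0]
print("exact      :", exact_bootstrap_variance(sample))
print("conjectured:", conjectured_variance(sample))
```

If the two numbers disagree for a given sample, that would suggest the closed form is either an approximation or depends on exactly how $\hat{\alpha}_k$ is defined (for example, with or without the absolute value for odd $k$).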

Does anyone have any ideas or possible solutions? Thanks.
