
Let $X_1, X_2, \dots$ be i.i.d. (possibly heavy-tailed) with distribution function $F$. The notation $X_{(k)}$ denotes the $k$-th order statistic, i.e. $X_{(1)}=\min_{i\leq n} X_i$.

Let $k_n\in\mathbb{N}$ fulfill $$k_n\to\infty, \frac{k_n}{n}\to 0 \text{, as } n\to\infty.$$

Is the following true

$$F(X_{(n-k_n+1)})\overset{n\to\infty}{\to} 1^{-}?$$

If not, what "reasonable" conditions should we assume so it will hold?

A more advanced version of this question, for dependent $X_i$, is also posted here: $F(X_{(n-k_n)})\overset{n\to\infty}{\to} 1$ for time series?
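
For intuition, here is a minimal simulation sketch of the claim (my own illustration, not part of the question): it assumes a classical Pareto df with tail index $1$, so no finite mean, and the arbitrary choice $k_n = \lfloor\sqrt{n}\rfloor$. The printed values should climb toward $1$ as $n$ grows.

    import numpy as np

    rng = np.random.default_rng(0)
    alpha = 1.0  # Pareto tail index; alpha = 1 means E[X] is infinite (heavy tailed)

    def F(t):
        # df of the classical Pareto(alpha) distribution on [1, infinity)
        return 1.0 - t ** (-alpha)

    for n in [10**3, 10**4, 10**5, 10**6]:
        k_n = int(np.sqrt(n))                     # k_n -> infinity, k_n / n -> 0
        sample = rng.pareto(alpha, size=n) + 1.0  # classical Pareto on [1, infinity)
        x_stat = np.sort(sample)[n - k_n]         # X_{(n - k_n + 1)} with 0-based indexing
        print(n, F(x_stat))                       # should approach 1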

Albert Paradek

1 Answer


One short thing to note first: the order statistics depend on the number of i.i.d. elements being sampled, so I'll write $X^n_{(i)}$ for the $i$-th order statistic of $\{X_j\}_{j=1}^n$.

The proof follows from the strong law of large numbers. Let $p \in (0,1)$ be a value attained by $F$, i.e. there exists $x \in \mathbb{R}$ with $F(x) = p$ (such a $p$ always exists unless $X_i$ is non-random, in which case the result is obvious). Define the i.i.d. Bernoulli random variables $B_i = \mathbb{I}\{X_i\leq x\}$, so that each $B_i = 1$ with probability $p$. Then by the strong law of large numbers,

$$\lim_{n\to\infty} \frac{\sum_{i=1}^n B_i}{n} = \mathbb{E}[B_1] = F(x) = p \text{ a.s.}.$$

Let $m_n$ be the smallest index such that $X^n_{(m_n)} > x$, and if $X^n_{(n)} \leq x$ then we say $m_n = n+1$. Note that $m_n = 1 + \sum_{i=1}^n B_i$. Then,

$$\lim_{n\to\infty} \frac{m_n}{n} = \lim_{n\to\infty} \left(\frac{1}{n} + \frac{\sum_{i=1}^n B_i}{n}\right) = p \text{ a.s.}.$$
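
A quick numerical check of this step (my own sketch; the standard normal df and the threshold $x = 0.5$ are arbitrary illustrative choices): $m_n/n$ should settle near $p = F(x)$.

    import math
    import numpy as np

    rng = np.random.default_rng(1)
    x = 0.5
    p = 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))  # p = F(x) for a standard normal F

    for n in [10**3, 10**5, 10**7]:
        sample = rng.standard_normal(n)
        m_n = 1 + int(np.sum(sample <= x))  # m_n = 1 + sum of the indicators B_i
        print(n, m_n / n, p)                # m_n / n should be close to p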

Note that,

$$\lim_{n\to\infty} \frac{n-k_n+1}{n} = \lim_{n \to\infty} \left(1 - \frac{k_n}{n} + \frac{1}{n}\right) = 1.$$

Since $m_n/n \to p < 1$ almost surely while $(n-k_n+1)/n \to 1$, the inequality $n-k_n+1 > m_n$ holds for all sufficiently large $n$ almost surely. So there exists an almost surely finite random integer $N$ such that $n-k_n + 1 > m_n$ on the event $\{N\leq n\}$. Thus,

$$\mathbb{P}\left(F(X^n_{(n-k_n+1)})\geq p\right) \geq \mathbb{P}\left(X^n_{(n-k_n+1)} > x\right) \geq \mathbb{P}(N\leq n) \overset{n\to\infty}{\longrightarrow} 1,$$

so $\mathbb{P}\left(F(X^n_{(n-k_n+1)})\geq p\right) \to 1$ as well.

Now we have two cases to consider.

Case 1: There exists a sequence $p_i \to 1^-$ such that for every $i$, there exists an $x_i$ such that $F(x_i) = p_i$ (this holds, for instance, whenever $F$ is continuous near the right endpoint of its support).

In this case, the argument above applies to each $p_i$ (with $x = x_i$). For any $\epsilon > 0$, let $i$ be sufficiently large that $1 - p_i < \epsilon$. Then,

$$\mathbb{P}\left(|1 - F(X^n_{(n-k_n+1)})| > \epsilon\right) \leq \mathbb{P}\left(F(X^n_{(n-k_n+1)}) < p_i\right) \overset{n\to\infty}{\longrightarrow} 0.$$

Case 2: We cannot find such a sequence. This only happens if there exists an $\overline{x} \in \mathbb{R}\cup\{\pm\infty\}$ such that $\mathbb{P}(X_i \geq \overline{x}) = \mathbb{P}(X_i =\overline{x}) = q$ for some $q > 0$, i.e. $F$ jumps to $1$ at an atom $\overline{x}$ of mass $q$ (for example, a distribution supported on finitely many points). In this case, we can use a similar argument to the one above (again, assuming $X_i$ is random). Let $C_i = \mathbb{I}\{X_i = \overline{x}\}$. Then $\{C_i\}$ are i.i.d. Bernoulli random variables with expectation $q > 0$. Let $M_n = \max\{i: X^n_{(i)} < \overline{x}\}$, with $M_n = 0$ if no such index exists. Then by the strong law of large numbers,

$$\lim_{n\to\infty} \frac{M_n}{n} = \lim_{n\to\infty} \frac{n - \sum_{i=1}^n C_i}{n} = 1 - q\text{ a.s.}.$$

But,

$$\lim_{n\to\infty} \frac{n-k_n+1}{n} = 1.$$

Since $M_n/n \to 1-q < 1$ almost surely while $(n-k_n+1)/n \to 1$, there exists an almost surely finite random integer $M$ such that $n-k_n+1 > M_n$ on the event $\{M \leq n\}$. Finally,

$$\mathbb{P}\left(F(X^n_{(n-k_n+1)}) = 1\right) \geq \mathbb{P}(n-k_n+1 > M_n) \geq \mathbb{P}(M\leq n) \overset{n\to\infty}{\longrightarrow} 1. \tag*{$\blacksquare$}$$

This concludes the proof that $F(X^n_{(n-k_n+1)}) \to 1$ in probability. Note that this proof holds very generally: $X_i$ does not need to have any moments at all, and it does not need to be continuous or discrete. The argument works for any Borel measurable random variable taking values in the extended reals ($\mathbb{R}\cup\{\pm \infty\}$).
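
As a concrete Case 2 sanity check, here is a small simulation sketch (my own illustration, using an arbitrary three-point distribution with an atom of mass $q = 0.2$ at its largest support point): $F(X^n_{(n-k_n+1)})$ should equal exactly $1$ once $n$ is moderately large.

    import numpy as np

    rng = np.random.default_rng(2)
    support = np.array([0.0, 1.0, 2.0])
    probs = np.array([0.5, 0.3, 0.2])  # atom of mass q = 0.2 at the essential supremum 2.0

    def F(t):
        # df of the discrete distribution above
        return probs[support <= t].sum()

    for n in [10**2, 10**3, 10**4, 10**5]:
        k_n = int(np.sqrt(n))
        sample = rng.choice(support, size=n, p=probs)
        x_stat = np.sort(sample)[n - k_n]  # X_{(n - k_n + 1)}
        print(n, F(x_stat))                # equals 1.0 once n - k_n + 1 exceeds M_n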

Also, I proved convergence in probability here. That is, I showed that for any $p < 1$, $\mathbb{P}(F(X^n_{(n-k_n+1)}) > p) = 1 - o(1)$. To get almost sure convergence, we would need to show that the $o(1)$ term is summable, and then we could apply the Borel-Cantelli Lemma.
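
For example, one possible route (a sketch I am adding here, not part of the argument above) uses Hoeffding's inequality. The event $\{F(X^n_{(n-k_n+1)}) < p\}$ is contained in $\{X^n_{(n-k_n+1)} \leq x\}$, which happens exactly when at least $n-k_n+1$ of the $X_i$ satisfy $X_i \leq x$, i.e. when $\sum_{i=1}^n B_i \geq n-k_n+1$. Since the $B_i$ are i.i.d., bounded, and have mean $p$, Hoeffding's inequality gives, for all $n$ large enough that $\frac{k_n}{n} \leq \frac{1-p}{2}$,

$$\mathbb{P}\left(\sum_{i=1}^n B_i \geq n-k_n+1\right) \leq \exp\left(-2n\left(1-p-\frac{k_n}{n}+\frac{1}{n}\right)^2\right) \leq \exp\left(-\frac{n(1-p)^2}{2}\right),$$

which is summable in $n$. By the Borel–Cantelli lemma, almost surely $F(X^n_{(n-k_n+1)}) \geq p$ for all sufficiently large $n$; letting $p = p_i \to 1^-$ along the sequence of Case 1 (and arguing analogously with the $C_i$ in Case 2) upgrades the convergence to almost sure.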

  • I somehow fail to understand this:

    you wrote "if $m_n$ is the smallest index such that $X^n_{(m_n)} > x$, then $m_n = pn + o(n)$." This is written too vaguely. There is positive probability that $X^n_{(n)} < x$, and the notation doesn't make much sense in that case. Also, what does it mean that $m_n = pn + o(n)$? For fixed $n$ it does not really make sense, and you just write "for sufficiently large $n$", which I am not sure how to interpret.

    – Albert Paradek May 06 '21 at 10:16
  • Sure, I wrote that fairly quickly last night. Today's busy for me, but I'll take some time tomorrow to make it more rigorous. To directly answer your questions, all of this is in the limit. So for any finite $n$, $X^n_{(n)} <x$ can happen with positive probability. However, for any $p' > p$, $\lim_{n\to\infty}P(X^n_{(p'n)} < x) = 0$ and for any $p'' < p$, $\lim_{n\to\infty} P(X^n_{(p''n)} < x) = 1$. Thus, we expect $\frac{m_n}{n}$ to converge to $p$, which is another way of saying $m_n = pn + o(n)$. – forgottenarrow May 06 '21 at 17:24
  • I rewrote the solution. Is this more clear? – forgottenarrow May 07 '21 at 21:56
  • Hi, yes it is, thanks for that. I just found out that I need something slightly different actually, and I wasn't able to figure out whether it can be done similarly to your solution. I posted it on a different page -- if you can solve that, I can also give you some bounty for it. https://math.stackexchange.com/q/4138782/615430 – Albert Paradek May 14 '21 at 13:12
  • Looks interesting. I'm still new to large deviations, but this looks doable. I'll try it out this weekend. – forgottenarrow May 15 '21 at 05:19