The issue is that I have to prove the following with limited resources.
Problem: Let $X_n$ and $Y_n$ be p-dimensional random vectors. Show that if $X_n − Y_n \xrightarrow{P} 0$ and $X_n \xrightarrow{D} X$ , where $X$ is a p-dimensional random vector, then $Y_n \xrightarrow{D} X$.
This is from Hogg and McKean's "Introduction to Mathematical Statistics". In their Section $5.4$, they address convergence theorems in the multivariate case, but they give only a very limited set of theorems to work with; that set of theorems and definitions is reproduced at the end of this post. They say that using #$4$ of these theorems, many of the univariate theorems can be extended to the multivariate case, some of which appear in the exercises. I'd assume this problem is one of them.
My attempt: We have componentwise convergence in probability of $X_n-Y_n$ to $0$, and since each coordinate projection is continuous, we have componentwise convergence in distribution of $X_n$ to $X$. Then, writing $Y_{nj} = X_{nj} + (Y_{nj}-X_{nj})$, #$6$ (the univariate Slutsky-type result listed below) gives componentwise convergence in distribution of $Y_n$ to $X$, i.e. $Y_{nj} \xrightarrow{D} X_j$ for each $j$. But componentwise convergence in distribution is not known to imply convergence in distribution of the vector itself, since nothing is assumed about the dependence between the components (see the example after this paragraph). So, given these resources, how can we prove the statement? Please note that characteristic functions have not been introduced or discussed at this point; mgf-related theorems are available, but we are not told that the mgfs of the given vectors exist. Please help.
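For concreteness, here is a standard example (my own addition, not from the book) showing that componentwise convergence in distribution alone cannot be enough: take $Z \sim N(0,1)$ and define
$$X_n = \begin{cases} (Z,\; Z), & n \text{ even},\\ (Z,\; -Z), & n \text{ odd}.\end{cases}$$
Each component of $X_n$ has the $N(0,1)$ distribution for every $n$, so both components converge in distribution, yet the joint law of $X_n$ alternates between the laws of $(Z,Z)$ and $(Z,-Z)$, so $X_n$ does not converge in distribution as a vector. Some control of the joint behavior, such as the hypothesis $X_n - Y_n \xrightarrow{P} 0$ here, is therefore essential.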
Theorems/Definitions at our disposal:
For Vectors:
- Let $\{X_n\}$ be a sequence of random vectors that converges in distribution to a random vector $X$ and let $g(x)$ be a function that is continuous on the support of $X$. Then $g(X_n)$ converges in distribution to $g(X).$
- Let $\{X_n\}$ be a sequence of random vectors with $X_n$ having distribution function $F_n(x)$ and $X$ be a random vector with distribution function $F(x)$. Then $\{X_n\}$ converges in distribution to $X$ if $\lim_{n \to \infty} F_n(x) = F(x)$, for all points $x$ at which $F(x)$ is continuous. We write $X_n \xrightarrow{D} X.$
- Let $\{X_n\}$ be a sequence of p-dimensional vectors and let $X$ be a random vector, all defined on the same sample space. Then $X_n \xrightarrow{P} X$ if and only if $X_{nj} \xrightarrow{P} X_j$ for all $j=1,...,p.$
Here, $X_{nj}$ denotes the $j$-th component of $X_n$ and $X_j$ the $j$-th component of $X$.
- Let $\{X_n\}$ be a sequence of p-dimensional vectors and let $X$ be a random vector, all defined on the same sample space. We say that $X_n$ converges in probability to $X$ if $ \lim_{n \to \infty} P[\lVert X_n − X \rVert \geq \epsilon] = 0,$ for all $\epsilon > 0$. As in the univariate case, we write $X_n \xrightarrow{P} X.$
- Let $v' = (v_1,...,v_p)$ be any vector in $\mathbb{R}^p$. Then $\lvert v_j \rvert \leq \lVert v \rVert \leq \sum_{i=1}^p \lvert v_i \rvert $, for all $j = 1,...,p$. (A short sketch after this list shows how this inequality is used to get componentwise statements.)
For the univariate case:
Suppose $X_n$ converges to $X$ in distribution and $Y_n$ converges in probability to $0$. Then $X_n + Y_n$ converges to $X$ in distribution.
If $X_n$ converges to $X$ in probability, then $X_n$ converges to $X$ in distribution.
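As a quick sanity check on how these tools fit together (my own sketch, not from the book), the two-sided norm inequality is exactly what delivers the componentwise characterization of convergence in probability: for each $j$ and each $\epsilon > 0$,
$$P(\lvert X_{nj}-X_j\rvert \geq \epsilon) \;\leq\; P(\lVert X_n-X\rVert \geq \epsilon) \;\leq\; \sum_{i=1}^p P\!\left(\lvert X_{ni}-X_i\rvert \geq \tfrac{\epsilon}{p}\right).$$
The first inequality holds because $\lvert X_{nj}-X_j\rvert \leq \lVert X_n-X\rVert$, and the second because $\lVert X_n-X\rVert \geq \epsilon$ forces $\lvert X_{ni}-X_i\rvert \geq \epsilon/p$ for at least one $i$ (then apply the union bound). Letting $n \to \infty$ on the left or on the right gives the two directions of the equivalence.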
Edit (Dec 25th 2023): I finally managed to type up the proof based on @Aphelli's hint.
Write the random vectors componentwise as $Y_n = (Y_{n1},Y_{n2},...,Y_{np})$ and $X_n = (X_{n1},X_{n2},...,X_{np})$, and let $x=(x_1,x_2,...,x_p)$ be a fixed point in $\mathbb{R}^p$. The cdf of $Y_n$ at $x$ is the probability of the event $\{Y_{n1}\leq x_1, Y_{n2}\leq x_2, ..., Y_{np}\leq x_p\}$. Denote this event by $A(Y_n,x) := \{Y_{n1}\leq x_1, Y_{n2}\leq x_2, ..., Y_{np}\leq x_p\}$ (and similarly $A(X_n,\cdot)$ for $X_n$), so that $F_{Y_n}(x) = P(A(Y_n,x))$.
The usual trick is to split this probability into two parts: for any $\epsilon > 0$, $$\begin{align} P(A(Y_n,x)) &= P(A(Y_n,x),\lVert X_n-Y_n\rVert \leq \epsilon) + P(A(Y_n,x),\lVert X_n-Y_n\rVert > \epsilon) \\ & \leq P(A(Y_n,x),\lVert X_n-Y_n\rVert \leq \epsilon) + P(\lVert X_n-Y_n\rVert \geq \epsilon).\end{align}$$
Let $dt=(\epsilon,\epsilon,...,\epsilon)$ be the p-dimensional vector whose components are all $\epsilon$. From #5 above (the norm inequality), $\lvert X_{nj} - Y_{nj}\rvert \leq \lVert X_n - Y_n\rVert$, so on the event $\lVert X_n - Y_n\rVert \leq \epsilon$ we have $\lvert X_{nj} - Y_{nj}\rvert \leq \epsilon$ for every $j$. On the event $A(Y_n,x)$ we also have $Y_{nj} \leq x_j$ for every $j$. Together these give $X_{nj} \leq x_j + \epsilon$ for every $j$, i.e. the event $\{A(Y_n,x), \lVert X_n-Y_n\rVert \leq \epsilon\}$ is contained in $A(X_n,x+dt)$, so $P(A(Y_n,x),\lVert X_n-Y_n\rVert \leq \epsilon) \leq P(A(X_n,x+dt))$. Combining this with the split above, $$P(A(Y_n,x)) \leq P(A(X_n,x+dt)) + P(\lVert X_n-Y_n\rVert \geq \epsilon).$$ Since $X_n \xrightarrow{D} X$ and $X_n - Y_n \xrightarrow{P} 0$, taking $\limsup$ gives $$\limsup F_{Y_n}(x) = \limsup P(A(Y_n,x)) \leq F_X(x+dt)$$ (strictly speaking, this step uses $F_{X_n}(x+dt) \to F_X(x+dt)$, so $\epsilon$ should be chosen so that $x+dt$ is a continuity point of $F_X$; that can fail for at most countably many values of $\epsilon$, so we may restrict to the remaining ones). In a similar fashion (worked out below), we can derive $F_{X}(x-dt) \leq \liminf F_{Y_n}(x).$ So we have $$F_{X}(x-dt) \leq \liminf F_{Y_n}(x) \leq \limsup F_{Y_n}(x) \leq F_X(x+dt).$$ Now let $x$ be a continuity point of $F_X$ and let $\epsilon \to 0$ through admissible values: then $F_X(x \pm dt) \to F_X(x)$, so $\lim_{n\to\infty} F_{Y_n}(x) = F_X(x)$ at every continuity point $x$, i.e. $Y_n \xrightarrow{D} X.$
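For completeness, here is the "similar fashion" step written out (this is just the same splitting trick with the roles of $X_n$ and $Y_n$ swapped): for any admissible $\epsilon > 0$,
$$P(A(X_n,x-dt)) \leq P(A(X_n,x-dt),\lVert X_n-Y_n\rVert \leq \epsilon) + P(\lVert X_n-Y_n\rVert \geq \epsilon).$$
On the event $\{A(X_n,x-dt), \lVert X_n-Y_n\rVert \leq \epsilon\}$ we have $X_{nj} \leq x_j - \epsilon$ and $\lvert X_{nj}-Y_{nj}\rvert \leq \epsilon$ for every $j$, hence $Y_{nj} \leq x_j$; that is, this event is contained in $A(Y_n,x)$. Therefore
$$F_{X_n}(x-dt) \leq F_{Y_n}(x) + P(\lVert X_n-Y_n\rVert \geq \epsilon),$$
and taking $\liminf$ (again with $\epsilon$ chosen so that $x-dt$ is a continuity point of $F_X$) gives $F_X(x-dt) \leq \liminf F_{Y_n}(x)$.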
I'd appreciate it if you could provide feedback on whether this proof is correct and, if it has errors, kindly point them out for me.