I am reading Lee's Introduction to Smooth Manifolds, Theorem 4.12, and I am stuck on a few statements in the proof.
Theorem 4.12 (Rank Theorem). Suppose $M$ and $N$ are smooth manifolds of dimensions $m$ and $n$, respectively, and $F \colon M\to N$ is a smooth map with constant rank $r$. For each $p\in M$ there exist smooth charts $(U,\varphi)$ for $M$ centered at $p$ and $(V, \psi)$ for $N$ centered at $F(p)$ such that $F(U) \subseteq V$, in which $F$ has a coordinate representation of the form $$ \hat{F}(x^1, \dots, x^r, x^{r+1}, \dots, x^m) = (x^1, \dots, x^r, 0, \dots, 0). $$
Proof. Because the theorem is local, after choosing smooth coordinates we can replace $M$ and $N$ by open subsets $U \subseteq \mathbb{R}^{m}$ and $V \subseteq \mathbb{R}^{n}$. The fact that $DF(p)$ has rank $r$ implies that its matrix has some $r \times r$ submatrix with nonzero determinant. By reordering the coordinates, we may assume that it is the upper left submatrix, $\left( \partial F^{i} / \partial x^{j} \right)$ for $i, j = 1, \ldots, r$.
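To make the reordering step concrete, here is a small sketch with a hypothetical Jacobian (the matrix, the swap $\sigma$, and the map $G$ below are illustrations introduced here, not taken from the book). Suppose $m = n = 2$, $r = 1$, and $$ DF(p) = \left( \begin{matrix} 0 & 0 \\ 0 & 1 \end{matrix} \right), $$ so the only nonzero $1 \times 1$ minor sits in row 2, column 2. Let $\sigma(x^1, x^2) = (x^2, x^1)$ be the coordinate swap on $\mathbb{R}^2$, a linear diffeomorphism with matrix $P = \left( \begin{matrix} 0 & 1 \\ 1 & 0 \end{matrix} \right)$ and $\sigma^{-1} = \sigma$. Setting $G = \sigma \circ F \circ \sigma$, the chain rule gives $$ DG(\sigma(p)) = P \, DF(p) \, P = \left( \begin{matrix} 1 & 0 \\ 0 & 0 \end{matrix} \right), $$ so the nonzero minor has moved to the upper left corner. Multiplying by $P$ on the left permutes rows (reordering the coordinates of the target) and multiplying on the right permutes columns (reordering the coordinates of the domain). Since $\sigma$ is a diffeomorphism, $G$ is smooth with the same constant rank as $F$, and charts that work for $G$ give charts for $F$ after composing with $\sigma$. For a general $r \times r$ submatrix one would presumably use a permutation of the target coordinates that moves its rows to positions $1, \ldots, r$ and a permutation of the domain coordinates that moves its columns to positions $1, \ldots, r$.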
Q.1. Why is the bold statement true? How does this mechanism work? At first glance it seems to work, but I can't yet justify it rigorously. Can anyone explain this in a more accessible way?
(Continuing the proof.) Let us relabel the standard coordinates as $(x, y) = \left( x^{1}, \ldots, x^{r}, y^{1}, \ldots, y^{m-r} \right)$ in $\mathbb{R}^{m}$ and $(v, w) = \left( v^{1}, \ldots, v^{r}, w^{1}, \ldots, w^{n-r} \right)$ in $\mathbb{R}^{n}$. By initial translations of the coordinates, we may assume without loss of generality that $p = (0, 0)$ and $F(p) = (0, 0)$. If we write $F(x, y) = \big( Q(x, y), R(x, y) \big)$ for some smooth maps $Q \colon U \to \mathbb{R}^{r}$ and $R \colon U \to \mathbb{R}^{n-r}$, then our hypothesis is that $\left( \partial Q^{i} / \partial x^{j} \right)$ is nonsingular at $(0, 0)$. Define $\varphi \colon U \to \mathbb{R}^{m}$ by $\varphi(x, y) = \big( Q(x, y), y \big)$. Its total derivative at $(0, 0)$ is $$ D\varphi(0, 0) = \left( \begin{matrix} \frac{\partial Q^{i}}{\partial x^{j}}(0, 0) & \frac{\partial Q^{i}}{\partial y^{j}}(0, 0) \\ 0 & \delta_{j}^{i} \end{matrix} \right), $$
where we have used the following standard notation: for positive integers $i$ and $j$, the symbol $\delta_{j}^{i}$, called the Kronecker delta, is defined by $$ \delta_{j}^{i} = \begin{cases} 1 & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{4.4} $$ The matrix $D\varphi(0, 0)$ is nonsingular by virtue of the hypothesis. Therefore, by the inverse function theorem, there are connected neighborhoods $U_0$ of $(0,0)$ and $\widetilde{U}_0$ of $\varphi(0,0) = (0,0)$ such that $\varphi \colon U_{0} \to \widetilde{U}_{0}$ is a diffeomorphism. By shrinking $U_0$ and $\widetilde{U}_0$ if necessary, we may assume that $\widetilde{U}_{0}$ is an open cube. Writing the inverse map as $\varphi^{-1}(x, y) = \big( A(x, y), B(x, y) \big)$ for some smooth functions $A \colon \widetilde{U}_0 \to \mathbb{R}^{r}$ and $B \colon \widetilde{U}_{0} \to \mathbb{R}^{m-r}$, we compute $$ (x, y) = \varphi\big( A(x, y), B(x, y) \big) = \big( Q\big( A(x, y), B(x, y) \big), B(x, y) \big). \tag{4.5} $$ Comparing $y$ components shows that $B(x, y) = y$, and therefore $\varphi^{-1}$ has the form $$ \varphi^{-1}(x, y) = \big( A(x, y), y \big). $$ On the other hand, $\varphi \circ \varphi^{-1} = \mathrm{Id}$ implies $Q\big( A(x, y), y \big) = x$, and therefore $F \circ \varphi^{-1}$ has the form $$ F \circ \varphi^{-1}(x, y) = \big( x, \widetilde{R}(x, y) \big), $$ where $\widetilde{R} \colon \widetilde{U}_{0} \to \mathbb{R}^{n-r}$ is defined by $\widetilde{R}(x, y) = R\big( A(x, y), y \big)$. The Jacobian matrix of this composite map at an arbitrary point $(x, y) \in \widetilde{U}_{0}$ is $$ D\big( F \circ \varphi^{-1} \big)(x, y) = \left( \begin{matrix} \delta_{j}^{i} & 0 \\ \frac{\partial \widetilde{R}^{i}}{\partial x^{j}}(x, y) & \frac{\partial \widetilde{R}^{i}}{\partial y^{j}}(x, y) \end{matrix} \right). $$
Since composing with a diffeomorphism does not change the rank of a map, this matrix has rank $r$ everywhere in $\widetilde{U}_0$. The first $r$ columns are obviously linearly independent, so the rank can be $r$ only if the derivatives $\partial \widetilde{R}^{i} / \partial y^{j}$ vanish identically on $\widetilde{U}_0$, which implies that $\widetilde{R}$ is actually independent of $(y^1, \dots, y^{m-r})$. (This is one reason we arranged for $\widetilde{U}_0$ to be a cube.) Thus, if we set $S(x) = \widetilde{R}(x, 0)$, then we have $$ F \circ \varphi^{-1}(x, y) = \big( x, S(x) \big). $$
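The last two steps can be unpacked as follows. This is only a sketch of one possible justification; the block names $C$, $E$ and the column names $c_k$, $e_l$ are introduced here, and it assumes that "open cube" means a product of open intervals, each of which contains $0$ because $(0,0) \in \widetilde{U}_0$. Write the Jacobian of $F \circ \varphi^{-1}$ in block form as $$ D\big( F \circ \varphi^{-1} \big)(x, y) = \left( \begin{matrix} I_r & 0 \\ C & E \end{matrix} \right), \qquad C = \left( \frac{\partial \widetilde{R}^{i}}{\partial x^{j}} \right), \quad E = \left( \frac{\partial \widetilde{R}^{i}}{\partial y^{j}} \right). $$ Let $c_1, \ldots, c_r$ be the first $r$ columns and $e_1, \ldots, e_{m-r}$ the remaining ones. If $\sum_k a_k c_k = 0$, then reading off the top $r$ entries, where the block is $I_r$, gives $(a_1, \ldots, a_r) = 0$, so $c_1, \ldots, c_r$ are linearly independent. Because the rank is exactly $r$, each $e_l$ must be a linear combination $e_l = \sum_k b_k c_k$. The top $r$ entries of $e_l$ are zero, so comparing top entries gives $b_k = 0$ for all $k$, hence $e_l = 0$ and $E = 0$ at every point of $\widetilde{U}_0$. For the cube: if $(x, y) \in \widetilde{U}_0$, then under the assumption above the segment $t \mapsto (x, ty)$, $0 \le t \le 1$, stays inside $\widetilde{U}_0$, so $$ \widetilde{R}(x, y) - \widetilde{R}(x, 0) = \int_0^1 \frac{d}{dt} \widetilde{R}(x, ty) \, dt = \int_0^1 \sum_{j=1}^{m-r} y^{j} \, \frac{\partial \widetilde{R}}{\partial y^{j}}(x, ty) \, dt = 0. $$ On a domain where such segments can leave the set, vanishing $y$-derivatives would only show that $\widetilde{R}$ is locally independent of $y$.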
(The rest of the proof is omitted.)
Q.2. Why is the second bold statement true? That is, why are the first $r$ columns of $D(F \circ \varphi^{-1})(x,y)$ 'obviously' linearly independent? And in showing that $\widetilde{R}$ is actually independent of $(y^1, \dots, y^{m-r})$, how is the condition that $\widetilde{U}_0$ is a cube used?
Can anyone help?