I'm trying to prove the chain rule. Could you please verify if my proof looks fine or contains logical gaps/errors? Thank you so much for your help!
Let $X, Y, G$ be normed vector spaces. Suppose $f: X \rightarrow Y$ is differentiable at $x_{0}$ and $g: Y \rightarrow G$ is differentiable at $y_{0}:=f\left(x_{0}\right)$. Then $g \circ f: X \rightarrow G$ is differentiable at $x_{0}$, and the derivative is given by $$\partial(g \circ f)\left(x_{0}\right) = \partial g\left(f\left(x_{0}\right)\right) \circ \partial f\left(x_{0}\right).$$
My attempt:
We have $f(x)=f\left(x_{0}\right) + \partial f\left(x_{0}\right)\left(x-x_{0}\right)+r(x)\left\|x-x_{0}\right\|$ for all $x \in X$ and $g(y) = g\left(y_{0}\right)+\partial g\left(y_{0}\right)\left(y-y_{0}\right)+s(y)\left\|y-y_{0}\right\|$ for $y \in Y$. Here $r: X \rightarrow Y$ and $s: Y \rightarrow G$ are continuous at $x_{0}$ and $y_{0}$ respectively. Moreover, $r\left(x_{0}\right)=0$ and $s\left(y_{0}\right)=0$.
Our goal is to find a function $t:X \to G$, continuous at $x_{0}$ with $t(x_0)=0$, such that $$(g \circ f)(x) = (g \circ f) \left(x_{0}\right) + \partial g\left(f\left(x_{0}\right)\right) \circ \partial f\left(x_{0}\right) \left(x-x_{0}\right)+t(x)\left\|x-x_{0}\right\|$$ for all $x \in X$. We substitute $y=f(x)$ into the expansion of $g$ and get
$$\begin{aligned} (g \circ f)(x) &= g\left(y_{0}\right)+\partial g\left(y_{0}\right)\left(f\left(x_{0}\right)+\partial f\left(x_{0}\right)\left(x-x_{0}\right)+r(x)\left\|x-x_{0}\right\|-y_{0}\right)\\ & \quad + s(y)\left\|y-y_{0}\right\|\\ &= g\left(y_{0}\right)+\partial g\left(y_{0}\right)\left(\partial f\left(x_{0}\right)\left(x-x_{0}\right)+r(x)\left\|x-x_{0}\right\|\right)\\ & \quad+ s(y)\left\|y-y_{0}\right\|\\ &= g\left(f(x_0)\right)+\partial g\left(f(x_0)\right) \circ \partial f\left(x_{0}\right)\left(x-x_{0}\right) \\ &\quad+\partial g\left(f(x_0)\right) \circ r(x)\left\|x-x_{0} \right\|+ s(y)\left\|y-y_{0}\right\| \end{aligned}$$
Equating $$g\left(f(x_0)\right)+\partial g\left(f(x_0)\right) \circ \partial f\left(x_{0}\right)\left(x-x_{0}\right) + \partial g\left(f(x_0)\right) \circ r(x)\left\|x-x_{0} \right\|+ s(y)\left\|y-y_{0}\right\|$$ and $$(g \circ f) \left(x_{0}\right) + \partial g\left(f\left(x_{0}\right)\right) \circ \partial f\left(x_{0}\right) \left(x-x_{0}\right)+t(x)\left\|x-x_{0}\right\|$$ we get
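$$t(x) \|x-x_0\| = \partial g\left(f(x_0)\right) \circ r(x)\left\|x-x_{0} \right\|+ s(y)\left\|y-y_{0}\right\|$$ and consequently $$\begin{aligned} t(x) &= \partial g\left(f(x_0)\right) \circ r(x) + s(y) \frac{\left\|y-y_{0}\right\|}{\left\|x-x_{0}\right\|}\\ &= \partial g\left(f(x_0)\right) \circ r(x) + s(f(x)) \left\| \frac{\partial f\left(x_{0}\right)\left(x-x_{0}\right)+r(x)\left\|x-x_{0}\right\|}{\|x-x_0\|} \right\| \\&= \partial g\left(f(x_0)\right) \circ r(x) + s(f(x)) \left\| \partial f\left(x_{0}\right) \frac{x-x_0}{\|x-x_0\|} +r(x) \right\|\end{aligned}$$ for all $x \neq x_0$. We further define $t(x_0)=0$. It remains to check that $t$ is continuous at $x_0$. Assuming, as is usual in normed spaces, that the derivatives $\partial f(x_0)$ and $\partial g(y_0)$ are bounded linear maps, we have $\left\| \partial f\left(x_{0}\right) \frac{x-x_0}{\|x-x_0\|} +r(x) \right\| \le \left\|\partial f(x_0)\right\| + \|r(x)\|$, which stays bounded near $x_0$. Moreover, $f$ is continuous at $x_0$ because it is differentiable there, so $s(f(x)) \to s(y_0) = 0$, and $\partial g\left(f(x_0)\right) \circ r(x) \to 0$ because $r(x) \to 0$. Hence $t(x) \to 0 = t(x_0)$ as $x \to x_0$, so $t$ satisfies our requirement and $\partial(g \circ f)\left(x_{0}\right) = \partial g\left(f\left(x_{0}\right)\right) \circ \partial f\left(x_{0}\right)$.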
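As a numerical sanity check (not a proof), the remainder $t(x)$ can be evaluated for a concrete finite-dimensional example. The sketch below assumes $X = Y = \mathbb{R}^2$ and $G = \mathbb{R}$ with the Euclidean norm; the maps `f` and `g`, their hand-computed Jacobians `df` and `dg`, and the helper `remainder_t` are illustrative choices and are not part of the argument above.

```python
import numpy as np

# Hypothetical example maps (assumptions for illustration only).
def f(x):
    # f: R^2 -> R^2
    return np.array([x[0] ** 2 + x[1], np.sin(x[0]) * x[1]])

def df(x):
    # Jacobian of f at x, computed by hand
    return np.array([[2.0 * x[0], 1.0],
                     [np.cos(x[0]) * x[1], np.sin(x[0])]])

def g(y):
    # g: R^2 -> R
    return np.exp(y[0]) + y[0] * y[1]

def dg(y):
    # Gradient of g at y (a row vector acting on R^2), computed by hand
    return np.array([np.exp(y[0]) + y[1], y[0]])

def remainder_t(x, x0):
    # t(x) = [ g(f(x)) - g(f(x0)) - dg(f(x0)) df(x0) (x - x0) ] / ||x - x0||
    h = x - x0
    linear_part = dg(f(x0)) @ (df(x0) @ h)
    return (g(f(x)) - g(f(x0)) - linear_part) / np.linalg.norm(h)

x0 = np.array([0.5, -1.0])
direction = np.array([1.0, 2.0]) / np.sqrt(5.0)  # unit vector

# |t(x)| along x = x0 + eps * direction for shrinking eps
step_sizes = (1e-2, 1e-3, 1e-4, 1e-5)
errors = [abs(remainder_t(x0 + eps * direction, x0)) for eps in step_sizes]

for eps, err in zip(step_sizes, errors):
    print(f"eps = {eps:.0e}   |t(x)| = {err:.3e}")
```

If the candidate derivative $\partial g(f(x_0)) \circ \partial f(x_0)$ is correct, the printed values of $|t(x)|$ should shrink as `eps` shrinks. Replacing `df` or `dg` with a wrong matrix should make them level off at a nonzero value instead.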