I'm reading Semi-Riemannian Geometry with Applications to Relativity by Barrett O'Neill and I'm trying to understand the definition of the gradient of a function $f$ on a Riemannian manifold. I know that the motivation for the definition is to preserve the identity $ \langle grad \ f , X \rangle = df(X)$, which holds in $\mathbb{R}^n$, where $X$ is a vector field. On the one hand, $df(X) = \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i$. On the other hand, $$\langle grad \ f , X \rangle = \langle \sum_{i=1}^{i=n} (grad \ f)^i \frac{\partial }{\partial x^i} , \sum_{j=1}^{j=n} X^j \frac{\partial }{\partial x^j} \rangle = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j \langle \frac{\partial }{\partial x^i} , \frac{\partial }{\partial x^j} \rangle = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij}.$$ If $ \langle grad \ f , X \rangle = df(X)$, then $\sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij} \ (*)$, but the author states that $$grad \ f := \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} g^{ij} \frac{\partial f}{\partial x^i} \frac{\partial }{\partial x^j}.$$ I know that $g^{ij}$ denotes the entries of the matrix $G^{-1}$, where $G = (g_{ij})$ is the matrix of the metric tensor, but I don't understand how to conclude that $grad \ f$ is given by this formula.
Thank you in advance for any help!
EDIT:
I tried to expand equation $(*)$ and I think I now understand how $grad \ f$ is defined. Here is my derivation.
$\sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} X^i = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} (grad \ f)^i \ X^j g_{ij} \Longrightarrow$
$\sum_{j=1}^{j=n} \frac{\partial f}{\partial x^j} X^j = \sum_{i=1}^{i=n} (grad \ f)^i \left( \sum_{j=1}^{j=n} (g_{ij} X^j) \right)$
In matrix form, we have
$[\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot [X^1 \cdots X^n]^T = [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \cdot [X^1 \cdots X^n]^T$, where $[X^1 \cdots X^n]^T$ is the transpose of the row matrix $[X^1 \cdots X^n]$. Then
$[\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot [X^1 \cdots X^n]^T - [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \cdot [X^1 \cdots X^n]^T = 0 \Longrightarrow$
$\left( [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] - [(grad \ f)^1 \cdots (grad \ f)^n] \cdot G \right) \cdot [X^1 \cdots X^n]^T = 0$.
Since this holds for every vector field $X$, the row matrix in parentheses must be zero, that is,
$[(grad \ f)^1 \cdots (grad \ f)^n] \cdot G = [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \Longrightarrow$
$[(grad \ f)^1 \cdots (grad \ f)^n] = [\frac{\partial f}{\partial x^1} \cdots \frac{\partial f}{\partial x^n}] \cdot G^{-1} \Longrightarrow$
$(grad \ f)^j = \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij}$. Since we can write $grad \ f = \sum_{j=1}^{j=n} (grad \ f)^j \frac{\partial }{\partial x^j}$, it follows that $grad \ f = \sum_{j=1}^{j=n} \left( \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij} \right) \frac{\partial }{\partial x^j} = \sum_{j=1}^{j=n} \sum_{i=1}^{i=n} \frac{\partial f}{\partial x^i} g^{ij} \frac{\partial }{\partial x^j}$, which is the author's formula.
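As a sanity check on this formula (my own example, not taken from the book), consider polar coordinates $(r, \theta)$ on $\mathbb{R}^2 \setminus \{0\}$, assuming the usual Euclidean metric, so that $g_{rr} = 1$, $g_{\theta\theta} = r^2$ and $g_{r\theta} = g_{\theta r} = 0$. Then
$$G = \begin{pmatrix} 1 & 0 \\ 0 & r^2 \end{pmatrix}, \qquad G^{-1} = \begin{pmatrix} 1 & 0 \\ 0 & \frac{1}{r^2} \end{pmatrix},$$
and the formula should give
$$grad \ f = \frac{\partial f}{\partial r} \frac{\partial }{\partial r} + \frac{1}{r^2} \frac{\partial f}{\partial \theta} \frac{\partial }{\partial \theta}.$$
Pairing this with $X = X^r \frac{\partial }{\partial r} + X^\theta \frac{\partial }{\partial \theta}$, the factor $r^2$ from $g_{\theta\theta}$ cancels the factor $\frac{1}{r^2}$ from $g^{\theta\theta}$:
$$\langle grad \ f , X \rangle = g_{rr} \frac{\partial f}{\partial r} X^r + g_{\theta\theta} \frac{1}{r^2} \frac{\partial f}{\partial \theta} X^\theta = \frac{\partial f}{\partial r} X^r + \frac{\partial f}{\partial \theta} X^\theta = df(X),$$
which is what the definition is supposed to guarantee.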