There's a slight mismatch in terminology as you have observed. The term "Legendre transform" is being used for two things with different meanings.
Say $X,Y$ are two vector bundles over $Q$ and $f:X\to Y$ is a smooth fiber-preserving map (not necessarily fiberwise linear). Then, we can define the object $\mathbf{F}f:X\to \text{Hom}(X,Y)$, by setting for each $x\in Q$ and $v\in X_x$,
\begin{align}
(\mathbf{F}f)(v):= D\left(f|_{X_x}\right)(v)\in \text{Hom}(X_x,Y_x)=\text{Hom}(X,Y)_x
\end{align}
One can show this is a smooth fiber-bundle morphism. I would prefer to call this object the Fiber Derivative. This is really the appropriate terminology because we're restricting $f$ to the fiber to get the mapping $f|_{X_x}:X_x\to Y_x$ between vector spaces, and we're taking the usual derivative of such an object.
Now, to such a morphism $f:X\to Y$, we can associate another fiber-preserving map $E_f:X\to Y$, defined by
\begin{align}
E_f(v):= (\mathbf{F}f(v))(v) - f(v)
\end{align}
The reason for the symbol $E$ is that it's kind of like the "energy mapping associated to $f$".
Now, suppose that the fiber derivative $\mathbf{F}f:X\to \text{Hom}(X,Y)$ is a fiber-bundle isomorphism. For this it is necessary that the fibers of $Y$ be one-dimensional: since $\dim \text{Hom}(X,Y)_x=\dim X_x\cdot \dim Y_x$, this is the only way for $X_x$ and $\text{Hom}(X,Y)_x$ to have the same dimension (assuming $X$ has positive rank). In this case, we can consider the mapping $\lambda_f:= E_f\circ (\mathbf{F}f)^{-1}:\text{Hom}(X,Y)\to Y$. Classically, it is this mapping $\lambda_f$ which is called the Legendre transform of $f$. So, given the function $f$, we consider its energy ($E_f$), and then change variables (compose with $(\mathbf{F}f)^{-1}$).
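As a quick sanity check, here is a toy example (my own choice of illustration, with the constant $a$ and the identification below being assumptions made just for this example). Take $Q$ to be a single point and $X=Y=\Bbb{R}$, with $f(v)=\tfrac{a}{2}v^2$ for some constant $a\neq 0$. Identifying $\text{Hom}(\Bbb{R},\Bbb{R})\cong \Bbb{R}$ by recording a linear map via its slope $p$, the definitions give
\begin{align}
\mathbf{F}f(v)&=av,\\
E_f(v)&=(av)\,v-\tfrac{a}{2}v^2=\tfrac{a}{2}v^2,\\
(\mathbf{F}f)^{-1}(p)&=\tfrac{p}{a},\\
\lambda_f(p)&=E_f\left(\tfrac{p}{a}\right)=\tfrac{p^2}{2a}.
\end{align}
In this example $E_f$ happens to agree with $f$ as a function of $v$, but $\lambda_f$ is a function of the slope variable $p$, which lives in a different space.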
As a special case, suppose $L:TQ\to \Bbb{R}$ is a smooth function (the Lagrangian, which we can trivially think of as a fiber-bundle map $TQ\to Q\times \Bbb{R}$, $v\mapsto (x,L(v))$ for $v\in T_xQ$, hence everything above can be applied). Then the fiber derivative is $\mathbf{F}L:TQ\to \text{Hom}(TQ,Q\times\Bbb{R})=T^*Q$; in terms of bundle coordinates, it is
\begin{align}
(x^1,\dots, x^n,v^1,\dots, v^n)\mapsto \left(x^1,\dots, x^n; \frac{\partial L}{\partial v^1}(x,v),\dots, \frac{\partial L}{\partial v^n}(x,v)\right)
\end{align}
Now, the energy function is $E=E_L:TQ\to\Bbb{R}$, which by unwinding the definitions, can be written in coordinates as
\begin{align}
(x,v)\mapsto v^i\frac{\partial L}{\partial v^i}(x,v)-L(x,v)
\end{align}
Or if we resort to the classical notation $(q^1,\dots, q^n,\dot{q}^1,\dots, \dot{q}^n)$ for the coordinates on $TQ$, then
\begin{align}
E&=\dot{q}^i\frac{\partial L}{\partial \dot{q}^i}-L,
\end{align}
which is the familiar formula for the energy of a Lagrangian system; it appears, for example, in the first few pages of Landau and Lifshitz.
Now, if we make the assumption that the fiber derivative $\mathbf{F}L:TQ\to T^*Q$ is a diffeomorphism (in which case $L$ is typically called a hyperregular Lagrangian), then we can consider the function $H:= E\circ (\mathbf{F}L)^{-1}:T^*Q\to\Bbb{R}$, and this is what we call the Hamiltonian function associated to the Lagrangian $L$. It is this function $H$ (defined on a completely different space) that is usually referred to as "the Legendre transform of $L$". In coordinates, people often write $H=\dot{q}^ip_i-L$, where $\dot{q}$ is understood as a function of $(q,p)$ obtained by inverting $p_i=\frac{\partial L}{\partial \dot{q}^i}$.
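To see the three objects side by side, here is a standard example, worked out under assumptions I'm choosing purely for illustration: $Q=\Bbb{R}^n$ with its usual coordinates, a mass $m>0$, and a smooth potential $V:Q\to\Bbb{R}$. Take $L(x,v)=\tfrac{m}{2}\delta_{ij}v^iv^j-V(x)$. Then
\begin{align}
\mathbf{F}L(x,v)&=(x,p),\qquad p_i=\frac{\partial L}{\partial v^i}(x,v)=m\,\delta_{ij}v^j,\\
E(x,v)&=v^i\frac{\partial L}{\partial v^i}(x,v)-L(x,v)=\tfrac{m}{2}\delta_{ij}v^iv^j+V(x),\\
(\mathbf{F}L)^{-1}(x,p)&=(x,v),\qquad v^i=\tfrac{1}{m}\delta^{ij}p_j,\\
H(x,p)&=E\left((\mathbf{F}L)^{-1}(x,p)\right)=\tfrac{1}{2m}\delta^{ij}p_ip_j+V(x).
\end{align}
Here $\mathbf{F}L$ is fiberwise linear and invertible, so this $L$ is hyperregular. The energy $E$ lives on $TQ$ and the Hamiltonian $H$ lives on $T^*Q$; they take the same values at corresponding points, but they are different functions on different spaces.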
So, there are two things to distinguish: the first is the fiber derivative, the second is the Legendre transformation (which is the composition of the "energy" with the inverse of the fiber derivative). Often though, people use "Legendre transform" to mean either of these things.