This might be a little late, but note that you are missing an important assumption on $X$, namely that it should be compact and Hausdorff. This matters because compact Hausdorff spaces are paracompact, so we can use partitions of unity subordinate to open covers.
The following proofs are the same as the nLab link in the comments, but as always nLab is a bit of a hard read. I will try and write these in a manner so that one might feel they could have discovered these themselves. Along the way, we will see that compact and Hausdorff are very natural assumptions to make.
Essentially, there are three steps.
- Try and come up with a suitable definition for an inner product
- If $E_1$ is a sub-bundle of $E$, use the aforementioned inner product to define some notion of an orthogonal complement $E_2$, which is suitable enough to imply that it is a vector bundle and that:
$$E\cong E_1 \oplus E_2$$
- Construct a trivial bundle that $E_1$ embeds into
The constructed trivial bundle in the final step is a bit contrived, but the bigger picture here is that this is necessary to define inverses in $K$-theory, so it is good enough to work within that context. Throughout this proof, points in a vector bundle will be referred to as $(x,v)$. Though this is an abuse of notation, as long as you fix $x$ and only vary $v$ this is justified.
Step 1:
Intuition: First let us try and figure out what it is that we actually should try and define. Since the goal is to find another vector bundle that Whitney sums with $E_1$ to $E$, the way we define an "inner product"/"orthogonal complement" should be compatible with the Whitney sum (fiberwise direct sum). This tells us we are looking for something fiberwise, i.e. not a "global" map $E\times E\to \mathbb F$ on the topological product, but a map $E\oplus E\to \mathbb F$ on the Whitney sum.
Two good starting points might be the standard inner product on each fiber or the local trivializations. While we are looking for something fiberwise, we will need to somehow capture local information, because later we will need some "source" of continuity in the local trivializations of the orthogonal complement. Therefore, it might make sense to consider the latter instead.
One natural way to glue them all together is to make the assumption that $X$ is paracompact and Hausdorff, so that we may use partitions of unity. This is a standard technique and is often applied elsewhere (as the comments note, the famous proof that every smooth second countable manifold admits a Riemannian metric uses the same technique and is effectively a special case of this step in the smooth setting).
Concretely: Let $\{h_i:\pi^{-1}(U_i)\to U_i \times \mathbb F^n\}$ be the local trivializations. Each $h_i$ pulls back the standard inner product on $\mathbb F^n$ to an inner product $\langle-,-\rangle_i$ on each fiber over $U_i$. If $\{\phi_j\}$ is a partition of unity subordinate to the cover $\{U_i\}$, with each $\phi_j$ supported in $U_{i(j)}$, we may define an "inner product" on $E$ as:
$$\langle-,-\rangle_E: E\oplus E \to \mathbb F$$
$$(x,v)\oplus(x,w)\mapsto \sum_j \phi_j(x)\langle v,w\rangle_{i(j)}$$
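Since the partition of unity is locally finite, only finitely many of the summands are nonzero near each $x$, so this is well defined and continuous. (A term with $x\notin U_{i(j)}$ is read as $0$, since $\phi_j(x)=0$ there.)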
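As a quick check that this really is an inner product on each fiber, here is a sketch of positive definiteness. It assumes the usual conventions for a partition of unity, $\phi_j\ge 0$ and $\sum_j\phi_j(x)=1$, and picks an index $j_0$ (notation introduced here) with $\phi_{j_0}(x)>0$. For $v\neq 0$ in the fiber over $x$:

$$\langle (x,v),(x,v)\rangle_E=\sum_j \phi_j(x)\langle v,v\rangle_{i(j)}\ \ge\ \phi_{j_0}(x)\langle v,v\rangle_{i(j_0)}>0$$

The last inequality holds because $h_{i(j_0)}$ is a linear isomorphism on the fiber, so it sends $v\neq 0$ to a nonzero vector of $\mathbb F^n$. Bilinearity (or sesquilinearity, when $\mathbb F=\mathbb C$) and symmetry are inherited term by term.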
Step 2:
Intuition: Now we have some notion of "inner product", let us use it to find the "orthogonal complement" of $E_1$ in $E$. Then if we are lucky our definition of inner product is sufficient to show this indeed is a vector bundle and Whitney sums with $E_1$ to $E$.
Concretely: We are given a vector bundle injection $i:E_1 \hookrightarrow E$. This map must be a linear injection on each fiber. Use the inner product to define a new space $E_2$ whose fiber over each fixed $x\in X$ consists of all points $(x,v)$ in $E$ such that:
$$\langle (x,v),i(x,v')\rangle_E = 0$$
for all $(x,v')$ in $E_1$. For a fixed basepoint $x$, every element of the fiber over $x$ in $E$ decomposes as a sum of two elements, one in the fiber of $i(E_1)$ and one in the fiber of $E_2$. This decomposition must be unique, since the vector bundle inner product restricts to an ordinary inner product on each fiber. If $E_2$ is indeed a vector bundle, then $E$ fits the definition of the Whitney sum of $E_1$ and $E_2$.
Clearly, each fiber of $E_2$ is a vector space, say of dimension $n_2$. Thus it remains to check the local trivializations of $E_2$. For each point $x$ in $X$, we may always find $n$ continuous sections over all of $X$ into $E$ that are linearly independent at $x$. We may project these onto $E_2$ fiberwise, which gives us $n$ continuous sections over all of $X$ into $E_2$ that span the fiber at $x$. We may perform Gram-Schmidt over the entire space. From this, we obtain $n$ sections over all of $X$ that may be $0$ in some places. We may discard all sections that are $0$ at $x$. Since the original sections spanned the entire fiber, we must be left with $n_2$ sections over the entire space that are linearly independent at $x$. In a local trivialization of $E$ around $x$, these form a continuous function from a neighborhood of $x$ to the space of all $n\times n_2$ matrices. Some $n_2\times n_2$ minor has nonzero determinant at $x$, and since the determinant is continuous, there is some neighborhood of $x$ where the $n_2$ sections are linearly independent. This gives us a local trivialization at $x$.
The local trivialization part is an expanded version of the argument in Hatcher.
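To make the fiberwise decomposition concrete, here is a small Python sketch on a toy example that is not part of the argument above. It assumes $E$ is the trivial bundle $S^1\times\mathbb R^2$ with the standard inner product, and $E_1$ is the Möbius line bundle, whose fiber over the angle $\theta$ is taken to be the line spanned by $(\cos(\theta/2),\sin(\theta/2))$. The function names are made up for illustration.

```python
import math

def e1_direction(theta):
    """Unit vector spanning the fiber of E_1 over the angle theta (assumed model)."""
    return (math.cos(theta / 2), math.sin(theta / 2))

def dot(v, w):
    """Standard inner product on R^2, playing the role of <-,->_E on each fiber."""
    return v[0] * w[0] + v[1] * w[1]

def decompose(theta, v):
    """Split v in the fiber of E over theta into its E_1 part and its E_2 part."""
    u = e1_direction(theta)
    c = dot(v, u)
    v1 = (c * u[0], c * u[1])          # orthogonal projection onto the E_1 fiber
    v2 = (v[0] - v1[0], v[1] - v1[1])  # remainder, orthogonal to the E_1 fiber
    return v1, v2

v1, v2 = decompose(1.0, (2.0, -3.0))
print(v1, v2, dot(v1, v2))
```

Going once around the circle replaces the spanning vector by its negative, but the projection `v1` does not change, which is why the decomposition is well defined on the fibers even though $E_1$ has no global nonvanishing section.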
Step 3:
Intuition: One natural "source" of triviality are the local trivializations. Thus we should be motivated to find a way to glue these together somehow.
We may be inspired by the strategy used in step 1. If we have an $n$-dimensional vector bundle with a partition of unity subordinate to the local trivializations, we can glue all of the local trivializations using direct sums. However, this requires there to be finitely many maps in the partition of unity.
So far the only assumption we have used is that the space is paracompact and Hausdorff. However, this doesn't imply that a partition of unity has only finitely many functions. We may note, though, that a partition of unity subordinate to a finite open cover can be taken to be finite, so compactness might be a reasonable assumption to consider. Since compact and Hausdorff implies paracompact, if this works out we may state the theorem that way.
Concretely: Since $X$ is compact, we may find a finite subcover of the cover $\{U_i\}$ by trivializing open sets. Since $X$ is also Hausdorff, it is paracompact, so there is a partition of unity subordinate to this finite subcover. A partition of unity is locally finite, so by compactness only finitely many of its functions are nonzero. Say there are $m$ functions $\{\phi_j\}$ in our partition of unity. Using this, we may upgrade our local trivializations $\{h_i\}$ from maps on $\pi^{-1}(U_i)$ to maps on $E$ by:
$$f_j: E\to X\times \mathbb F^n$$
$$(x,v)\mapsto\phi_j(x)(h_{i(j)}(x,v))$$
Here the scalar $\phi_j(x)$ multiplies the fiber component of $h_{i(j)}(x,v)$, and $f_j(x,v)=(x,0)$ for $x$ outside $U_{i(j)}$. Since the $h_i$ are all fiberwise linear and injective, each $f_j$ is fiberwise linear, and it is fiberwise injective wherever $\phi_j(x)\neq 0$.
Gluing these together as $f=\oplus_j f_j$, we get:
$$f:E\to X\times \mathbb F^{mn}$$
$$(x,v)\mapsto \bigoplus_j\phi_j(x)(h_{i(j)}(x,v))$$
The map is linear fiberwise since each $f_j$ is. Since the $\phi_j$ sum to $1$, at each point $x$ at least one $\phi_j(x)$ is nonzero, and the corresponding $f_j$ is injective on the fiber over $x$. Hence $f$ must also be injective fiberwise.
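The gluing can be tried out on a toy case. The following Python sketch is an illustration under assumptions of my own, not taken from nLab or Hatcher. It uses the Möbius line bundle over the circle ($n=1$), modelled as pairs $(\theta,t)$ with $\theta\in[0,2\pi)$ and $(2\pi,t)$ identified with $(0,-t)$, two charts `h0` on $\{\theta\neq 0\}$ and `h1` on $\{\theta\neq\pi\}$, and a continuous piecewise linear partition of unity `phi0`, `phi1` ($m=2$). All of these names and choices are hypothetical.

```python
import math

def dist_to_zero(theta):
    """Distance along the circle from the angle theta to the angle 0."""
    theta = theta % (2 * math.pi)
    return min(theta, 2 * math.pi - theta)

def phi0(theta):
    """Bump function that vanishes near theta = 0, so its support lies in U_0."""
    return min(1.0, max(0.0, dist_to_zero(theta) - 0.5))

def phi1(theta):
    """Complementary bump; it vanishes near theta = pi, so its support lies in U_1."""
    return 1.0 - phi0(theta)

def h0(theta, t):
    """Fiber coordinate in the chart over U_0 = {theta != 0}."""
    return t

def h1(theta, t):
    """Fiber coordinate in the chart over U_1 = {theta != pi}; the sign flips past pi."""
    return t if theta < math.pi else -t

def embed(theta, t):
    """Fiber part of f(x, v) = (phi_0 * h_0, phi_1 * h_1), extended by zero off each chart."""
    p0, p1 = phi0(theta), phi1(theta)
    a = p0 * h0(theta, t) if p0 > 0 else 0.0
    b = p1 * h1(theta, t) if p1 > 0 else 0.0
    return (a, b)

print(embed(2.0, 1.0))
```

In this model `embed` is linear in `t`, never sends a nonzero `t` to zero (one of the two bump functions is at least $1/2$ at every angle), and agrees on both sides of the seam at $\theta=0$, so it realizes the Möbius bundle inside the trivial bundle $S^1\times\mathbb R^2$.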
This is the proof from nLab, which is a bit short on details but quite enlightening if one sits down and thinks it through. Hatcher uses a bit more machinery, with normal spaces and Urysohn's lemma, which some may prefer.