Ref: "Foundations of Optimization" by Guler.
Thm [Simultaneous diagonalisability of quadratic forms]
Let ${ q _1 (x) := x ^T A x }$ and ${ q _2 (x) := x ^T B x }$ be quadratic forms on ${ \mathbb{R} ^n }$ with ${ q _1 }$ positive definite.
That is, let ${ A, B \in \mathbb{R} ^{n \times n} }$ be symmetric matrices with ${ A }$ positive definite.
Then there exists a basis ${ P = [P _1, \ldots, P _n] }$ of ${ \mathbb{R} ^n }$ such that
$${ {\begin{align} &\, q _1 \left (\sum _{i = 1} ^n P _i y _i \right) = \sum _{i = 1} ^n y _i ^2 \\ &\, q _2 \left( \sum _{i = 1} ^n P _i y _i \right) = \sum _{i = 1} ^n \lambda _i y _i ^2 . \end{align}} }$$
That is, there exists an invertible matrix ${ P \in \mathbb{R} ^{n \times n} }$ such that ${ P ^T A P = I }$ and ${ P ^T B P = \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
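For instance, with ${ A = I _2 }$ and ${ B = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , }$ the orthogonal matrix ${ P = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} }$ works:
$${ P ^T A P = P ^T P = I, \qquad P ^T B P = \text{diag}(1, -1) . }$$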
Pf: By the spectral theorem for the positive definite matrix ${ A , }$
$${ A U = U \text{diag}(\mu _1, \ldots, \mu _n) }$$
with ${ U \in \mathbb{R} ^{n \times n} }$ orthogonal and ${ \mu _1 \geq \ldots \geq \mu _n > 0 . }$
Defining ${ P _1 := U \text{diag}(\mu _1 ^{- 1/2}, \ldots, \mu _n ^{-1/2}) , }$ we have ${ P _1 ^T A P _1 = I . }$
The goal is to modify the invertible matrix ${ P _1 }$ satisfying ${ P _1 ^T A P _1 = I }$ to an invertible matrix ${ P _2 = P _1 \tilde{P} }$ satisfying ${ P _2 ^T A P _2 = I }$ and ${ P _2 ^T B P _2 = \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
For any orthogonal ${ \tilde{P} }$ we have ${ (P _1 \tilde{P}) ^T A (P _1 \tilde{P}) = \tilde{P} ^T \tilde{P} = I , }$ so it suffices to find an orthogonal matrix ${ \tilde{P} }$ such that ${ (P _1 \tilde{P}) ^T B (P _1 \tilde{P}) = \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
Equivalently (using ${ \tilde{P} ^{-1} = \tilde{P} ^T }$), it suffices to find an orthogonal ${ \tilde{P} }$ such that ${ (P _1 ^T B P _1 ) \tilde{P} = \tilde{P} \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
By the spectral theorem for the symmetric matrix ${ P _1 ^T B P _1 , }$ we have an orthonormal basis of eigenvectors ${ \tilde{P} , }$ as needed. ${ \blacksquare }$
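A quick numerical sanity check of this construction (a NumPy sketch with arbitrary test matrices, not from the reference):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                           # symmetric positive definite
B = rng.standard_normal((n, n)); B = (B + B.T) / 2    # symmetric

# Step 1: whiten A.  A = U diag(mu) U^T with U orthogonal, mu > 0.
mu, U = np.linalg.eigh(A)
P1 = U @ np.diag(mu ** -0.5)                          # P1^T A P1 = I

# Step 2: diagonalise the symmetric matrix P1^T B P1 with an orthogonal P_tilde.
lam, P_tilde = np.linalg.eigh(P1.T @ B @ P1)
P = P1 @ P_tilde

print(np.allclose(P.T @ A @ P, np.eye(n)))            # True
print(np.allclose(P.T @ B @ P, np.diag(lam)))         # True
```

(np.linalg.eigh returns the ${ \mu _i }$ and ${ \lambda _i }$ in ascending rather than descending order, which does not affect the check.)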
The same approach gives simultaneous diagonalisation of a positive definite quadratic form and an operator that is symmetric wrt the quadratic form.
Thm [Spectral theorem]
Let ${ q (x) := x ^T A x }$ be a positive definite quadratic form on ${ \mathbb{R} ^n }$ and let ${ B \in \mathbb{R} ^{n \times n} }$ be a symmetric operator wrt the quadratic form.
That is, let ${ A , B \in \mathbb{R} ^{n \times n} }$ with ${ A }$ a symmetric positive definite matrix and ${ B }$ such that ${ \langle B x, y \rangle _A = \langle x, B y \rangle _A }$ for all ${ x, y \in \mathbb{R} ^n , }$ where ${ \langle x, y \rangle _A := x ^T A y ; }$ equivalently, ${ B ^T A = A B . }$
Then there exists a basis ${ P = [P _1, \ldots, P _n] }$ of ${ \mathbb{R} ^n }$ such that
$${ {\begin{align} &\, q \left( \sum _{i = 1} ^n P _i y _i \right) = \sum _{i = 1} ^n y _i ^2 \\ &\, B P = P \text{diag}(\lambda _1, \ldots, \lambda _n) \end{align}} }$$
That is, there exists an invertible matrix ${ P \in \mathbb{R} ^{n \times n} }$ such that ${ P ^T A P = I }$ and ${ B P = P \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
Pf: By Gram-Schmidt orthogonalisation, construct a basis of ${ \mathbb{R} ^n }$ that is orthonormal wrt ${ \langle \cdot , \cdot \rangle _A . }$ Taking these basis vectors as columns, we have an invertible matrix ${ P _1 }$ such that ${ P _1 ^T A P _1 = I . }$
Since ${ P _1 ^T A P _1 = I }$ gives ${ A = (P _1 ^{-1}) ^T P _1 ^{-1} , }$ the constraint ${ B ^T A = A B }$ becomes ${ B ^T (P _1 ^{-1}) ^T P _1 ^{-1} = (P _1 ^{-1} ) ^T P _1 ^{-1} B , }$ that is ${ P _1 ^T B ^T (P _1 ^{-1}) ^T = P _1 ^{-1} B P _1 }$ (multiplying by ${ P _1 ^T }$ on the left and ${ P _1 }$ on the right), that is ${ (P _1 ^{-1} B P _1) ^T = P _1 ^{-1} B P _1 : }$ the matrix ${ P _1 ^{-1} B P _1 }$ is symmetric.
The goal is to modify the invertible matrix ${ P _1 }$ satisfying ${ P _1 ^T A P _1 = I }$ to an invertible matrix ${ P _2 = P _1 \tilde{P} }$ satisfying ${ P _2 ^T A P _2 = I }$ and ${ B P _2 = P _2 \text{diag}(\lambda _1, \ldots, \lambda _n) . }$
For any orthogonal ${ \tilde{P} }$ we have ${ (P _1 \tilde{P}) ^T A (P _1 \tilde{P}) = \tilde{P} ^T \tilde{P} = I , }$ so it suffices to find an orthogonal matrix ${ \tilde{P} }$ such that ${ B P _1 \tilde{P} = P _1 \tilde{P} \text{diag}(\lambda _1, \ldots, \lambda _n). }$
Equivalently (multiplying on the left by ${ P _1 ^{-1} }$), it suffices to find an orthogonal ${ \tilde{P} }$ such that ${ (P _1 ^{-1} B P _1) \tilde{P} = \tilde{P} \text{diag}(\lambda _1, \ldots, \lambda _n). }$
By the spectral theorem for the symmetric matrix ${ P _1 ^{-1} B P _1, }$ we have an orthonormal basis of eigenvectors ${ \tilde{P} , }$ as needed. ${ \blacksquare }$
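The analogous numerical sketch for this construction (again arbitrary test matrices; an ${ A }$-symmetric ${ B }$ is built as ${ A ^{-1} S }$ with ${ S }$ symmetric, and a Cholesky factorisation stands in for the Gram-Schmidt step, either way giving ${ P _1 ^T A P _1 = I }$):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                          # symmetric positive definite
S = rng.standard_normal((n, n)); S = (S + S.T) / 2   # symmetric
B = np.linalg.solve(A, S)                            # B = A^{-1} S, so B^T A = S = A B

# P1 with P1^T A P1 = I, from the Cholesky factorisation A = L L^T: P1 = L^{-T}.
L = np.linalg.cholesky(A)
P1 = np.linalg.inv(L).T

# P1^{-1} B P1 is symmetric; diagonalise it with an orthogonal P_tilde.
lam, P_tilde = np.linalg.eigh(np.linalg.inv(P1) @ B @ P1)
P = P1 @ P_tilde

print(np.allclose(P.T @ A @ P, np.eye(n)))           # True
print(np.allclose(B @ P, P @ np.diag(lam)))          # True
```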