This old answer of mine is related.
The usual approach - of shifting to an appropriate conservative extension - works fine here. The key observation is that a statement of the form "$M\models\varphi$" is best understood as an existential quantification over objects one type higher than $M$ (namely, "there is a family of Skolem functions for $\varphi$ in $M$").
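As a minimal sketch of what this looks like in the simplest nontrivial case (the binary relation symbol $R$ and the function name $f$ are assumed here purely for illustration):

$$M\models\forall x\,\exists y\,R(x,y)\quad\iff\quad\exists f\colon M\to M\ \ \forall a\in M\ \ R^M(a,f(a)).$$

Here $f$ is a Skolem function for the single existential quantifier. When $M$ is countable and coded by a set of natural numbers, $f$ is coded by a set as well, and the condition on $f$ after the leading "$\exists f$" only quantifies over elements of $M$, so it is arithmetical in the codes. Sentences with more quantifier alternations need one such function per existential quantifier, but the overall shape "there exist sets such that (something arithmetical)" stays the same.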
In particular, this means that $ACA_0$ - the usual go-to expansion of PA - can directly talk about countable model theory: the structures we look at are coded by sets of natural numbers, and "$\models$" is a $\Sigma^1_1$ relation. $ACA_0$ is strong enough to prove basic facts about model theory under this approach; in particular, it proves compactness/completeness (this actually only takes $WKL_0$, which is of strictly weaker consistency strength) and "weak" bivalence: the scheme consisting of, for each sentence $\varphi$, the sentence "For each $M$ we have $M\models\varphi$ or $M\models\neg\varphi$." On the other hand, there are some basic principles $ACA_0$ can't prove:
"Strong" bivalence - the single sentence "For every sentence $\varphi$ and every structure $M$ we have $M\models\varphi$ or $M\models\neg\varphi$" - is equivalent over $RCA_0$ to the statement "For every $X,n$, the $n$th jump of $X$ exists," which is strictly stronger than $ACA_0$. (If I recall correctly this theory is denoted "$ACA_0^*$.") A key point here is the computability-theoretic analysis of Skolem functions: we show that we can build Skolem functions uniformly from the appropriate number of Turing jumps.
"Every structure has a theory" is even stronger: it's equivalent over $RCA_0$ to $ACA_0^+$, which is $RCA_0$ + "For every $X$, the $\omega$th jump of $X$ exists." (The point is that $Th(\mathbb{N};+,\cdot,X)$ "is" just $X^{(\omega)}$.)
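For comparison, here is a rough side-by-side of the three jump principles involved, each taken over $RCA_0$. This is a sketch rather than an official axiomatization: "$Y=X^{(n)}$" is shorthand (my notation) for the existence of a finite sequence $\langle Y_0,\dots,Y_n\rangle$ with $Y_0=X$, $Y_{i+1}=Y_i'$ for each $i<n$, and $Y_n=Y$.

$$\begin{aligned}
ACA_0:&\quad \forall X\,\exists Y\ (Y=X')\\
\text{all finite jumps}:&\quad \forall X\,\forall n\,\exists Y\ (Y=X^{(n)})\\
ACA_0^+:&\quad \forall X\,\exists Y\ (Y=X^{(\omega)}),\qquad X^{(\omega)}=\{\langle n,k\rangle : k\in X^{(n)}\}
\end{aligned}$$

The gap between the first and second lines is in the quantifier over $n$: $ACA_0$ proves that $X^{(n)}$ exists for each standard $n$ separately, but not the single sentence that quantifies over all $n$ at once. This mirrors the gap between "weak" and "strong" bivalence above.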
And of course this still doesn't let us directly handle uncountable structures, and those are important. But that's also an issue with mathematics in general - we just need to look for higher-order conservative extensions (like higher reverse mathematics' $RCA_0^\omega+\mathcal{E}_1$).
At this point it's worth mentioning some work of Victor Harnik:
In 1985, Harnik studied the reverse math of some theorems of model theory ...
but in 1987 he passed to a richer language to handle uncountable structures.
Barring some unpublished notes of Harvey Friedman from the late 1970s, I believe this was the earliest work in higher-order reverse mathematics; however, to the best of my knowledge it was not followed up on at the time, and the modern approach looks rather different.