It has been noted in another answer that the first and second moments of the linear combination of chi-squared variables match those of the Welch-Satterthwaite approximation. I think some additional perspective can be provided.
Namely:
- Under what conditions the Welch-Satterthwaite approximation is exactly correct
- A bound for the $\nu$ (degrees of freedom of the statistic) mentioned in the question

I will define $\nu$ slightly differently from the question, using the true variances $\sigma_k^2$:
$$
\nu = \frac{\left(\sum_{k=1}^m a_k \sigma_k^2\right)^2}{\sum_{k=1}^m \frac{(a_k \sigma_k^2)^2}{n_k - 1}}
$$
Setup
Unfortunately, I will have to state some notation first:
Define
$$
M := \sum_{k=1}^{m} a_k S^2_k
$$
where the $S^2_k$ is the unbiased sample variance for the $k$-th group of data. A well-known result is $\frac{(n_{k}-1)S_{k}^2}{\sigma_k^2} \sim \chi^2_{n_k-1} $ (e.g. related post). Thus, the defined $M$ is a linear combination of $\chi^2$ random variables. We can re-express $M$ (with $X_k \sim \chi^2_{n_k-1}$)
$$
M = \sum_{k=1}^m \frac{a_k\sigma_k^2}{n_k-1} X_k
$$
(this will be useful later).
The idea of Welch-Satterthwaite approximation is to approximate $M$ with a single scaled chi-squared rv. Define a random variable $L$ (contrast this with the $L$ statistic stated in the original post) s.t.
$$
\frac{\nu L}{ \alpha } \sim \chi_\nu^2, \text{ where } \alpha := \sum^m_{k=1} a_k \sigma^2_k
$$
Note, $\nu $ was already defined above, but now we can also rewrite it as
$$
\nu = \frac{\alpha^2}{\sum_{k=1}^m \frac{(a_k \sigma^2_k)^2}{n_k-1} }
$$
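As a rough numerical illustration of the definitions above, the following sketch computes $\alpha$ and $\nu$ and checks that $M$ and $L$ have the same variance, using only $\mathrm{Var}(\chi^2_d) = 2d$. The function name `satterthwaite_df` and the example values for `a`, `sigma2`, and `n` are my own assumptions, not part of the question.

```python
def satterthwaite_df(a, sigma2, n):
    """Effective degrees of freedom nu for M = sum_k a_k * S_k^2 (true variances)."""
    alpha = sum(ak * s2 for ak, s2 in zip(a, sigma2))
    denom = sum((ak * s2) ** 2 / (nk - 1) for ak, s2, nk in zip(a, sigma2, n))
    return alpha ** 2 / denom

# Hypothetical example values, chosen only for illustration.
a = [0.5, 0.5, 1.0]        # weights a_k
sigma2 = [1.0, 4.0, 2.5]   # true variances sigma_k^2
n = [6, 11, 21]            # group sizes n_k

alpha = sum(ak * s2 for ak, s2 in zip(a, sigma2))
nu = satterthwaite_df(a, sigma2, n)

# M = sum_k (a_k sigma_k^2 / nu_k) X_k with X_k ~ chi2_{nu_k}, nu_k = n_k - 1,
# so Var(M) = sum_k (a_k sigma_k^2 / nu_k)^2 * 2 nu_k.
var_M = sum(2 * (ak * s2) ** 2 / (nk - 1) for ak, s2, nk in zip(a, sigma2, n))

# L = (alpha / nu) * chi2_nu, so Var(L) = (alpha / nu)^2 * 2 nu.
var_L = 2 * alpha ** 2 / nu

print(nu, var_M, var_L)
```

Both means equal $\alpha$ by construction, and the two variances agree because $\nu$ is defined precisely so that they do.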
When Welch-Satterthwaite approximation is exactly correct (i.e. $M \sim L$)
Let's impose the condition that $\frac{a_k \sigma_k^2}{n_k-1} = c$ for all $k$, where $c > 0$ is a constant. Note that the quantity set to a constant is the scaling factor of each chi-squared rv in $M$.
Scaled chi-squared rvs are gamma distributed, or more precisely, $ c \cdot \chi_\nu^2 \sim \Gamma(\nu/2, 2c)$ (shape-scale parameterization). Since the $X_k$ are independent and the scale parameters are all equal under our condition,
$$
M \sim \Gamma ( \sum_{k=1}^m \nu_k / 2 , 2c)
$$
(where $\nu_k = n_k -1$ are the degrees of freedom for the individual chi-squared rvs $X_k$)
Using the relationship between chi-squared and gamma distributed rvs again,
$$
\frac{M}{c} \sim \chi^2_{\sum_{k=1}^m \nu_k}
$$
Under our condition, each term in the denominator of $\nu$ is $\frac{(a_k \sigma_k^2)^2}{n_k-1} = c^2 (n_k - 1)$, and
$$
\alpha = c \cdot \sum_{k=1}^m (n_k- 1)
$$
$$
\nu = \frac{\alpha^2}{c^2 \sum_{k=1}^m (n_k - 1)} = \frac{\alpha}{c} = \sum_{k=1}^m (n_k - 1)
$$
which is the sum of the degrees of freedom of the individual chi-squared rvs. So the degrees of freedom of $\frac{M}{c}$ match those of $\frac{\nu L}{\alpha}$.
Also note $c = \frac{\alpha}{\nu}$, so
$$
\frac{M}{c} = \frac{\nu M }{\alpha} \sim \chi^2_\nu
$$
which is exactly the distribution we required of $\frac{\nu L}{\alpha}$, so $M$ and $L$ have the same distribution.
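The exact case can be checked numerically with a small sketch: pick the weights so that every scaling factor equals the same constant and confirm that $\nu$ collapses to the total degrees of freedom. The group sizes, variances, and the value of `c` below are hypothetical and chosen only for illustration.

```python
# Hypothetical group sizes and true variances.
n = [4, 9, 16]
sigma2 = [2.0, 0.5, 3.0]
c = 0.1

# Choose weights so that a_k * sigma_k^2 / (n_k - 1) == c for every k.
a = [c * (nk - 1) / s2 for nk, s2 in zip(n, sigma2)]

alpha = sum(ak * s2 for ak, s2 in zip(a, sigma2))
nu = alpha ** 2 / sum((ak * s2) ** 2 / (nk - 1) for ak, s2, nk in zip(a, sigma2, n))
total_df = sum(nk - 1 for nk in n)

# Up to floating-point error, nu equals total_df and alpha / nu equals c.
print(nu, total_df, alpha / nu)
```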
Bounds for $\nu$
There exists a bound for $\nu$ and it is:
$$
\min_k{\nu_k} \lt \nu \le \sum_{k=1}^m \nu_k
$$
We have already seen in what scenario the inequality on the right is an equality. It is the scenario in which the Welch-Satterthwaite approximation is exactly correct.
First the left inequality. Recall,
$$
\nu = \frac{\alpha^2}{\sum_{k=1}^m \frac{(a_k \sigma^2_k)^2}{\nu_k} }
$$
Let
$$
\nu^* := \min_k{\nu_k}
$$
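Then, since $\nu_k \ge \nu^*$ for every $k$,
$$
\nu \ge \frac{\alpha^2}{\frac{1}{\nu^*} \sum_{k=1}^m (a_k \sigma^2_k)^2 } = \nu^* \cdot \frac{\alpha^2}{\sum_{k=1}^m (a_k \sigma^2_k)^2}
$$
Assuming all $a_k > 0$ and $m \ge 2$, expanding $\alpha^2 = \big(\sum_{k=1}^m a_k \sigma_k^2\big)^2$ gives the sum of squares plus strictly positive cross terms, so $\alpha^2 > \sum_{k=1}^m (a_k \sigma_k^2)^2$. (For $m = 1$ the bound holds with equality, $\nu = \nu_1$.)
$$
\implies \nu \gt \nu^*
$$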
The right inequality can be proven with Cauchy-Schwarz. Define vectors:
$$
u := ( \sqrt{\nu_1}, \dots, \sqrt{\nu_m})
$$
$$
v := \Big(\frac{a_1 \sigma_1^2}{\sqrt{\nu_1}}, \dots, \frac{a_m \sigma_m^2}{\sqrt{\nu_m}} \Big)
$$
By Cauchy-Schwarz we have
$$
(u \cdot v)^2 \le (u\cdot u) (v \cdot v)
$$
$$
\implies \alpha^2 = \Big( \sum_{k=1}^m a_k \sigma_k^2 \Big)^2 \le \Big(\sum_{k=1}^m \nu_k \Big) \Big( \sum_{k=1}^m \frac{(a_k \sigma_k^2)^2}{\nu_k} \Big)
$$
$$
\implies \nu = \frac{\alpha^2}{\sum_{k=1}^m \frac{(a_k \sigma_k^2)^2}{\nu_k}} \le \sum_{k=1}^m \nu_k
$$
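The two inequalities can also be spot-checked numerically. The sketch below draws random positive weights $w_k = a_k \sigma_k^2$ and random degrees of freedom and counts how often the bound fails. It assumes $m \ge 2$ and strictly positive weights; the helper name and the sampling ranges are arbitrary choices of mine.

```python
import random

def satterthwaite_df(w, df):
    """nu for positive weights w_k = a_k * sigma_k^2 and degrees of freedom df_k."""
    return sum(w) ** 2 / sum(wk ** 2 / dk for wk, dk in zip(w, df))

random.seed(0)
violations = 0
for _ in range(10_000):
    m = random.randint(2, 6)                            # number of groups (assumed >= 2)
    df = [random.randint(1, 50) for _ in range(m)]      # nu_k = n_k - 1
    w = [random.uniform(0.01, 10.0) for _ in range(m)]  # a_k * sigma_k^2 > 0
    nu = satterthwaite_df(w, df)
    # Small relative tolerance on the upper bound to absorb floating-point error.
    if not (min(df) < nu <= sum(df) * (1 + 1e-12)):
        violations += 1

print(violations)
```

A random search like this is not a proof, but it is a cheap sanity check on the algebra.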
Conclusions
There is a scenario in which the Welch-Satterthwaite approximation is exactly correct. There exists a (not very tight) bound for the degrees of freedom. Also, it was mentioned elsewhere that the 1st and 2nd moments match (so I do not prove this). It was also mentioned elsewhere that the $\sigma_k^2$ terms tend to be unknown and are thus replaced with the $S_k^2$, which is why the $\nu$ stated in my post differs from the one in the question.
It should be noted that this still does not tell us how "good" the Welch-Satterthwaite approximation is when our condition ($\forall k, \frac{a_k \sigma_k^2}{n_k-1} = c$) does not hold.
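In the absence of a theoretical answer, one can at least probe a specific configuration empirically. The following is a minimal Monte Carlo sketch, not a rigorous assessment: it simulates $M$ and the approximating $L$ and compares one upper quantile. The function `simulate_quantiles`, its defaults, and the example inputs are all hypothetical.

```python
import random

def simulate_quantiles(a, sigma2, n, q=0.95, draws=20_000, seed=0):
    """Return the empirical q-quantiles of M and of its scaled chi-squared stand-in L."""
    rng = random.Random(seed)
    df = [nk - 1 for nk in n]
    w = [ak * s2 for ak, s2 in zip(a, sigma2)]
    alpha = sum(w)
    nu = alpha ** 2 / sum(wk ** 2 / dk for wk, dk in zip(w, df))

    def chi2(d):
        # A chi-squared rv with d degrees of freedom is Gamma(shape=d/2, scale=2).
        return rng.gammavariate(d / 2, 2.0)

    # M = sum_k (a_k sigma_k^2 / nu_k) X_k  and  L = (alpha / nu) * chi2_nu
    m_draws = sorted(sum(wk / dk * chi2(dk) for wk, dk in zip(w, df))
                     for _ in range(draws))
    l_draws = sorted(alpha / nu * chi2(nu) for _ in range(draws))
    idx = min(int(q * draws), draws - 1)
    return m_draws[idx], l_draws[idx]

q_m, q_l = simulate_quantiles(a=[1.0, 1.0], sigma2=[1.0, 4.0], n=[5, 20])
print(q_m, q_l)
```

The closer the scaling factors $\frac{a_k \sigma_k^2}{n_k - 1}$ are to one another, the closer the two quantiles should be; repeating this over several quantiles and configurations gives a rough picture of the approximation error.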