The Problem
The problem, I believe, can be stated in the following way: Given $K$ white balls, all without weight (one can say that the weight is $0$), and $N - K$ red balls with individual strictly positive weights, if $n \leq N$ balls are picked uniformly at random without replacement, what is the expected ratio between the sum of the $n$ selected balls' weights and the total sum of all $N-K$ red balls' weights?
Comment: without any knowledge of how the weights are distributed among the red balls, nor of how many red balls there are compared to white ones, I suspect the question may be too general to admit a single answer. Either way, I have not been able to come up with a precise algebraic formula for the solution.
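Since nothing is assumed about the weights, a quick Monte Carlo sketch may at least serve as a numerical reference point while looking for a closed form. This is my own illustration (Python); the function name `estimate_ratio` and all numbers in the example are made up.

```python
import random

def estimate_ratio(red_weights, num_white, n, trials=20_000):
    """Monte Carlo estimate of the expected weight ratio E[Z]:
    pick n balls uniformly at random without replacement from the
    weightless white balls plus the given red balls, and average
    (sum of picked weights) / (total red weight)."""
    total = sum(red_weights)
    population = list(red_weights) + [0.0] * num_white  # white balls weigh 0
    acc = 0.0
    for _ in range(trials):
        acc += sum(random.sample(population, n)) / total
    return acc / trials

# Purely illustrative numbers: N - K = 5 red balls, K = 10 white, n = 6 picks.
print(estimate_ratio([5.0, 3.0, 2.0, 1.5, 0.5], num_white=10, n=6))
```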
My Thoughts & Efforts
In my efforts I have followed two trains of reasoning, neither of which has led to the sought-after solution. I mention them to show that my approach has been to look for ways of getting a rough idea of the bounds of a solution. Where no reference is given, the calculations are mostly my own and have not been reviewed by anyone, so both of these paths might be, and likely are, subject to a high degree of miscalculation. I highly value any input.
The Heaviest Ball
(I) - Firstly, the heaviest ball is expected to be one of the first $\lceil \log N \rceil + 1$ visited elements. This I believe can be shown by letting $X_i$, for $1 \leq i \leq N$, be the random variable with $X_i = 1$ if the heaviest ball is the $i$-th ball being evaluated, and $X_i = 0$ otherwise. For any specific $i$ the expected value can be written as in (1). \begin{equation} \mathbb E[X_i] = 0 \times P[X_i=0] + 1 \times P[X_i=1] = P[X_i=1] \quad \quad (1) \end{equation} If we assume that all such events are mutually independent and that the balls are picked without replacement, then the probability $P[X_i=1]$ can be written as (2). \begin{equation} P[X_i=1] = \frac{1}{N+1-i} \quad \quad (2) \end{equation} Writing $X := \sum_{i=1}^{N} X_i$, the expected number of tries before the heaviest ball is found follows from linearity of expectation over the random variables; the second step below substitutes $j = N+1-i$. \begin{align*} \mathbb E[X] &= \sum_{i=1}^N P[X_i=1]\\ &= \sum_{i=1}^N \frac{1}{N+1-i} \\ &= \sum_{j=1}^N \frac{1}{j} \\ &\leq \log(N) + 1 \end{align*}
From the Wikipedia page on the harmonic series [1] the relation $\sum_{i=1}^{N} \frac{1}{i} \leq \log(N) + 1$ is stated, which leads me to believe that if $n \geq \log(N) + 1$ then the sought-after expected weight ratio should be bounded by $\frac{1}{N-K}$. If this process is repeated for $r := \lceil \frac{n}{\log(N) + 1} \rceil$ rounds, wherein $n$ balls are picked in each round, then by linearity of expectation, should we not receive a weight ratio bounded from above(?) by $\frac{r}{N-K}$?
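To convince myself of this step, here is a small numerical check (my own, Python). It only verifies the harmonic-sum identity and the bound from [1], not the probabilistic claim itself.

```python
import math

# My own quick check: the sum sum_{i=1}^{N} 1/(N+1-i) appearing above is
# exactly the harmonic number H_N, and H_N <= log(N) + 1 (natural log).
for N in (10, 1_000, 100_000):
    reversed_sum = sum(1.0 / (N + 1 - i) for i in range(1, N + 1))
    H_N = sum(1.0 / i for i in range(1, N + 1))
    print(N, reversed_sum, H_N, math.log(N) + 1)
```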
The Heaviest m Balls
(II) - Secondly, with an arguably severe abuse of terminology, I say that the number of red balls among the $n$ picked follows a hypergeometric distribution, $Y \sim \mathrm{Hypergeometric}(N, N-K, n)$, and its expected value can therefore, following the example in [2], be expressed algebraically as $\mathbb E[Y] = \sigma n$ where $\sigma := \frac{N-K}{N}$.
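For completeness, here is the indicator-variable derivation that I believe underlies the expectation in [2]; the notation $R_i$, for the event that the $i$-th picked ball is red, is my own. \begin{align*} \mathbb E[Y] &= \mathbb E\Big[\sum_{i=1}^{n} \mathbf 1[R_i]\Big] = \sum_{i=1}^{n} P[R_i] && \text{(linearity of expectation)}\\ &= \sum_{i=1}^{n} \frac{N-K}{N} && \text{(each position is uniformly distributed over the $N$ balls)}\\ &= n\,\frac{N-K}{N} = \sigma n. \end{align*}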
If one re-colors the $m$ heaviest red balls green, with $m < n$, then by the same reasoning the expected number of green balls after picking $n$ balls should be $\hat{\sigma} n$, where $\hat{\sigma} := \frac{m}{N}$. Let $\alpha^{j}$ denote the weight of the $j$-th heaviest red ball and $\beta^{j}$ the weight of the $j$-th lightest. The expected ratio should, therefore, be bounded from below by \begin{equation} L := \frac{\sum_{j=1}^{\hat{\sigma} n} \alpha^{j} + \sum_{j=\hat{\sigma}n +1 }^{\sigma n} \beta^{j}} {\sum_{j=1}^{N-K}\alpha^{j}} \end{equation} and from above by \begin{equation} U_1 := \frac{\sum_{j=1}^{\sigma n} \alpha^{j}}{\sum_{j=1}^{N-K}\alpha^{j}}. \end{equation}
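To make $L$ and $U_1$ concrete, here is a small sketch (my own, Python) that evaluates both for a given weight list. Rounding $\hat{\sigma} n$ and $\sigma n$ down to whole indices is my own choice, since the formulas above leave them as real numbers, and all numbers in the example are illustrative.

```python
def proposed_bounds(red_weights, num_white, n, m):
    """Evaluate the proposed lower bound L and upper bound U1."""
    N = len(red_weights) + num_white
    sigma = len(red_weights) / N        # fraction of balls that are red
    sigma_hat = m / N                   # fraction of balls that are green
    heaviest = sorted(red_weights, reverse=True)  # alpha^1, alpha^2, ...
    lightest = sorted(red_weights)                # beta^1,  beta^2,  ...
    total = sum(red_weights)
    g = int(sigma_hat * n)              # ~ expected number of green balls picked
    r = int(sigma * n)                  # ~ expected number of red balls picked (incl. green)
    L = (sum(heaviest[:g]) + sum(lightest[g:r])) / total
    U1 = sum(heaviest[:r]) / total
    return L, U1

# Illustrative numbers only: 5 red balls, 10 white, n = 6 picks, m = 2 greens.
print(proposed_bounds([5.0, 3.0, 2.0, 1.5, 0.5], num_white=10, n=6, m=2))
```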
Comment: for the upper bound we have here settled for the sequence of the heaviest weights, but if one could compute the expected offset $\mathbb E[I]$ of the indices between $\hat{\sigma} n$ and $\sigma n$, then the upper bound could be tightened to \begin{equation} U_2 := \frac{\sum_{j=1}^{\hat{\sigma} n} \alpha^{j} + \sum_{j=\hat{\sigma}n + \mathbb E[I]}^{\sigma n + \mathbb E[I]} \alpha^{j}} {\sum_{j=1}^{N-K}\alpha^{j}} \end{equation} It might actually be the case that $\mathbb E[I] = \mathbb E[X]$, I guess, and if so maybe we could have \begin{equation} U_3 := \frac{\sum_{j=1}^{\hat{\sigma} n} \alpha^{j} + \sum_{j=1}^{\sigma n - \hat{\sigma}n} \alpha^{j\log(N)}} {\sum_{j=1}^{N-K}\alpha^{j}} \end{equation}
Conclusion
Finally, my conclusion is that the expected ratio $\mathbb E[Z]$ must satisfy \begin{equation} L \leq \mathbb E[Z] \leq U_3 \leq U_2 \leq U_1 \end{equation} where $0 \leq Z \leq 1$ is the random variable expressing the ratio in question.
As a novice, I struggle with mainly two hurdles. The first hurdle is that nothing can be assumed about the weight distribution; the only thing I know is that the red balls can be ordered from heaviest to lightest. One can here assume the weights are distinct, so as not to have to come up with a way to break ties, but the end result would be the same.
This possibility of convergence indicates that the 'Heaviest Ball' formula is pessimistic and that one should be able to express the expected sum more precisely, regardless of whether the worst case also constitutes the expected case.