I am looking for an explicit formula for a sequence. The sequence is generated as follows:
There is a tournament with $10$ teams. In the beginning, all teams have a 0-0 win-loss record. The teams are paired up, and when the first round concludes, $5$ of the teams have a 0-1 record and $5$ of the teams have a 1-0 record. The second round is arranged so that only teams with the same win-loss record are allowed to face each other, and when there are an odd number of teams having a certain win-loss record, the extra team from one win-loss record plays the extra team from the nearby win-loss record. This means that for the second round, four of the 0-1 teams are paired up with each other, $4$ of the 1-0 teams are paired up with each other, and there is one pairing between a 0-1 team and a 1-0 team. We also assume that when there is a pairing across different win-loss records, the team with the better record always wins. This means that at the end of round 2, there will be $3$ teams with a 0-2 record, there will be $4$ teams with a 1-1 record, and $3$ teams with a 2-0 record.
I am looking for an explicit formula giving the win-loss record distribution at round $r$. I am also looking for a general formula for $T$ teams (I used "10" earlier just for illustration purposes).
I have python code which explicitly calculates this sequence for $T$ teams and round $r$, but I would like a mathematical expression for this.
I have an explicit formula for the first $4$ rows using ceiling/floor functions, but the formula does not work beyond $4$ rows. I also have an explicit formula giving the win-loss record in the long term, but it only works after about $T^2$ rows.
In terms of problem solving techniques that I've tried, I've tried various generating series approaches. For example, for the case of $10$ teams, I wrote round $0$ as
$$f_0(x,y) = y^0 + y^1 + y^2 + ... + y^9$$
and round $1$ as
$$f_1(x,y) = y^0 + y^1 + y^2 + y^3 + y^ 4 + (y^5 + y^6 + y^7 + y^8 + y^9)x$$
and round $2$ as
$$f_2(x,y) = y^0 + y^1 + y^2 + (y^3 + y^4 + y^5 + y^6)x + (y^7 + y^8 + y^9)x^2$$
and so on. I don't have a method for moving from $f_k$ to $f_{k+1}$, but I have lots of near misses.
I have also tried using a probabilistic approach. I have started from several invariants, and I have a function that when minimized gives a (sometimes) close approximation. The invariants are as follows:
The number of teams $T$ does not change from round to round.
The number of total wins at each round is always $T r / 2$.
The win-loss record is symmetric around $r/2$.
These invariants restrict the number of possible choices for round $r$. I then took the choice that most nearly matches a binomial distribution (using the euclidean distance as a metric, although any similarity metric could work). This metric isn't best, since I believe the win-loss records become "less binomial" over time.
Another thing I tried was various kinds of graphing. I can post some graphs if requested.
I am open to any approach, I look forward to seeing your ideas.
Updates and extra information
Here are the first few terms of the sequence for $T=10$:
$$10$$
$$5, 5$$
$$3, 4, 3$$
$$2, 3, 3, 2$$
$$1, 3, 2, 3, 1$$
$$1, 1, 3, 3, 1, 1$$
$$1, 0, 3, 2, 3, 0, 1$$
$$1, 0, 1, 3, 3, 1, 0, 1$$
$$1, 0, 0, 3, 2, 3, 0, 0, 1$$
$$1, 0, 0, 1, 3, 3, 1, 0, 0, 1$$
$$...$$
Another clarification is that the number of teams $T$ must be even.
Also, I would like to add the formula for the long-term behavior of the sequences. The long-term behavior shows a 2-term repetition, and this repetition depends on the value of $T \pmod 8$. Here are the four cases:
Case 1. ($T \equiv 0 \pmod 8$): $1444...444...441$, $3444...4444....443$
Case 2. ($T \equiv 2 \pmod 8$): $3444...424...443$, $1444...4334....441$
Case 3. ($T \equiv 4 \pmod 8$): $3444...444...443$, $1444...4444....441$
Case 4. ($T \equiv 6 \pmod{8}$): $1444...424...441$, $3444...4334....443$
Note that I have left off the preceding $1000...$ and the trailing $...0001$, since the number of zeros can be easily calculated. Not also that the number of $4$s can be calculated easily from the original number of teams $T$.
Regarding the generating series idea previously, one thing I found is that if we have for example
$$f_2(x,y) = y^0 + y^1 + y^2 + (y^3 + y^4 + y^5 + y^6)x + (y^7 + y^8 + y^9)x^2$$
then we can calculate something very close to $f_3(x,y)$ by performing the following steps:
$$\frac{f_2(x,y) + f_2(x,-y)}{2} = y^0 + y^2 + (y^4 + y^6)x + (y^8)x^2$$
$$\frac{f_2(x,y) - f_2(x,-y)}{2} \cdot x = y^1 x + (y^3 + y^5) x^2 + (y^7 + y^9) x^3$$
$$\widetilde{f}_3(x,y) = \frac{f_2(x,y) + f_2(x,-y)}{2} + \frac{f_2(x,y) - f_2(x,-y)}{2} \cdot x = y^0 + y^2 + (y^1 + y^4 + y^6)x + (y^3 + y^5 + y^8)x^2 + (y^7 + y^9)x^3$$
which when evaluated at $y = 1$ gives the correct sequence $2,3,3,2$. It is not exactly correct, since the $y$ terms are no longer ordered in terms of exponents, so this process is not repeatable unless we improve on it somehow.
Another Update
Let's add a variant to the original rules. The variant will be called "underdog", whereas the original rules will be called "overdog". The "overdog" rules stipulate that the team with the higher win-loss record will always win when it crosses over and faces a team with a lower win-loss ratio; in contrast, the "underdog" rules stipulate that the team with the lower win-loss ratio will win.
Let's call the overdog sequence with $T$ teams the sequence $O(T)$, and each term in the sequence will be identified as $O_{r,w}(T)$ where $r$ denotes the round and $w$ denotes the number of wins at that round. For the underdog variant, we define $U(T)$ and $U_{r,w}(T)$ in the same way.
I just discovered the following properties:
$$O(2^k + 2) = U(2^k) + O(2)$$
$$O(2^k + 4) = U(2^k) + U(2) + O(2)$$
This is helpful since $U(2^k)$ and $O(2^k)$ are easy to calculate for $k$ rows, and $O(2)$ and $U(2)$ are easily calculated for all rows.
I'm wondering if we can use the binary representation of $T$ and some kind of summing procedure to improve the number of rows that we can directly calculate. It would be fascinating to try to reduce $O(T)$ and $U(T)$ to the sum of terms of the form $U(2^K)$ or $O(2^K)$ for any arbitrarily large $K >> 0$. This would give us a way to calculate arbitrarily many rows for any $T$.
Correction to Previous Update
I should edit my previous update to be more precise. The powers of two were a bit of a red herring — all that matters is $T \pmod{4}$. Here are the updated relations:
$O(4k + 2) = U(4k) + O(2)$.
$U(4k + 2) = U(4k) + U(2)$.
$O(4k + 4) = U(4k) + U(2) + O(2)$.
This means that the problem is essentially reduced to finding $U(4k)$ instead of finding $U(2k)$. Not much of a reduction, but it's progress!