16

Imagine two battalions of $N \gg 1$ soldiers each, shooting at each other at random and simultaneously: each soldier of battalion $A$ chooses a soldier of battalion $B$ uniformly at random, and each soldier of battalion $B$ likewise chooses a soldier of battalion $A$. Then everyone fires at the same time; the bullets cross, and every shot hits and kills its target. The battalions keep shooting in rounds until all the soldiers of at least one battalion have been killed. When $N$ is large, can one compute an asymptotic estimate of the size of the surviving army, such as its expectation and (possibly) its standard deviation $\sigma$?

It is possible to derive a very complicated explicit formula for the number of soldiers left standing, using compositions and numbers of surjections (if there are $k$ soldiers left standing, then this corresponds to a composition of $N$ and a composition of $N-k$), but computing it is very time consuming because the number of compositions grows exponentially, so I was not able to get exact numbers for $N \geq 15$.

However, computer simulations seem to indicate that the expectation probably grows like $N^{1/2}$ or $N^{2/3}$. Is there an analytic expression for the exponent? If, instead of killing with probability $p=1$, each shot kills with probability $p \in (0,1)$, is there a formula for the exponent depending on $p$? Or, for instance, when $p \ll 1$?

I am asking this question in case somebody is already acquainted with the problem or knows references, as I have heard of it in class (a very long time ago) and if I remember correctly this was very difficult (elite students were struggling on it together, so if they couldn't come up with an answer, I won't either).

This is not a homework question. I have a feeling this problem was lifted from somewhere and would like to find the reference. Otherwise, if someone knows the answer, I'd love to know about it.

___ Late edit ___

Looking at the comments, it seems some people would like to see my own numbers to check whether we agree. I had not posted them because they come mainly from numerical computations, which might be wrong (though I don't think so: I computed the probabilities individually and symbolically in $\mathbb{Q}$ for small $N$, they sum to exactly $1$, and they also match the comments!).

Sorry in advance, this is going to use some space, but having references can only be beneficial to make sure one is on the right track, at least numerically speaking.

As commented by user @Varun Vejalla, the probability of transitioning from state $(M,N)$ to state $(M-m,N-n)$ is indeed given by

$$\frac{M!N!S(N,m)S(M,n)}{(M-m)!(N-n)!M^NN^M}$$

where $S$ denotes the Stirling numbers of the second kind.

To see this, it is easier to think in terms of surjections. The number of surjections from a set of $m$ elements to a set of $n$ elements is $S(m,n)n!$ so the formula above converts to

$$\frac{1}{M^N}\frac{1}{N^M}\binom{M}{m}\binom{N}{n}\#\text{Surj}(M,n)\#\text{Surj}(N,m)$$

which is simpler to understand. Using this formula and Matlab code, I was able to derive the probabilities $P(k)$ for $0 \leq k < N$ for $N$ up to $14$ with exact precision (values in $\mathbb{Q}$).
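For readers who want to reproduce these numbers, here is a minimal Python sketch of the same computation. It is not the Matlab code mentioned above, and the function names (`stirling2`, `hit_distribution`, `survivor_distribution`) are hypothetical, chosen for illustration. It uses the surjection form of the transition probability and exact rational arithmetic, and it assumes that a battle ending with both armies wiped out counts as $k = 0$ survivors.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb, factorial


@lru_cache(maxsize=None)
def stirling2(n, k):
    """Stirling number of the second kind S(n, k)."""
    if n == k:
        return 1
    if k == 0 or k > n:
        return 0
    return k * stirling2(n - 1, k) + stirling2(n - 1, k - 1)


@lru_cache(maxsize=None)
def hit_distribution(shooters, targets):
    """Tuple whose j-th entry is P(exactly j distinct targets are hit).

    Counts maps from shooters to targets with image of size j:
    C(targets, j) * #Surj(shooters, j), divided by targets**shooters.
    """
    total = targets ** shooters
    return tuple(
        Fraction(comb(targets, j) * factorial(j) * stirling2(shooters, j), total)
        for j in range(targets + 1)
    )


def survivor_distribution(n_start):
    """List whose k-th entry is P(k soldiers are left standing), k = 0..n_start."""

    @lru_cache(maxsize=None)
    def solve(a, b):
        out = [Fraction(0)] * (n_start + 1)
        if a == 0 or b == 0:
            out[max(a, b)] = Fraction(1)  # battle is over
            return tuple(out)
        # m soldiers of A are hit by the b shooters of B, and vice versa.
        for m, pm in enumerate(hit_distribution(b, a)):
            if pm == 0:
                continue
            for n, pn in enumerate(hit_distribution(a, b)):
                if pn == 0:
                    continue
                for k, pk in enumerate(solve(a - m, b - n)):
                    out[k] += pm * pn * pk
        return tuple(out)

    return list(solve(n_start, n_start))


def expected_survivors(n_start):
    """Exact expectation of the number of survivors as a Fraction."""
    return sum(k * p for k, p in enumerate(survivor_distribution(n_start)))


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, float(expected_survivors(n)))
```

For $N=3$ this sketch gives $[1/2,\ 73/162,\ 4/81]$ (plus a trailing zero for $k=N$), in agreement with the values listed in this question. The recursion terminates because each side loses at least one soldier per round.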

The first few values for $N=1,2,\ldots,5$ (each row) and $k = 0,1,\ldots,N-1$ were

$\small 1$

$\small [1/2, 1/2]$

$\small [1/2, 73/162, 4/81]$

$\small [195/512, 354691/663552, 54197/663552, 3/1024]$

$\small [1534477969/5400000000, 148142685571/259200000000, 35051681717/259200000000, 10735369/1200000000, 48/390625]$

I cannot include the remaining ones unfortunately because the numerators and denominators are egregiously large. However, I have the following approximate values:

$\small 1.0$

$\small [0.5, 0.5]$

$\small [0.5, 0.451, 0.0494]$

$\small [0.381, 0.535, 0.0817, 0.00293]$

$\small [0.284, 0.572, 0.135, 0.00895, 1.23 \times 10^{-4}]$

$\small [0.247, 0.531, 0.2, 0.0211, 7.0 \times 10^{-4}, 3.97 \times 10^{-6}]$

$\small [0.237, 0.476, 0.242, 0.0429, 0.00235, 4.28 \times 10^{-5}, 1.04 \times 10^{-7}]$

$\small [0.231, 0.433, 0.262, 0.0679, 0.0065, 2.02 \times 10^{-4}, 2.15 \times 10^{-6}, 2.29 \times 10^{-9}]$

$\small [0.221, 0.406, 0.268, 0.0911, 0.0134, 7.56 \times 10^{-4}, 1.42 \times 10^{-5}, 9.19 \times 10^{-8}, 4.35 \times 10^{-11}]$

$\small [0.206, 0.393, 0.266, 0.11, 0.0223, 0.00202, 7.11 \times 10^{-5}, 8.43 \times 10^{-7}, 3.41 \times 10^{-9}, 7.26 \times 10^{-13}]$

$\small [0.189, 0.386, 0.264, 0.125, 0.0324, 0.00417, 2.45 \times 10^{-4}, 5.62 \times 10^{-6}, 4.36 \times 10^{-8}, 1.12 \times 10^{-10}, 1.08 \times 10^{-14}]$

$\small [0.172, 0.38, 0.262, 0.135, 0.0427, 0.00726, 6.25 \times 10^{-4}, 2.49 \times 10^{-5}, 3.83 \times 10^{-7}, 2.0 \times 10^{-9}, 3.28 \times 10^{-12}, 1.45 \times 10^{-16}]$

$\small [0.158, 0.372, 0.263, 0.143, 0.0523, 0.0112, 0.0013, 7.82 \times 10^{-5}, 2.17 \times 10^{-6}, 2.29 \times 10^{-8}, 8.27 \times 10^{-11}, 8.71 \times 10^{-14}, 1.76 \times 10^{-18}]$

$\small [0.146, 0.362, 0.264, 0.149, 0.0609, 0.0157, 0.00235, 1.95 \times 10^{-4}, 8.37 \times 10^{-6}, 1.65 \times 10^{-7}, 1.22 \times 10^{-9}, 3.11 \times 10^{-12}, 2.11 \times 10^{-15}, 1.98 \times 10^{-20}]$

and for the expectations (for $N = 1, 2, \ldots, 14$), I am getting

$0.0$

$0.5$

$0.549382716$

$0.7066770954$

$0.8693285884$

$0.9969961065$

$1.098679882$

$1.187799867$

$1.272778757$

$1.356679482$

$1.439572893$

$1.520586275$

$1.598895615$

$1.674044657$

So it seems like I am getting the same results as obtained in the comments (@ploosu2). However, I have not been able to find an asymptotic estimate other than through sheer computation (random trials, empirical mean, etc.)

Evariste
  • 2,847
  • Assuming each soldier stands shoulder-to-shoulder with no obstructions between the battalions and landing every shot, both forces will annihilate each other completely. The probability that battalion A wins is the number of soldiers, $N(A)$, defeated by $N(B)$, divided by the total number of soldiers remaining. This is written as, $P(A)=\frac{N(A-B)}{N(A\cup B)}= \frac{N(A)-N(B)}{N(A)+N(B)}$, which is Bertrand's ballot theorem. Maybe look into lattice path enumerations? – Tayler Montgomery Mar 15 '25 at 23:41
  • 3
    @TaylerMontgomery This is not true in my case as I do not assume the armies to coordinate their effort, they just pick uniformly at random at each turn (independently). So for instance, if N = 30, the 30 soldiers of battalion $A$ could pick, say, a single soldier of battalion $B$ with probability $\frac{1}{30^{30}}$, and all the soldiers of battalion $B$ could injectively pick all soldiers of $A$, the result of the shooting would be $29$ soldiers of $B$ remaining. (Of course, this is very unlikely, but the key here is that the soldiers pick independently, so they can shoot the same soldier) – Evariste Mar 15 '25 at 23:54
  • 1
    I would start by looking up the distributions that arise from the “coupon-collector problem”. This is like an iterated version. – Greg Martin Mar 16 '25 at 00:31
  • 3
    https://gwern.net/doc/statistics/probability/1999-kingman.pdf – Christophe Boilley Mar 16 '25 at 09:06
  • @ChristopheBoilley Ooh thank you! This must have been it. It seems like it is slightly different than what I stated, but this is because I am only recalling the problem from memory. (In the paper, it seems like soldiers don't take turns and shoot each other all at the same time, which is more realistic). This would probably be the answer to my question, if it were not for my failing memory (it's been years...). I wish I could thank you more for this – Evariste Mar 16 '25 at 11:44
  • @Evariste My pleasure. – Christophe Boilley Mar 16 '25 at 12:44
  • 1
    You might be interested in Lanchester's Square Law. – awkward Mar 16 '25 at 13:07
  • 1
    Are the following expected values equal to what you're getting? Graph at Desmos: https://www.desmos.com/calculator/qvrspxvsvs (for example for $N=14$ the value $1.674045$) – ploosu2 Apr 15 '25 at 09:39
  • 1
    The probability of going from state $(M,N)$ to $(M-m, N-n)$ [the states here are the sizes of each army] is $\frac{M!N!S(N,m)S(M,n)}{(M-m)!(N-n)!M^NN^M}$ where $S(a,b)$ are the Stirling numbers of the second kind. – Varun Vejalla Apr 15 '25 at 20:48
  • @VarunVejalla Yes! Thank you for this, I updated the body of the question to account for it (I think it is easier to understand by counting surjections, which is equivalent to the Stirling numbers of the second kind). – Evariste Apr 15 '25 at 22:47
  • @ploosu2 Yes! I have updated the body of the question with numerical computations. – Evariste Apr 15 '25 at 22:47

2 Answers

2

This is a nice, elegant variation on the OK Corral model (with references linked in the other answer)... even if it's a variation only because OP didn't remember the model precisely! (And it's arguably closer to what would actually happen on a battlefield.) I'll present an analysis that I believe is qualitatively correct, and which I even hope leads to the correct analytic exponent... but it's not fully rigorous, and I remain very interested to see if it can be further shored up.

Let's first consider what happens when battalions of size $M, N \gg 1$ face off against each other for a single round. Each soldier in the $M$-battalion survives with probability $$ p_{M,N}=\left(1-\frac{1}{M}\right)^N=\left(\left(1-\frac{1}{M}\right)^M\right)^{N/M}\approx \left(\frac{1}{e}\right)^{N/M}=e^{-N/M}; $$ and of course $p_{N,M}\approx e^{-M/N}$ by symmetry. These survival events are very nearly independent in this regime (knowing that one soldier survived tells you almost nothing about a different soldier), so we can treat this as a repeated binomial trial: the expected number of survivors is $M e^{-N/M}$, and the variance is $M e^{-N/M} (1 - e^{-N/M})$. Putting this together, we have $$ M \rightarrow Me^{-N/M} \pm \sqrt{M e^{-N/M}(1-e^{-N/M})}, \\ N \rightarrow Ne^{-M/N} \pm \sqrt{N e^{-M/N}(1-e^{-M/N})} $$ after a single round. Ignoring the $\pm$ terms for the moment, we see that the ratio of strengths goes from $M/N$ to $(M/N)\exp(M/N-N/M)$ in a single round. Once $M/N=\alpha$ differs significantly from $1$, the battle will soon be over: $\alpha \rightarrow \alpha e^{\alpha - (1/\alpha)}$ blows up super-exponentially fast. And during those few final rounds, nothing dramatic happens to the absolute gap in forces, which we'll call $\Delta$ (equal to $|M-N|$): once $M \ll N$, each round will eliminate almost all the remaining $M$-soldiers (as they're grossly outnumbered), and just about $M$ of the $N$-soldiers, leaving $\Delta$ about the same.

So we are mainly concerned with the regime where $M/N = 1+\delta$, with the relative gap $|\delta|\ll 1$. In that regime (large forces, very closely matched), we can simplify to $$ M \rightarrow \frac{M}{e}(1+\delta)\pm c\sqrt{M},\qquad N\rightarrow \frac{N}{e}(1-\delta) \pm c\sqrt{N}, $$ where $c$ is a fixed constant. This yields $$ 1+\delta = M/N \rightarrow (M/N)(1+\delta)/(1-\delta) \approx 1+3\delta: $$ the relative gap increases by a factor of $3$ each round. Because $M$ and $N$ themselves are decreasing by a factor of $e$, that means the absolute gap $\Delta$ increases by a factor of $3/e$ each round. Now we need to know what the initial value of $\Delta$ looks like (once it diverges from $0$), and also how many rounds transpire before we enter the final few rounds where one side is dominating.
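As a quick numerical sanity check of the factors $3$ and $3/e$, here is a small Python sketch that applies the noise-free update $M \to Me^{-N/M}$, $N \to Ne^{-M/N}$ once and compares the gaps before and after. The function names and the default starting size are hypothetical choices for illustration; the $\pm$ fluctuation terms are deliberately ignored.

```python
import math


def mean_field_round(m, n):
    """One deterministic round: each side keeps its expected number of survivors."""
    return m * math.exp(-n / m), n * math.exp(-m / n)


def one_round_growth(delta0, size=1e6):
    """Return (growth of relative gap, growth of absolute gap) over one round.

    Starts from M = size * (1 + delta0), N = size, with |delta0| << 1.
    """
    m, n = size * (1 + delta0), size
    m2, n2 = mean_field_round(m, n)
    delta_ratio = (m2 / n2 - 1) / (m / n - 1)  # expected to be close to 3
    gap_ratio = (m2 - n2) / (m - n)            # expected to be close to 3/e
    return delta_ratio, gap_ratio


if __name__ == "__main__":
    d_ratio, g_ratio = one_round_growth(1e-4)
    print("relative gap grows by", d_ratio)
    print("absolute gap grows by", g_ratio, "vs 3/e =", 3 / math.e)
```

For small `delta0` both ratios should land close to $3$ and $3/e \approx 1.104$ respectively; the agreement degrades as `delta0` approaches $1$, which is where the linearization stops being valid.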

I'm going to assert without proof that the initial value of $\Delta$ can be taken to be $\Theta(\sqrt{N})$, and that this will typically occur within a small number of rounds. Certainly that's the gap expected after a single round, just based on the variance of the number of survivors on each side; the issue is that $\Delta$ continues to wander back and forth randomly after the first round, while also tending to grow by a factor of $3/e$. The assertion is that a clear leader tends to emerge in just a handful of rounds, and that at that point the size of the lead is still $\Theta(\sqrt{N})$ and the sizes of the forces are still $\Theta(N)$.

At that time, then, $\delta$ will be $\Theta(N^{-1/2})$. Now, $\delta$ will grow by a factor of $3$ each round until it becomes comparable to $1$. This will take $\log_3 N^{1/2} = (1/2) \log_3 N = \log N / (2\log 3)$ rounds. And in this many rounds, $\Delta$ will increase by a factor of $$ \left(\frac{3}{e}\right)^{\log N / (2\log 3)} = N^{1/2 - 1/(2\log 3)}. $$ Together with the initial value of $\Delta$, this means that the expected number of survivors (with equally matched forces at the start) grows as $N^\gamma$, where

$$ \gamma = 1 - \frac{1}{2\log 3} \approx 0.544880. $$

The model is simple enough to simulate, of course, though running enough trials to get very small error bars is time-consuming. But numerical experimentation with $10^4 \le N \le 10^5$ appears fully consistent with this value for $\gamma$ (my best estimates place it somewhere between $0.51$ and $0.56$).
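For anyone who wants to try this, here is a minimal simulation sketch in plain Python. It is not the code behind the estimates quoted above; the function names, the seed, and the two-point log-log slope estimator are illustrative assumptions. It is also far too slow in this form for $N$ near $10^5$ with many trials, so treat it as a starting point only.

```python
import math
import random


def simulate_battle(n, rng):
    """Play one battle between two armies of size n; return the survivor count."""
    a = b = n
    while a > 0 and b > 0:
        # Each soldier picks a uniformly random target; a target picked
        # by several shooters still dies only once, hence the sets.
        hit_a = len({rng.randrange(a) for _ in range(b)})
        hit_b = len({rng.randrange(b) for _ in range(a)})
        a -= hit_a
        b -= hit_b
    return max(a, b)


def mean_survivors(n, trials, seed=0):
    """Empirical mean number of survivors over independent battles."""
    rng = random.Random(seed)
    return sum(simulate_battle(n, rng) for _ in range(trials)) / trials


def estimate_exponent(n_small, n_large, trials, seed=0):
    """Crude two-point log-log slope of the mean survivor count."""
    lo = mean_survivors(n_small, trials, seed)
    hi = mean_survivors(n_large, trials, seed + 1)
    return math.log(hi / lo) / math.log(n_large / n_small)


# Exponent predicted by the heuristic argument above.
GAMMA_PREDICTED = 1 - 1 / (2 * math.log(3))

if __name__ == "__main__":
    print("predicted exponent:", GAMMA_PREDICTED)
    print("estimated exponent:", estimate_exponent(200, 2000, trials=200))
```

A two-point slope is noisy and is affected by finite-size corrections, so a fit over many values of $N$ with error bars is a better way to compare against $\gamma$.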

mjqxxxx
  • 43,344
  • Thanks a lot for this ! I am accepting it for now despite the gaps in rigor because I put trust in it. It is very convincing and does not use heavy machinery, though there is some "physics". I also remember getting an exponent close to this value (and was wondering whether it was actually $\frac12$ or something more complicated, which the error bars seemed to indicate... so I was not very hopeful to get an analytical formula, but it turns out it exists and is elementary!). I was also thinking of the unrelated $\gamma$ from Euler which is close. – Evariste May 09 '25 at 22:06
0

This is the OK Corral model. The correct scaling is $N^{3/4}$ but much more is known. See e.g. Kingman et al (2003) or Kingman (2016).

van der Wolf
  • 5,743
  • 4
  • 13
  • 5
    No it isn't. In the OK Corral model, a single random person is selected and shoots someone on the other team (and it doesn't matter who they choose). Here everyone shoots simultaneously and all the randomness comes from how they choose targets. In particular, the OK Corral will always have at least one survivor, but OP's model might not. – Especially Lime Mar 20 '25 at 14:11
  • 1
    @EspeciallyLime You are right, my model isn't the OK Corral model, perhaps I have not been clear enough. While I was thankful to be shown related works, I am still interested in a solution to my original problem, which is probably a deformation of the OK Corral model because my recall isn't perfect. However, it seems like it behaves similarly to the OK Corral model but with a different exponent. Maybe there is no analytical formula for it though. Indeed, in my model, there could be zero survivors. – Evariste Mar 20 '25 at 14:52
  • Oh, I see, agreed! – van der Wolf Mar 21 '25 at 15:14