The image below shows Balloon's pseudocode from its paper; the parameter relevant to my question is $\delta$.
Question: What is its impact on memory hardness?
I see it this way: $\delta=1$ gives you one random block for every predictable block, whereas $\delta=5$ gives you five random blocks for every predictable block. So increasing $t$ gives you more of both the predictable and the unpredictable blocks, while increasing $\delta$ gives you more of just the unpredictable blocks. Both cost time, so there's a tradeoff in how you spend it.
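To make that counting concrete, here is a minimal sketch. It assumes a simplified model in which each of the $s$ blocks is updated once per round, and each update reads one predictable block plus $\delta$ pseudorandomly chosen blocks. The function name `mixing_dependencies` is made up for illustration and is not from the paper.

```python
def mixing_dependencies(s: int, t: int, delta: int) -> dict:
    """Count block reads in the mixing phase under a simplified model.

    Assumption (not taken verbatim from the paper): every one of the
    s blocks is updated once in each of the t rounds, and each update
    reads 1 predictable block and delta pseudorandom blocks.
    """
    updates = s * t
    return {
        "predictable": updates,            # grows with t only
        "unpredictable": updates * delta,  # grows with both t and delta
    }


# Doubling t doubles both counts; raising delta only raises the second one.
print(mixing_dependencies(s=1024, t=3, delta=1))
print(mixing_dependencies(s=1024, t=3, delta=5))
```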
As I understand it, the unpredictable blocks are what ensure that calculating the hash without enough memory is expensive; more of them means a bigger penalty for low-memory implementations. The authors make no security guarantees for any $\delta$ less than three, so use a value of three or above.
There are also attacks beyond the ones the authors anticipated. The best attacks I know of are the ones presented in the paper "Towards Practical Attacks on Argon2i and Balloon Hashing", and the quality of that attack decreases as the number of hashes per block increases. The authors recommend a higher $t$ value.
Edit with clarification:
The time to calculate the hash with enough memory is proportional to $t\cdot s$; doing it with less memory incurs a time penalty. The authors give a proof that with $\delta \geq 3$, the time to calculate the hash with less memory, $s'<\frac{s}{32}$, is proportional to $t\cdot\frac{s^2}{32s'}$, which means a penalty factor of $\frac{s}{32s'}$ relative to $t\cdot s$. As I understand it, $\delta<3$ means a smaller penalty factor and $\delta>3$ a bigger one. Increasing $t$ does not increase the penalty factor; it only increases the overall run time. Since we assume that the adversary has far more computing power than we do, but not correspondingly more memory, we absolutely want a good penalty factor.
Example with $s'=\frac{s}{10^4}$:
$\delta=0$ gives us a total time proportional to $t\cdot s$
$\delta=3$ gives us a total time proportional to $t\cdot\frac{s^2}{32s'} = t\cdot s\cdot\frac{10^4}{32}$
(I'm using $\delta=0$ in the example because it means no penalty factor and therefore easier maths. $\delta=1$ should be somewhere in between $\delta=0$ and $\delta=3$, but I'm not good enough at maths to say so definitively.)
So in this example, $\delta=3$ is $\frac{10^4}{32}\approx 300$ times slower for the adversary than $\delta=0$. To get the same effect by raising $t$ instead, we would have to multiply $t$ by about 300, and that would mean very slow hashing for the honest parties.
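The arithmetic in the example can be checked with a few lines of Python. This is only a sketch of the bound as quoted above (valid for $\delta \geq 3$ and $s'<\frac{s}{32}$); the function name `penalty_factor` and the choice $s=2^{20}$ are my own for illustration, not from the paper.

```python
def penalty_factor(s: float, s_prime: float) -> float:
    """Slowdown s / (32 * s') for an attacker using s' blocks instead of s.

    Follows the bound as quoted in this answer; it is only meaningful
    for delta >= 3 and 0 < s' < s / 32.
    """
    if not 0 < s_prime < s / 32:
        raise ValueError("bound only applies for 0 < s' < s/32")
    return s / (32 * s_prime)


s = 2**20                # hypothetical buffer size in blocks
s_prime = s / 10**4      # attacker uses 1/10000 of the memory
factor = penalty_factor(s, s_prime)

# Matching this slowdown with t alone means multiplying t by the same factor.
print(f"penalty factor: about {factor:.1f}")
```

Note that the factor depends only on the ratio $\frac{s}{s'}$, so the particular value of $s$ chosen here does not matter.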