I've seen it claimed in at least a couple of places that swapping the min and max in a min-max expression is equivalent to, or at least implied by, strong duality.
I'm only really familiar with strong duality in the context of Convex Optimization by Boyd and Vandenberghe. In that context, though, the dual problem contains terms related to the feasible set, so I don't see how strong duality amounts to a simple swap of min and max.
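To make my confusion concrete, here is the picture of duality I have from that book (standard material, so hopefully I'm stating it correctly). For a primal problem $\min_x f_0(x)$ subject to $f_i(x) \le 0$, with Lagrangian $L(x,\lambda) = f_0(x) + \sum_i \lambda_i f_i(x)$:

\begin{align}
p^\star &= \min_{x} \; \max_{\lambda \succeq 0} \; L(x, \lambda), \\
d^\star &= \max_{\lambda \succeq 0} \; \min_{x} \; L(x, \lambda),
\end{align}

where the first line recovers the primal because $\max_{\lambda \succeq 0} L(x,\lambda) = f_0(x)$ when $x$ is feasible and $+\infty$ otherwise. Weak duality says $d^\star \le p^\star$ always, and strong duality says $d^\star = p^\star$, i.e. the min and max of the Lagrangian can be swapped. But this swap involves the Lagrangian, whereas the expressions below involve only the original objective.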
For example, this answer on Math.SE states that asking when we can swap min and max is equivalent to asking when strong duality holds.
Furthermore, in the supplemental material of the paper Robust Classification Under Sample Selection Bias (NeurIPS 2014), the authors state that
\begin{equation} \min_{\hat{P}(Y|X)\in\Delta} \max_{\check{P}(Y|X)\in\Delta\cap\Xi} \mathbb{E}_{P_{trg}(x)\check{P}(y|x)}\left[-\log{\hat{P}(Y|X)}\right] \end{equation}
is equivalent to
\begin{equation} \max_{\check{P}(Y|X)\in\Delta\cap\Xi} \min_{\hat{P}(Y|X)\in\Delta} \mathbb{E}_{P_{trg}(x)\check{P}(y|x)}\left[-\log{\hat{P}(Y|X)}\right] \end{equation}
when $\Xi$ is convex and a solution exists in the relative interior of the set. They state that this is because strong duality holds, which allows them to switch the order of the min and max.
This justification sounds like Slater's condition, but I don't see how the second expression is the dual of the first.
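As a quick numerical sanity check (a toy sketch of my own, not from the paper), the inequality $\max\min \le \min\max$ seems to always hold, but equality can fail when the objective is not convex-concave:

```python
import numpy as np

# Compare min-max vs max-min on a grid for two toy objectives f(x, y),
# with x the minimizing variable and y the maximizing variable.
x = np.linspace(0.0, 1.0, 101)  # grid includes 0.0, 0.5, 1.0
y = np.linspace(0.0, 1.0, 101)

# NOT convex-concave: f(x, y) = (x - y)^2 is convex in both arguments.
F1 = (x[:, None] - y[None, :]) ** 2
minmax1 = F1.max(axis=1).min()  # min over x of max over y -> 0.25 (at x = 0.5)
maxmin1 = F1.min(axis=0).max()  # max over y of min over x -> 0.0  (pick x = y)

# Convex-concave: f(x, y) = x^2 - y^2 (convex in x, concave in y).
F2 = x[:, None] ** 2 - y[None, :] ** 2
minmax2 = F2.max(axis=1).min()  # -> 0.0
maxmin2 = F2.min(axis=0).max()  # -> 0.0

print(minmax1, maxmin1)  # gap: 0.25 vs 0.0
print(minmax2, maxmin2)  # equal: 0.0 and 0.0
```

So the swap clearly needs some structural condition (convexity in one argument, concavity in the other, as in the paper's convexity assumption on $\Xi$), which makes me suspect the connection to strong duality runs through a minimax theorem rather than the Lagrangian dual I'm familiar with.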
My guess is that I'm missing something obvious. Could someone shed some light on how switching the min and max in a min-max expression is related to the concept of strong duality?