In a book I'm reading (Convex Optimization by Boyd and Vandenberghe) it says

I'm struggling to understand the last sentence. Why can one conclude concavity from having a pointwise infimum of a family of affine functions?
Because the Lagrangian $L(x,\lambda,\nu)$ is affine in $\lambda$ and $\nu$, the Lagrange dual function $d(\lambda,\nu) = \inf_{x\in \mathcal{D}}L(x,\lambda,\nu)$ is always concave: it is the pointwise infimum of a family of affine functions, and such an infimum is always concave. (Dually, the pointwise supremum of a family of convex functions is convex.)
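To make this concrete, here is a minimal numerical sketch (my own toy problem, not from the book): minimize $x^2$ subject to $x \ge 1$, so $L(x,\lambda) = x^2 + \lambda(1-x)$, which is affine in $\lambda$ for each fixed $x$, and $d(\lambda) = \lambda - \lambda^2/4$ in closed form.

```python
import numpy as np

# Toy problem (an assumption for illustration): minimize x^2 subject to x >= 1.
# Lagrangian L(x, lam) = x**2 + lam*(1 - x); dual d(lam) = inf_x L(x, lam).
xs = np.linspace(-10.0, 10.0, 4001)   # grid standing in for x in D
lams = np.linspace(0.0, 8.0, 401)     # multiplier values

# Each fixed x gives an affine function of lam: slope (1 - x), intercept x**2.
L = xs[:, None] ** 2 + lams[None, :] * (1.0 - xs[:, None])
d = L.min(axis=0)                     # pointwise infimum over the family

# Closed form for this problem: d(lam) = lam - lam**2/4, attained at x = lam/2.
assert np.allclose(d, lams - lams**2 / 4, atol=1e-3)

# Concavity check: second differences of a concave function are <= 0.
assert np.all(np.diff(d, 2) <= 1e-9)
print("pointwise infimum of affine functions in lambda is concave")
```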
The book referenced is Convex Optimization by Boyd and Vandenberghe. To better see the "pointwise infimum", consider a slight change/abuse of notation: $L_x(\xi) = L(x, \lambda, \nu)$ where $\xi = (\lambda, \nu)$. For a fixed $x$, $L_x(\xi)$ is affine in $\xi$, so $\{L_x \,:\, x \in \mathcal{D}\}$ is a family of affine functions and its pointwise infimum is $$g(\xi) = \inf \,\{L_x(\xi)\,:\,x\in\mathcal{D}\}.$$ Now we can use @A.Γ.'s pointer to show that $g$ is concave by showing that the epigraph of $-g$ is convex. A point $(\xi, t)$ lies in $\operatorname{epi}(-g)$ iff $t \ge -g(\xi) = \sup_x -L_x(\xi)$, i.e. iff $t \ge -L_x(\xi)$ for every $x$. Hence $\operatorname{epi}(-g) = \bigcap_x \operatorname{epi}(-L_x)$, an intersection of convex sets (each $-L_x$ is affine), so $\operatorname{epi}(-g)$ is convex and $g$ is concave.
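For comparison, concavity also falls out of the definition directly, without epigraphs (a standard one-line argument, not specific to the book's notation): for any $\xi_1, \xi_2$ and $\theta \in [0,1]$, since each $L_x$ is affine,
$$g(\theta\xi_1 + (1-\theta)\xi_2) = \inf_{x\in\mathcal{D}} \bigl[\theta L_x(\xi_1) + (1-\theta) L_x(\xi_2)\bigr] \ge \theta \inf_{x\in\mathcal{D}} L_x(\xi_1) + (1-\theta) \inf_{x\in\mathcal{D}} L_x(\xi_2) = \theta g(\xi_1) + (1-\theta) g(\xi_2).$$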
Concavity of the dual function is very much a non-intuitive property.
One way to show it is to use the fact that a function is convex if and only if its epigraph is a convex set. The epigraph of a function $f(\vec x)$ is the set of points 'above' that function: $$\left\{(\vec x,y) \mid y \geq f(\vec x)\right\}$$
For the dual function we have the pointwise infimum of a family of affine functions: $$D(\vec \lambda, \vec \nu) = \inf_{\vec x} \mathcal{L}(\vec x,\vec \lambda, \vec \nu) = \inf_{\vec x} \left( \vec a(\vec x)^\top \begin{bmatrix} \vec \lambda \\ \vec \nu \end{bmatrix} + b(\vec x) \right)$$
That is, for each fixed $\vec x$ we can re-write the Lagrangian (a scalar) in the form $\vec a^\top \begin{bmatrix} \vec \lambda \\ \vec \nu \end{bmatrix} + b$ for some vector $\vec a$ and scalar $b$ that both depend on $\vec x$.
Loosely speaking, 'pointwise' means the infimum is taken separately at each point $(\vec \lambda, \vec \nu)$: we may pick a different minimizing $\vec x$ depending on the value of $\vec \lambda$ and $\vec \nu$.
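As a concrete instance (just the book's standard Lagrangian, written out): with $\mathcal{L}(\vec x, \vec\lambda, \vec\nu) = f_0(\vec x) + \sum_{i} \lambda_i f_i(\vec x) + \sum_{j} \nu_j h_j(\vec x)$, the affine form is
$$\mathcal{L}(\vec x,\vec\lambda,\vec\nu) = \underbrace{\begin{bmatrix} f_1(\vec x) & \cdots & f_m(\vec x) & h_1(\vec x) & \cdots & h_p(\vec x)\end{bmatrix}}_{\vec a(\vec x)^\top} \begin{bmatrix}\vec\lambda \\ \vec\nu\end{bmatrix} + \underbrace{f_0(\vec x)}_{b(\vec x)}.$$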
For any given value of $\vec x$, the epigraph of $\mathcal{L}$ (viewed as a function of $(\vec\lambda, \vec\nu)$) is a convex set: once $\vec x$ is fixed the function is affine, and affine functions are both convex and concave. If we flip this notion, we can look at negative epigraphs (hypographs), the set of points 'below' the function. The negative epigraph of $D$ is the intersection of the negative epigraphs of all the affine functions obtained by fixing each possible value of $\vec x$. The intersection of convex sets is a convex set, so this negative epigraph is convex. The negative epigraph of a function is a convex set if and only if the function is concave, so the dual function $D$ must be concave!
It helps a bit to draw this out on a sheet of paper.
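If you would rather have the computer draw it, here is a small matplotlib sketch (my own illustration, using the assumed toy problem minimize $x^2$ subject to $x \ge 1$, so there is no $\vec \nu$): each fixed $x$ gives a line in $\lambda$, and the dual is their concave lower envelope.

```python
import numpy as np
import matplotlib.pyplot as plt

# Toy problem (assumed for illustration): minimize x^2 subject to x >= 1.
# For each fixed x, L(x, lam) = x**2 + lam*(1 - x) is a line in lam.
lam = np.linspace(0.0, 4.0, 200)
for x in [0.0, 0.5, 1.0, 1.5, 2.0]:
    plt.plot(lam, x**2 + lam * (1.0 - x), "--", label=f"x = {x}")

# The dual D(lam) is the pointwise infimum over all x: a concave lower envelope.
xs = np.linspace(-5.0, 5.0, 2001)
D = (xs[:, None] ** 2 + lam[None, :] * (1.0 - xs[:, None])).min(axis=0)
plt.plot(lam, D, "k", linewidth=3, label="dual = pointwise inf")

plt.xlabel(r"$\lambda$")
plt.ylabel("value")
plt.legend()
plt.show()
```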
I think it's easier to visualize the maximization case, in which the dual is a sup and hence convex. Say you change a multiplier in the direction that relaxes its constraint: the Lagrangian, being affine in the multiplier, improves at least linearly with the change, and there is probably room for improvement beyond linear because of the relaxation, hence convexity. Indeed, the constraint function evaluated at the current optimum is a subgradient of the dual. In terms of the epigraph, after changing the multiplier you will stay at least on the same affine member of the family, but you may also move up to the next one.
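A one-line check of that subgradient claim (standard argument; I write $L(x,\lambda) = f_0(x) + \lambda^\top f(x)$ with $f$ collecting the constraint functions, and use the sup convention $g(\lambda) = \sup_x L(x,\lambda)$, so $g$ is convex): if $x^\star$ attains the sup at $\lambda_0$, then for any $\lambda$
$$g(\lambda) \ge L(x^\star, \lambda) = L(x^\star, \lambda_0) + f(x^\star)^\top(\lambda - \lambda_0) = g(\lambda_0) + f(x^\star)^\top(\lambda - \lambda_0),$$
which is exactly the statement that $f(x^\star)$, the constraint values at the current optimum, is a subgradient of $g$ at $\lambda_0$.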