Question
Define the function $f: (0, \infty) \to \mathbb{R}$ by $$f(c) = \min_{x \in \mathbb{R}^n \, : \, \|x\| = c} \|b - A x\|_2^2,$$ for $A \in \mathbb{R}^{m \times n}$ with full rank, $b \in \mathbb{R}^m$, and $\|\cdot\|$ some norm. How do I show that $f(c)$ is differentiable? Is it possible to show that $f''(c) \ge 0$ for $c$ less than $\|\hat{x}\|$, where $\hat{x}$ is defined below?
Thoughts toward solution
Plugging into the derivative definition
Plugging this function into the definition of the derivative isn't illuminating.
"Geometric" interpretation
There is a clear geometric interpretation of this problem, since $$f(c) = \|b - A \hat{x} \|_2^2 + \min_{\|x\| = c} (x - \hat{x})^T (A^T A) (x - \hat{x}),$$ for $\hat{x} = (A^TA)^{+}A^T b \in \arg\min_{x \in \mathbb{R}^n} \|b - A x\|_2^2$ and $\left( \cdot \right)^+$ the pseudo-inverse. Thus, we need only consider the function \begin{align*} g(c) & = \min_{\|x\| = c} (x - \hat{x})^T (A^T A) (x - \hat{x}) \\ & = c^2 \min_{\|x\| = 1} (x - c^{-1} \hat{x})^T (A^T A) (x - c^{-1} \hat{x}) \\ & = c^2 \left( \mathbf{d}(c^{-1} \hat{x}, \Omega) \right)^2, \end{align*} where $\mathbf{d}(x,y) = \|A(x-y)\|_2$ is a pseudo-metric (and a metric if $A$ has full column rank), $\Omega = \{x \in \mathbb{R}^n \, : \, \|x\|=1 \}$, and $\mathbf{d}(x, \Omega) = \min_{y \in \Omega} \mathbf{d}(x, y)$. Notice that $g$ is differentiable if and only if the function $$c \mapsto \mathbf{d}^2(c \hat{x}, \Omega) \tag{*}$$ is differentiable for $c \in (0, \infty)$ and a fixed point $\hat{x}$, since $c \mapsto c^{-1}$ is smooth on $(0, \infty)$.
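For completeness, here is the computation behind the first display (a standard least-squares identity): the normal equations $A^T A \hat{x} = A^T b$ give $A^T(b - A\hat{x}) = 0$, so \begin{align*} \|b - Ax\|_2^2 & = \|(b - A\hat{x}) + A(\hat{x} - x)\|_2^2 \\ & = \|b - A\hat{x}\|_2^2 + 2(\hat{x} - x)^T A^T (b - A\hat{x}) + \|A(\hat{x} - x)\|_2^2 \\ & = \|b - A\hat{x}\|_2^2 + (x - \hat{x})^T (A^T A)(x - \hat{x}), \end{align*} and minimizing both sides over $\{\|x\| = c\}$ yields the decomposition.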
A very special case as an example: note that if $A^T A = I$, $\|\cdot\| = \|\cdot\|_1,$ and $|\hat{x}_j| = |\hat{x}_k|$ for all $j,k$, then the projection $\Pi_\Omega(c\hat{x}) = \mathrm{sgn}(\hat{x})/n$, and $\mathbf{d}^2(c\hat{x}, \Omega) = \frac{\left( c \|\hat{x}\|_1 - 1 \right)^2}{n}$, which is differentiable in $c$ and has positive second derivative.
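To spell out that computation (a sketch under the stated assumptions): write $\hat{x} = a\,s$ with $s = \mathrm{sgn}(\hat{x})$ and $a = \|\hat{x}\|_1/n$. By symmetry the nearest part of $\Omega$ to $c\hat{x} = ca\,s$ is the face $\{x : s^T x = 1, \ \mathrm{sgn}(x) = s\}$, and projecting onto it gives \begin{align*} \Pi_\Omega(c\hat{x}) & = c\hat{x} - \frac{s^T (c\hat{x}) - 1}{\|s\|_2^2}\, s = \frac{s}{n}, \\ \mathbf{d}^2(c\hat{x}, \Omega) & = \left\| ca\,s - \tfrac{s}{n} \right\|_2^2 = n \left( ca - \tfrac{1}{n} \right)^2 = \frac{(c\|\hat{x}\|_1 - 1)^2}{n}. \end{align*} Consequently $g(c) = c^2\, \mathbf{d}^2(c^{-1}\hat{x}, \Omega) = (\|\hat{x}\|_1 - c)^2/n$, which is also convex in $c$.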
Later, I added this more general example:
We will re-express $\hat{x}$ in a basis where each basis vector is orthogonal to a face of the $\ell_1$ norm level set. Assume that $\|c \hat{x} \|_1 > 1$, so that the point $c \hat{x}$ lies outside the unit ball, and assume without loss of generality that $\hat{x}_j > 0$ for all $j$. Then, since each component of $c\hat{x}$ now measures its distance from a face, we have $$\mathbf{d}(c \hat{x}, \Omega) = \left[ \sum_{j=1}^n (c \hat{x}_j - 1)^2 \, \mathbf{1}_{c > d_j} \right]^{1/2},$$ where $\{d_j\}$ is the set of knots of $c \mapsto \mathbf{proj}_\Omega(c\hat{x})$. Therefore, the distance $\mathbf{d}$ has a derivative which increases across each knot, so it is convex.
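Here is a small numerical check of this convexity claim (my own sketch, not part of the argument; it assumes $A^T A = I$, so $\mathbf{d}$ is Euclidean, and uses the fact that a point outside the unit $\ell_1$-ball has the same distance to the sphere $\Omega$ as to the ball):

```python
# Numerical sanity check: evaluate c -> d^2(c * xhat, Omega) on a grid
# of c with ||c * xhat||_1 > 1, where the distance to the l1-sphere
# equals the distance to the l1-ball, and test convexity via second
# differences. Names here (project_l1_ball, xhat) are my own.
import numpy as np

def project_l1_ball(v, radius=1.0):
    """Euclidean projection of v onto {x : ||x||_1 <= radius}."""
    if np.abs(v).sum() <= radius:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]                  # magnitudes, descending
    cumulative = np.cumsum(u)
    ks = np.arange(1, v.size + 1)
    rho = ks[u > (cumulative - radius) / ks].max()  # last active index
    theta = (cumulative[rho - 1] - radius) / rho    # soft-threshold level
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

rng = np.random.default_rng(0)
xhat = rng.normal(size=8)

# Grid of c keeping c * xhat strictly outside the unit ball, as assumed.
cs = np.linspace(1.5, 6.0, 400) / np.abs(xhat).sum()
d2 = np.array([np.sum((c * xhat - project_l1_ball(c * xhat)) ** 2)
               for c in cs])

# Nonnegative second differences (up to rounding) indicate convexity.
print("min second difference:", np.min(d2[:-2] - 2 * d2[1:-1] + d2[2:]))
```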
Perhaps this is a convex program on part of its domain?
I also think that it could be true that $f(c)$ is decreasing on $(0, \|\hat{x}\|)$, so that $$f(c) \stackrel{?}{=} \min_{x \in \mathbb{R}^n \, : \, \|x\| \leq c} \|b - A x\|_2^2,$$ for $c \in (0, \|\hat{x}\|)$. This would make the problem a convex program and hence more amenable to analysis. I think this could be true since, as $c$ increases within $(0, \|\hat{x}\|)$, the value $f(c)$ gets "closer" to the unconstrained minimum $\|b - A \hat{x}\|_2^2$.
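A sketch of why I believe the boundary is active (assuming $A$ has full column rank, so the unconstrained minimizer is unique): if the minimum over the ball $\{\|x\| \le c\}$ were attained at an interior point $x^*$, then $x^*$ would be a local, hence global, minimizer of the convex objective, forcing $x^* = \hat{x}$ and contradicting $\|x^*\| < c < \|\hat{x}\|$. Granting the displayed identity, convexity of $f$ on $(0, \|\hat{x}\|)$ follows by infimal projection: if $x_1, x_2$ attain the ball minima at radii $c_1, c_2$, then $\|\theta x_1 + (1-\theta) x_2\| \le \theta c_1 + (1-\theta) c_2$, so $$f(\theta c_1 + (1-\theta) c_2) \le \|b - A(\theta x_1 + (1-\theta) x_2)\|_2^2 \le \theta f(c_1) + (1-\theta) f(c_2),$$ which gives $f'' \ge 0$ wherever the second derivative exists.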
Further comments on the question
If possible, it would be interesting to know how general the norm $\|\cdot\|$ can be while the result remains provable. If it helps to simplify the problem, I'm particularly interested in the case $\|x\| = \|x\|_1$. As Rodrigo kindly pointed out, the case $\|x\| = \|x\|_2$ follows from noticing that the "ridge regression" estimator, and hence $f(c)$, has a closed form.
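For completeness, my reconstruction of that argument: for $0 < c < \|\hat{x}\|_2$ (and $\hat{x} \neq 0$), the KKT conditions for the ball-constrained problem yield the ridge solution $$x(\lambda) = (A^T A + \lambda I)^{-1} A^T b, \qquad \lambda \ge 0 \ \text{chosen so that} \ \|x(\lambda)\|_2 = c.$$ Writing $\|x(\lambda)\|_2^2$ in terms of the singular values of $A$ shows it is smooth and strictly decreasing in $\lambda$, so $c \mapsto \lambda(c)$ is smooth by the inverse function theorem, and hence so is $f(c) = \|b - A x(\lambda(c))\|_2^2$.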