the function $f(y)$ is made of two pieces: $\frac{|x-y|^2}{2t}$ and $|y|$. the first piece is like a quadratic bowl centered at $y=x$. it's smooth and strictly convex since $t>0$. the second piece, $|y|$, is the euclidean norm. it's convex, but it has a sharp point at $y=0$, so it's not differentiable there.
when you add a strictly convex function and a convex function, you get a strictly convex function. this is great because it means $f(y)$ has a unique minimum point, let's call it $y^*$. also, $f(y)$ grows large as $|y|$ gets large because of the $|y|^2$ term hidden in $|x-y|^2$, so a minimum must exist.
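as a quick numerical illustration of this strict convexity, here's a small 1d sketch (the values $x = 1$, $t = 0.5$ and the test points are arbitrary illustrative choices): for distinct points, the midpoint value lies strictly below the chord average.

```python
# 1d illustration of strict convexity of f(y) = (x - y)^2/(2t) + |y|:
# the midpoint value is strictly below the chord average.
# x = 1, t = 0.5 and the points a, b are assumed illustrative values.
x, t = 1.0, 0.5
f = lambda y: (x - y) ** 2 / (2 * t) + abs(y)
a, b = -2.0, 3.0
assert f((a + b) / 2) < (f(a) + f(b)) / 2
```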
here's how we find this minimum $y^*$. the standard approach is to find where the gradient is zero. let's compute the gradient of $f(y)$ with respect to $y$:
$\nabla_y \left( \frac{|x-y|^2}{2t} \right) = \frac{1}{2t} \nabla_y\big((x-y)\cdot(x-y)\big) = \frac{1}{2t}\,(-2)(x-y) = \frac{y-x}{t}$.
the gradient of $|y|$ is $\nabla_y |y| = \frac{y}{|y|}$, but note this only exists for $y \neq 0$; at the origin, $|y|$ has its sharp point.
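both gradient formulas are easy to sanity-check with a finite difference. here's a small sketch in 2d; the point $y$ and the values of $x$ and $t$ are arbitrary illustrative choices, and the helper names are mine:

```python
# finite-difference sanity check of the two gradient formulas, assuming y != 0.
# the point y and the values x, t below are arbitrary illustrative choices.
import math

def g(y, x, t):   # quadratic piece |x - y|^2 / (2t)
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y)) / (2 * t)

def h(y):         # norm piece |y|
    return math.sqrt(sum(yi ** 2 for yi in y))

def num_grad(f, y, eps=1e-6):
    # central finite difference, one coordinate at a time
    grad = []
    for i in range(len(y)):
        yp, ym = list(y), list(y)
        yp[i] += eps
        ym[i] -= eps
        grad.append((f(yp) - f(ym)) / (2 * eps))
    return grad

x, t = [3.0, 1.0], 0.5
y = [1.0, 2.0]

grad_g = [(yi - xi) / t for yi, xi in zip(y, x)]   # analytic: (y - x)/t
grad_h = [yi / h(y) for yi in y]                   # analytic: y/|y|

assert all(abs(a - b) < 1e-4 for a, b in
           zip(num_grad(lambda z: g(z, x, t), y), grad_g))
assert all(abs(a - b) < 1e-4 for a, b in zip(num_grad(h, y), grad_h))
```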
so, let's consider two possibilities for the minimum $y^*$:
case 1: the minimum occurs at $y^* \neq 0$.
if the minimum isn't at the origin, then $f(y)$ is differentiable at $y^*$. in this case, the minimum must be where the gradient is zero:
$$ \nabla_y f(y^*) = \frac{y^*-x}{t} + \frac{y^*}{|y^*|} = 0 \quad (*) $$
this is exactly your equation (3). so, this equation relies on the assumption that the minimum $y^*$ is not zero.
let's solve (*) for $y^*$:
$$ \frac{y^*}{t} + \frac{y^*}{|y^*|} = \frac{x}{t} $$
$$ y^* \left( \frac{1}{t} + \frac{1}{|y^*|} \right) = \frac{x}{t} $$
$$ y^* = \frac{x/t}{1/t + 1/|y^*|} = \frac{x}{1 + t/|y^*|} $$
this tells us that $y^*$ must be a positive multiple of $x$. let's write $y^* = c x$ for some $c > 0$. (we need $x \neq 0$ for $y^* \neq 0$. if $x=0$, the equation becomes $\frac{y^*}{t} + \frac{y^*}{|y^*|} = 0$, which has no solution for $y^* \neq 0$ since $t>0$ and $|y^*|>0$).
substituting $y^* = c x$ into the equation (assuming $x \neq 0$):
$$ c x = \frac{x}{1 + t/|c x|} = \frac{x}{1 + t/(c|x|)} $$
divide by $x$ (since $x \neq 0$):
$$ c = \frac{1}{1 + t/(c|x|)} $$
$$ c (1 + t/(c|x|)) = 1 $$
$$ c + \frac{t}{|x|} = 1 $$
$$ c = 1 - \frac{t}{|x|} $$
so, the potential minimizer is $y^* = \left(1 - \frac{t}{|x|}\right) x$.
now, for our assumption $y^* \neq 0$ to be valid we need $c > 0$.
$c > 0 \implies 1 - \frac{t}{|x|} > 0 \implies 1 > \frac{t}{|x|} \implies |x| > t$.
so, the gradient equation (*) only makes sense and gives us a non-zero solution $y^*$ when $|x| > t$. this is why evans mentions $|x| > t$ in relation to equation (3) – it's the condition under which the minimum occurs away from the non-differentiable point $y=0$.
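before computing the minimum value, it's easy to sanity-check this candidate numerically. here's a scalar ($n=1$) sketch with assumed values $x = 2$, $t = 0.5$ (so $|x| > t$), confirming that $y^* = (1 - t/|x|)\,x$ zeroes the gradient in (*):

```python
# scalar (n = 1) check that y* = (1 - t/|x|) x zeroes the gradient in (*).
# x = 2, t = 0.5 are assumed illustrative values with |x| > t.
x, t = 2.0, 0.5
y_star = (1 - t / abs(x)) * x                   # candidate minimizer, here 1.5
grad = (y_star - x) / t + y_star / abs(y_star)  # (y* - x)/t + y*/|y*|
assert abs(grad) < 1e-12
```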
if $|x| > t$, let's calculate the minimum value $u(x,t) = f(y^*)$:
we have $y^* = (1 - t/|x|) x$.
$x - y^* = x - (1 - t/|x|) x = (t/|x|) x$.
$|x - y^*| = |(t/|x|) x| = (t/|x|) |x| = t$.
$|y^*| = |(1 - t/|x|) x| = |1 - t/|x|| |x|$. since $|x| > t$, we know $1 - t/|x|$ is positive, so $|y^*| = (1 - t/|x|) |x| = |x| - t$.
plugging these in:
$$ u(x, t) = \frac{|x-y^*|^2}{2t} + |y^*| = \frac{t^2}{2t} + (|x| - t) = \frac{t}{2} + |x| - t = |x| - \frac{t}{2} $$
this matches the first case in formula (2).
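if you want an independent check that doesn't use the gradient at all, a brute-force grid search (1d sketch, with assumed values $x = 2$, $t = 0.5$) recovers both the minimizer and the value $|x| - t/2$:

```python
# brute-force check of the case |x| > t: minimize f on a fine 1d grid and
# compare with |x| - t/2.  x = 2, t = 0.5 are assumed illustrative values.
x, t = 2.0, 0.5
f = lambda y: (x - y) ** 2 / (2 * t) + abs(y)
ys = [-1.0 + 4.0 * i / 40000 for i in range(40001)]   # grid on [-1, 3]
y_best = min(ys, key=f)
assert abs(f(y_best) - (abs(x) - t / 2)) < 1e-6       # value |x| - t/2 = 1.75
assert abs(y_best - (1 - t / abs(x)) * x) < 1e-3      # minimizer near 1.5
```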
case 2: the minimum occurs at $y^* = 0$.
what if $|x| \le t$? our calculation above didn't yield a valid $y^* \neq 0$. maybe the minimum is at $y=0$?
we can't use the gradient being zero here, because $f(y)$ isn't differentiable at $y=0$.
however, for a convex function like $f(y)$, the minimum occurs at $y^*$ if and only if the zero vector is in the 'subdifferential' of $f$ at $y^*$. the subdifferential $\partial f(y^*)$ is like a generalization of the gradient for non-smooth points.
the subdifferential of $f(y) = g(y) + h(y)$ where $g(y)=\frac{|x-y|^2}{2t}$ and $h(y)=|y|$ is $\partial f(y) = \nabla g(y) + \partial h(y)$.
at $y=0$, $\nabla g(0) = \frac{0-x}{t} = -x/t$.
the subdifferential of $h(y)=|y|$ at $y=0$ is the set of all vectors $p$ with $|p| \le 1$, i.e. the closed unit ball. think of it as the set of all possible 'slopes' of supporting planes at the sharp point of the cone $|y|$. let's call this set $B_1(0)$.
so, $\partial f(0) = -x/t + B_1(0)$.
the minimum is at $y^*=0$ if $0 \in \partial f(0)$.
$$ 0 \in -\frac{x}{t} + \{ p \in \mathbb{R}^n : |p| \le 1 \} $$
this means there exists a $p$ with $|p| \le 1$ such that $0 = -x/t + p$.
this is the same as saying $p = x/t$ for some $p$ with $|p| \le 1$.
the condition is simply $|x/t| \le 1$, which means $|x| \le t$.
so, if $|x| \le t$, the minimum of $f(y)$ occurs at $y^* = 0$.
let's find the minimum value in this case:
$$ u(x, t) = f(0) = \frac{|x-0|^2}{2t} + |0| = \frac{|x|^2}{2t} $$
this matches the second case in formula (2).
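the same brute-force grid check works here too (1d sketch, with assumed values $x = 0.3$, $t = 0.5$, so $|x| \le t$): the grid minimum sits at the origin and equals $|x|^2/(2t)$.

```python
# brute-force check of the case |x| <= t: the grid minimum sits at the
# origin and equals |x|^2/(2t).  x = 0.3, t = 0.5 are assumed values.
x, t = 0.3, 0.5
f = lambda y: (x - y) ** 2 / (2 * t) + abs(y)
ys = [-1.0 + 2.0 * i / 20000 for i in range(20001)]   # grid on [-1, 1]
y_best = min(ys, key=f)
assert abs(y_best) < 1e-3                              # minimizer at y = 0
assert abs(f(y_best) - x ** 2 / (2 * t)) < 1e-6        # value |x|^2/(2t)
```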
so, to summarize, we found that:
- if $|x| > t$, the minimum is $u(x, t) = |x| - t/2$ (attained at $y^* = (1-t/|x|)x$).
- if $|x| \le t$, the minimum is $u(x, t) = |x|^2 / (2t)$ (attained at $y^* = 0$).
this is exactly the formula (2):
$$
u(x, t) = \begin{cases} |x| - t/2 & \text{if } |x| \ge t \\ \frac{|x|^2}{2t} & \text{if } |x| \le t \end{cases}
$$
(note that the formulas agree when $|x|=t$, both giving $t/2$).
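this agreement at the boundary is a one-liner to confirm numerically (the value $t = 0.7$ is an arbitrary illustrative choice):

```python
# the two branches of formula (2) agree on the boundary |x| = t,
# both giving t/2.  t = 0.7 is an arbitrary illustrative value.
t = 0.7
x = t                                   # boundary case |x| = t
branch1 = abs(x) - t / 2                # |x| - t/2
branch2 = x ** 2 / (2 * t)              # |x|^2/(2t)
assert abs(branch1 - t / 2) < 1e-12 and abs(branch2 - t / 2) < 1e-12
```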
so, the reason $|x|>t$ was needed for equation (3) is that it's the condition for the minimum to occur away from $y=0$, where you can use the regular gradient. when $|x| \le t$, the pull towards $y=x$ from the first term isn't strong enough to overcome the preference of the $|y|$ term for $y=0$, so the minimum stays at the origin.