Here's one perspective, involving the notion of distance on function spaces.
Think of different norms on function spaces as giving us different ways of measuring approximations. The basis for this intuition is that norms measure distance between functions and so they let us know in some sense how good approximations are. Existence theorems whose proofs are ultimately based on considering sequences (in particular, the Riesz representation theorem and its derivatives like Lax-Milgram[1]) are sensitive to the method we use to determine "closeness" of approximations.
Often in searching for weak or strong solutions to PDEs we want to use information about the PDE (such as its order). We incorporate this information by choosing, as the setting of our study, a function space whose norm is adapted to the PDE. Once we have an appropriate setting, we can effectively use existence theorems like Riesz or Lax-Milgram to study existence and properties of solutions.
To make this concrete, let's consider a positive, symmetric unbounded operator $T$ defined on a dense domain $D$ in a Hilbert space $H$. The basic example is the Laplacian $T = \Delta = -\sum\partial_i^2$ defined on compactly supported smooth functions $D = C^\infty_0(\Omega)$ in a two-dimensional bounded domain $\Omega\subset\Bbb{R}^2$, and $H = L^2(\Omega)$.
The equation $\Delta u = f$ is really a question about linear algebra in $H$: "Given $f\in H$, can we find $u\in D$ such that $Tu = f$?" If $T$ were bounded, we'd be able to use the Riesz representation theorem to conclude that a weak solution is indeed an actual solution. However, $T$ is not bounded; that is, the metric on $H$ is not suitable for finding a solution to the equation.
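One way to see the unboundedness concretely is a scaling argument. This is only a sketch: the bump function $\phi$, the point $x_0$, the radius $r$ and the rescaled functions $\phi_n$ are illustrative choices, not part of the setup above. Fix a nonzero $\phi\in C^\infty_0(\Omega)$ supported in a ball $B(x_0,r)\subset\Omega$ and concentrate it near $x_0$:

$$
\begin{aligned}
\phi_n(x) &:= n\,\phi\bigl(x_0 + n(x - x_0)\bigr), \qquad \operatorname{supp}\phi_n \subset B(x_0, r/n)\subset\Omega,\\
\|\phi_n\|_{L^2}^2 &= \int n^2\,\bigl|\phi\bigl(x_0+n(x-x_0)\bigr)\bigr|^2\,dx = \|\phi\|_{L^2}^2 \quad (\text{substitute } y = x_0+n(x-x_0),\ dx = n^{-2}\,dy),\\
\|\Delta\phi_n\|_{L^2} &= n^2\,\|\Delta\phi\|_{L^2}.
\end{aligned}
$$

Since $(\Delta\phi,\phi) = \int_\Omega|\nabla\phi|^2\,dx > 0$, we have $\Delta\phi\neq 0$, so $\|T\phi_n\|_{L^2}/\|\phi_n\|_{L^2}\to\infty$: no bound of the form $\|Tu\|_{L^2}\le C\|u\|_{L^2}$ can hold on $D$.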
We can define a new inner product on $D$ by setting $(u,v)_T = (u,v) + (Tu,v)$, with associated norm $\|u\|_T^2 = (u,u) + (Tu,u)$. Since $T$ is symmetric and positive, this is indeed an inner product. Take the completion of $D$ with respect to this norm; call it $V$. The inclusion map $D\hookrightarrow H$ extends to a map $V\to H$, and this extension is in fact a compact embedding. (Compactness is not evident; it follows from delicate PDE estimates and is called the Rellich-Kondrachov theorem. The space $V$ is in this case the Sobolev space $H^1_0(\Omega)$.)
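To see why $V$ comes out as $H^1_0(\Omega)$ in the Laplacian example, one can integrate by parts. The computation below is a sketch assuming real-valued $u, v\in C^\infty_0(\Omega)$, so that all boundary terms vanish:

$$
\begin{aligned}
(Tu,v) &= -\sum_i\int_\Omega (\partial_i^2 u)\,v\,dx = \sum_i\int_\Omega \partial_i u\,\partial_i v\,dx = \int_\Omega \nabla u\cdot\nabla v\,dx,\\
\|u\|_T^2 &= (u,u) + (Tu,u) = \int_\Omega |u|^2\,dx + \int_\Omega |\nabla u|^2\,dx = \|u\|_{H^1(\Omega)}^2.
\end{aligned}
$$

So on $D$ the norm $\|\cdot\|_T$ is exactly the $H^1$ norm, and completing $C^\infty_0(\Omega)$ in the $H^1$ norm is the usual definition of $H^1_0(\Omega)$.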
In this new norm, where the measurement of distance includes both the $L^2$ norm and the action of $T$, the Riesz representation theorem guarantees a bounded weak solution operator $S\colon V^*\to V$ for the equation $(Tu,\cdot) = F$, where $F$ is a bounded functional on $V$. (For the Laplacian on a bounded domain, the Poincaré inequality shows that the extension of $(Tu,v)$ to $V$ is by itself an inner product with a norm equivalent to $\|\cdot\|_T$, which is what lets Riesz apply to this equation directly.) It turns out that the map $H\to V^*$ taking $f$ to $(f,\cdot)$, where $\cdot$ is considered as an element of $H$, is a compact operator. Composing it with $S$ gives a compact solution operator $H\to V$, which produces solutions to the weak equation.
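The Riesz step can be spelled out for the Laplacian as follows. This is a sketch under stated assumptions: functions are real-valued, and the names $a$, $F_f$ and $C_P$ (a constant in the Poincaré inequality $\|v\|_{L^2}\le C_P\|\nabla v\|_{L^2}$ on $V = H^1_0(\Omega)$) are introduced here only for illustration.

$$
\begin{aligned}
a(u,v) &:= \int_\Omega \nabla u\cdot\nabla v\,dx, \qquad F_f(v) := (f,v) = \int_\Omega f\,v\,dx,\\
|F_f(v)| &\le \|f\|_{L^2}\,\|v\|_{L^2} \le C_P\,\|f\|_{L^2}\,a(v,v)^{1/2},\\
a(u,v) &= F_f(v)\ \text{ for all } v\in V \quad\Longrightarrow\quad a(u,u)^{1/2} \le C_P\,\|f\|_{L^2}.
\end{aligned}
$$

The second line says that $F_f$ is a bounded functional on $V$, so Riesz (applied with $a$ as the inner product) produces a unique $u\in V$ satisfying the weak equation in the third line. The final estimate, obtained by taking $v = u$, is the boundedness of the solution operator $f\mapsto u$.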
(Note that I have glossed over many details and possible generalizations, which would comprise the majority of a second graduate class on PDEs. These details can be found in Ch. 5-6 of Partial Differential Equations by Evans, or Ch. 7-8 of Elliptic Partial Differential Equations of Second Order by Gilbarg and Trudinger. Two of the big questions beyond the scope of this answer: What is the proof of Rellich-Kondrachov? When is a weak solution in fact a strong solution? The latter question is known as "elliptic regularity.")
[1] The proof of Riesz (by which I mean $H\leftrightarrow H^*$) relies on choice of a vector in one factor of an orthogonal decomposition of $H$; think about how one would compute this.