Note: I have also asked this on the Statistics Stack Exchange since I did not get an answer here.
I am trying to understand a paper about regularization in non-parametric regression, and I am struggling to understand the RKHS involved there (for reference: the $v ||f||_{\mathcal{H}}$ term in Eq. 5 of Nonparametric Sparsity and Regularization). I know non-parametric regression is more of a statistics topic, but I think RKHS fits on both sites.
My understanding of RKHS is that there are two perspectives on it (feel free to correct me!):
1.) When given a Hilbert space of functions on $\Omega$, denoted by $\mathcal{H}(\Omega)$, I can investigate whether there is a kernel $K: \Omega \times \Omega \rightarrow \mathbb{R}$ satisfying the reproducing property, i.e. $\langle K_x, f \rangle = f(x)$ for all $f \in \mathcal{H}$ and $x \in \Omega$. If I can find such a kernel, I can say that my Hilbert space has a reproducing kernel and call it an RKHS. The example I found in two places is the space of square-summable sequences with inner product $\langle (a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}} \rangle = \sum_{i = 1}^{\infty} a_i b_i$. Then $K_p(n) = 1_{n = p}$ leads to $\langle (a_n)_{n \in \mathbb{N}}, K_p \rangle = a_p$, so the space has a reproducing kernel (compare here Stack: Calculating norm in RKHS and here Wikipedia: RKHS Examples); a small numerical sketch of this example follows after the list.
2.) Again I am given a Hilbert space which is not necessarily an RKHS, e.g. $L^2(\Omega)$. Then I choose a kernel a priori, e.g. the Gaussian kernel, and filter out all the functions in my Hilbert space for which the reproducing property cannot be fulfilled. What I am left with are functions which are (limits of) linear combinations of my kernel, i.e. for all $f \in \mathcal{H}_K$ I can write $f(x) = \sum_{i = 1}^{\infty} \alpha_i K_{x_i}(x)$ (Stack: Understanding RKHS spaces); a sketch of such a function follows below as well. This is the point where my questions come in...
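To make 1.) concrete, here is a minimal numerical sketch of the sequence example (truncating the sequence space to its first $N$ entries; the values are made up):

```python
import numpy as np

# Truncate the sequence space to its first N entries and use the
# Euclidean inner product as a stand-in for the l2 inner product.
N = 10
rng = np.random.default_rng(0)
a = rng.normal(size=N)  # stand-in for a sequence (a_n)

p = 3
K_p = np.zeros(N)
K_p[p] = 1.0  # K_p(n) = 1 if n == p, else 0

# Reproducing property: <a, K_p> equals the evaluation a_p.
print(np.dot(a, K_p), a[p])  # the two numbers agree
```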
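And for 2.), a sketch of what I mean by a function built as a finite linear combination of kernel sections (the Gaussian kernel, centers and coefficients are just placeholders):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    # K(x, y) = exp(-(x - y)^2 / (2 sigma^2))
    return np.exp(-((x - y) ** 2) / (2 * sigma ** 2))

# Placeholder centers x_i and coefficients alpha_i defining
# f(x) = sum_i alpha_i * K_{x_i}(x).
centers = np.array([-1.0, 0.0, 2.0])
alpha = np.array([0.5, -1.0, 0.3])

def f(x):
    return np.sum(alpha * gaussian_kernel(centers, x))

print(f(0.5))
```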
Here are my questions:
a) Coming back to the expression $v ||f||_{\mathcal{H}}$, which I first interpreted as an 'RKHS norm': is my understanding correct that the norm here is the same one I start with when I just have a Hilbert space (before restricting it to only contain linear combinations of the kernel)? The only difference is that the functions I can plug into the norm now have a specific form, i.e. linear combinations of the kernel. So there is no special 'RKHS norm'; it is just the norm of the given Hilbert space, applied to a restricted set of functions.
b) Now regarding regularization and so-called RKHS regression: is my understanding correct that a term like $v ||f||_{\mathcal{H}}$ can be chosen as a penalty in non-parametric regression because, when you estimate a function in RKHS regression, you write it as a linear combination of the kernel with parameter vector $\alpha$, and then $||f||_{\mathcal{H}} = \sqrt{\langle f, f \rangle}$ is proportional to a norm of the parameter vector $\alpha$? The vector norm would then be 'induced' by the norm I choose for my Hilbert space, e.g. if I choose the $L^2$ norm on my Hilbert space, $||f||_{\mathcal{H}}$ will be proportional to $||\alpha||_2$.
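For concreteness, this is how I would compute $||f||_{\mathcal{H}}$ for a finite linear combination, using $\langle K_{x_i}, K_{x_j} \rangle = K(x_i, x_j)$ from the reproducing property (kernel, centers and coefficients are again placeholders):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    return np.exp(-((x - y) ** 2) / (2 * sigma ** 2))

centers = np.array([-1.0, 0.0, 2.0])  # placeholder x_i
alpha = np.array([0.5, -1.0, 0.3])    # placeholder coefficients

# Gram matrix G_ij = K(x_i, x_j). Since <K_{x_i}, K_{x_j}> = K(x_i, x_j),
# f = sum_i alpha_i K_{x_i} has squared norm ||f||_H^2 = alpha^T G alpha.
G = gaussian_kernel(centers[:, None], centers[None, :])
rkhs_norm_sq = alpha @ G @ alpha

print(rkhs_norm_sq, np.dot(alpha, alpha))  # compare with ||alpha||_2^2
```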
Are these statements correct?