Given a two-dimensional data set where each point is labeled $\{0,1\}$, I want to implement a sparse classifier with an $L_p$ penalty ($0 < p \leq 1$).
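To make sure we mean the same thing: by the $L_p$ penalty I mean $\|\theta\|_p^p = \sum_j |\theta_j|^p$, which for $0 < p \leq 1$ encourages many coefficients to be exactly zero. A minimal sketch (the function name is my own):

```python
import numpy as np

def lp_penalty(theta, p):
    """Compute ||theta||_p^p = sum_j |theta_j|^p.

    For 0 < p <= 1 this penalty is non-smooth at zero,
    which is what drives coefficients to exact zeros (sparsity).
    """
    return np.sum(np.abs(theta) ** p)
```

For example, `lp_penalty(np.array([3.0, 4.0]), 1)` is the $L_1$ norm, $7$, while smaller $p$ penalizes small-but-nonzero coefficients relatively more.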
I have been reading about logistic regression and regularization. Here is a concrete example of what I have been working on: let $\left((x^{(i)},y^{(i)})\right)_{i\in \{1,\dots, m\}}$ be my data set with $y^{(i)}\in \{0,1\}$ and $x^{(i)}\in \mathbb{R}^2$, and the cost function I minimized is
$ J(\theta) = - \frac{1}{m} \sum_{i=1}^m \left[ y^{(i)}\ \log (h_\theta (x^{(i)})) + (1 - y^{(i)})\ \log (1 - h_\theta(x^{(i)}))\right] + \frac{\lambda}{2m}\sum_{j=1}^n \theta_j^2$,
where $h_\theta(x) = \frac{1}{1+e^{-\theta^{T}x}}$. I thought this would be a good introduction to sparse methods. Currently I use a neural network, and I was wondering if I am heading in the right direction in understanding sparse methods.
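To check my understanding, here is a minimal NumPy sketch of the cost function above (names and the convention of not penalizing the intercept $\theta_0$ are my own assumptions):

```python
import numpy as np

def sigmoid(z):
    # h_theta(x) = 1 / (1 + exp(-theta^T x))
    return 1.0 / (1.0 + np.exp(-z))

def cost(theta, X, y, lam):
    """Regularized logistic-regression cost J(theta).

    X : (m, n+1) design matrix, first column assumed to be all ones
    y : (m,) labels in {0, 1}
    lam : regularization strength lambda
    """
    m = len(y)
    h = sigmoid(X @ theta)
    # negative log-likelihood term, averaged over the m examples
    nll = -(y @ np.log(h) + (1 - y) @ np.log(1 - h)) / m
    # L2 penalty; theta[0] (the intercept) is conventionally excluded
    reg = lam / (2 * m) * np.sum(theta[1:] ** 2)
    return nll + reg
```

With $\theta = 0$ every prediction is $h_\theta(x) = 0.5$, so the cost reduces to $\log 2 \approx 0.693$ regardless of the labels, which is a quick sanity check.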
That leaves me with the question:
What is the definition of sparse classifiers? What would be an example?