
Can the number of features used in a linear regression be regarded as a hyperparameter? Perhaps the choice of features?

Vykta Wakandigara

3 Answers


I like the way Wikipedia generally defines it:

In machine learning, a hyperparameter is a parameter whose value is set before the learning process begins. By contrast, the values of other parameters are derived via training.

On top of what Wikipedia says I would add:

A hyperparameter is a parameter that concerns the numerical optimization problem at hand. It will not appear in the machine learning model you build at the end; rather, it controls the process of fitting that model. For example, many machine learning algorithms use gradient descent, whose learning rate (which, as Wikipedia puts it, must be set before the learning process begins) controls how large a step the optimizer takes at each iteration.

Similarly, in linear regression fitted iteratively (e.g. by gradient descent), the learning rate is a hyperparameter. If it is a regularized regression like LASSO or Ridge, the regularization term is a hyperparameter as well.
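A minimal sketch of this distinction (all names illustrative): linear regression fitted by gradient descent, where the learning rate is a hyperparameter chosen before training, while the weights are the parameters learned from the data.

```python
import numpy as np

# The learning rate `lr` is a hyperparameter: we set it before training
# and it never appears in the final model. The weights `w` are the
# parameters: they are learned from the data and ARE the final model.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([3.0, -1.5]) + 0.5            # true weights plus intercept

Xb = np.hstack([X, np.ones((len(X), 1))])       # append an intercept column
w = np.zeros(3)                                 # parameters, learned below
lr = 0.1                                        # hyperparameter, chosen by us
for _ in range(500):
    grad = 2 * Xb.T @ (Xb @ w - y) / len(y)     # gradient of mean squared error
    w -= lr * grad

print(w)                                        # close to [3.0, -1.5, 0.5]
```

A different `lr` changes how the optimization proceeds (too large and it diverges, too small and it crawls), but a successful run always recovers the same parameters.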

Number of features: I would not regard the number of features as a hyperparameter. Ask yourself: is it a parameter you can simply set during the model optimization? How would you fix the number of features beforehand? To me, the number of features belongs to feature selection, i.e. feature engineering, which happens before you run your optimization. Think of image preprocessing before building a deep neural network: whatever preprocessing is done is never considered a hyperparameter; it is a feature-engineering step performed before feeding the data to your model.
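A hedged sketch of that point (the selection rule here is just one illustrative choice): picking which features to use happens before the optimization runs, as a feature-engineering step, not as a knob of the fitting procedure itself.

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 5))
# Only features 0 and 3 actually drive the target.
y = 2.0 * X[:, 0] - 3.0 * X[:, 3] + rng.normal(scale=0.1, size=300)

# Feature-selection step, done BEFORE any model fitting:
# keep the two features most correlated with y.
corr = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(5)])
keep = np.argsort(corr)[-2:]

# Only now is the model fitted, on the already-chosen features.
coef, *_ = np.linalg.lstsq(X[:, keep], y, rcond=None)
print(sorted(int(j) for j in keep))   # the selected feature indices
```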

TwinPenguins

Hyperparameters are, by definition, input parameters that an algorithm requires to be set before it can learn from data.

For standard linear regression, i.e. OLS, there are none. The number/choice of features is not a hyperparameter, but it can be viewed as part of a post-processing or iterative tuning process.

On the other hand, LASSO handles the number/choice of features in the formulation of its loss function itself, so its only hyperparameter is the shrinkage factor, i.e. lambda.
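A hedged sketch of that point, assuming scikit-learn is available: `Lasso`'s `alpha` (the lambda above) is its hyperparameter, and a large enough value drives the coefficients of irrelevant features exactly to zero, so the choice of features is handled inside the loss function.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
# Only feature 0 actually drives the target.
y = 4.0 * X[:, 0] + rng.normal(scale=0.1, size=200)

model = Lasso(alpha=0.5)     # alpha is the shrinkage hyperparameter (lambda)
model.fit(X, y)
print(model.coef_)           # irrelevant coefficients driven exactly to zero
```

With OLS instead, all five coefficients would typically be small but nonzero; there is no comparable knob to turn.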

Mankind_2000

In linear regression, the coefficients learned for the features in your data set are called parameters. Hyperparameters do not come from your data set; they are settings used to tune the model itself, for example the depth of splits in classification tree models.

For basic straight-line linear regression, there are no hyperparameters.
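A small sketch of this: fitting a straight line y = a*x + b. The slope and intercept are the parameters, learned entirely from the data; with the model form fixed, there is nothing left that must be chosen before fitting.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0                           # data on the line y = 2x + 1
slope, intercept = np.polyfit(x, y, deg=1)  # deg=1 means a straight line
print(slope, intercept)                     # recovers the line's parameters
```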

Stephen Rauch
Anthony