22

I know there is no clear answer to this question, but suppose I have a huge neural network with a lot of data, and I want to add a new feature to the input. The "best" way would be to retrain the network with the new feature and compare the results, but is there a method to test whether the feature is UNLIKELY to be helpful, such as correlation measures?

Zephyr
marcodena

2 Answers

20

A very strong correlation between the new feature and an existing feature is a fairly good sign that the new feature provides little new information. A low correlation between the new feature and existing features is likely preferable.

A strong linear correlation between the new feature and the predicted variable is a good sign that the new feature will be valuable, but the absence of a high correlation is not necessarily a sign of a poor feature, because neural networks are not restricted to linear combinations of variables.
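As a rough sketch of this kind of check (assuming the data sits in a pandas DataFrame, with hypothetical column names "new_feature" and "target" that are not from the question), you could look at plain Pearson correlations before committing to a full retraining run:

    import pandas as pd

    # Hypothetical setup: existing features plus a candidate column "new_feature"
    # and the prediction target "target", all in one DataFrame.
    df = pd.read_csv("data.csv")  # placeholder for your own data
    existing = [c for c in df.columns if c not in ("new_feature", "target")]

    # Correlation of the new feature with each existing feature:
    # values near +1 or -1 suggest it carries little new information.
    print(df[existing].corrwith(df["new_feature"]))

    # Linear correlation of the new feature with the target:
    # a high value is encouraging, but a low one does not rule the feature out.
    print(df["new_feature"].corr(df["target"]))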

If the new feature was manually constructed from a combination of existing features, consider leaving it out. The beauty of neural networks is that little feature engineering and preprocessing is required -- features are instead learned by intermediate layers. Whenever possible, prefer learning features to engineering them.

Madison May
1

If you are using scikit-learn, tree-based models expose a useful attribute called feature_importances_ after fitting.

Give it a try with your model/new feature and see if it helps. Also look here and here for examples.
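For example, here is a minimal sketch using a random forest on toy data (the dataset and parameters are placeholders, not from the question):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    # Toy data standing in for real features; pretend the last column is the
    # candidate new feature.
    X, y = make_classification(n_samples=1000, n_features=10, random_state=0)

    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X, y)

    # feature_importances_ is available on fitted tree-based estimators;
    # a very small value for the new feature suggests it adds little.
    for i, importance in enumerate(model.feature_importances_):
        print(f"feature {i}: {importance:.3f}")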

Ethan
Aniket