I am new to machine learning and just learned about feature selection. My project's dataset has 24 features and is imbalanced: 89% of the samples belong to the majority class and 11% to the minority class. I used Recursive Feature Elimination with Cross-Validation (RFECV in the scikit-learn package) to find the optimal number of features, with a Random Forest classifier as the estimator. Because the dataset is imbalanced, I set the 'scoring' parameter to 'f1'. After fitting, RFECV selected around 12 features with an F1 score of 0.94.
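For reference, the setup described above could be sketched roughly as follows. This is not my actual code or data: the synthetic dataset from make_classification, the sample size, the number of trees, the stratified 3-fold split, and the random seeds are all assumptions chosen only to make the example self-contained.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFECV
from sklearn.model_selection import StratifiedKFold

# Assumption: synthetic stand-in for the real data
# (24 features, roughly 89% majority / 11% minority).
X, y = make_classification(
    n_samples=500,
    n_features=24,
    n_informative=8,
    weights=[0.89, 0.11],
    random_state=0,
)

# Random Forest as the estimator whose feature importances drive the elimination.
estimator = RandomForestClassifier(n_estimators=30, random_state=0)

# Assumption: stratified folds, so each fold keeps about the same class ratio.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

# Drop one feature per step and score each feature subset with F1
# (computed on the positive class, label 1, which is the minority here).
selector = RFECV(estimator, step=1, cv=cv, scoring="f1", min_features_to_select=1)
selector.fit(X, y)

print("Number of selected features:", selector.n_features_)
print("Selected feature mask:", selector.support_)
```

Here selector.n_features_ is the number of features RFECV kept, and selector.support_ is a boolean mask over the 24 columns.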

Is using RFECV appropriate for imbalanced datasets?

Ethan
laguna

0 Answers