RFECV (Recursive Feature Elimination with Cross-Validation) grid score discrepancies
7 months ago
ivnnvi • 0

I would like to know why the grid scores reported by RFECV (Recursive Feature Elimination with Cross-Validation) for a given number of features n do not match the scores I get when I run RFE for that same n and then cross-validate a model with the same number of folds.

For instance, the RFECV grid scores tell me that with the top 1 feature I get an F1 score of 0.60. However, when I 1) run RFE to select only 1 feature (which should be the same feature as in RFECV), 2) train the same model that was passed to RFECV using only that feature, and 3) use CV to compute the F1 score, the result is not the same as the RFECV grid score. The only time the two match is for the number of features that RFECV itself selects as optimal.

Could it be that RFECV is not equivalent to running RFE for n features, then training the model on those top n features with CV, and then computing the F1 score?

This is how I thought RFECV works, and how I tried to check it:

1) Perform RFE without CV and set the number of features to select to, for example, 1, 10, or 20.
2) Train a Random Forest on these top 1/10/20 selected features, using CV and the F1 score.
3) Compare this F1 score with the one reported by RFECV.
4) If I get the same numbers, I know how RFECV works.

However, the F1 scores do not match. I also made sure that the cross-validation splits, random states, seeds, number of folds, and performance metric are consistent between RFECV and RFE + RF + CV.

RFECV RFE CV • 269 views