I have run a series of simulations using Recursive Feature Elimination (RFE) with random forests (RF) and obtained some puzzling results. When I run RFE with cross-validation (CV), its highest F1 score is sometimes lower than the F1 score I obtain by running RFE first and then CV with the same number of folds.
For instance, RFE with CV reports that the optimal number of features is 90, with an F1 score of 75%, whereas if I independently run RFE to select only the top 20 features and then evaluate with the same CV, the F1 score is 87%. Why would that happen?
Thank you in advance.