I'm training classifiers on RNA-seq samples to predict sample group, and I'm using the feature selection techniques built into scikit-learn to rank genes: specifically, the F-test and Gini importance (from random forest classifiers). The input is the variance-stabilized transformed counts output by DESeq2. However, I'm seeing very little agreement between the top-ranked genes from feature selection and the differentially expressed genes (also calculated using DESeq2). I understand that the statistical methods used for differential expression differ from the F-test and Gini importance, but could anyone explain in more depth why the results diverge, or point me to references?
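
For concreteness, here is a minimal sketch of the kind of comparison described above. It is not my actual pipeline: the data are synthetic stand-ins for a VST matrix, the F-test is assumed to be scikit-learn's `f_classif`, the forest settings are arbitrary, and the helpers `top_k` and `jaccard` are hypothetical names introduced only for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import f_classif


def top_k(scores, k):
    """Return the indices of the k highest-scoring features as a set."""
    return set(np.argsort(scores)[::-1][:k].tolist())


def jaccard(a, b):
    """Jaccard index of two sets (0 = disjoint, 1 = identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


# Synthetic stand-in for a VST matrix: 40 samples x 200 genes, two groups.
rng = np.random.default_rng(0)
n_samples, n_genes, n_informative = 40, 200, 10
y = np.repeat([0, 1], n_samples // 2)
X = rng.normal(size=(n_samples, n_genes))
X[y == 1, :n_informative] += 2.0  # shift the first 10 genes in group 1

# Univariate ranking: one F-statistic per gene, each gene tested on its own.
f_scores, p_values = f_classif(X, y)

# Multivariate ranking: impurity-based (Gini) importance from a random forest.
forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
gini_scores = forest.feature_importances_

# Compare the two top-k gene lists.
k = 10
f_top = top_k(f_scores, k)
gini_top = top_k(gini_scores, k)
overlap = jaccard(f_top, gini_top)
print(f"Top-{k} overlap (Jaccard): {overlap:.2f}")
```

The same `top_k` and `jaccard` helpers could be applied to a list of DESeq2 hits (for example, genes ranked by adjusted p-value) to quantify how far each ranking diverges from the differential expression results.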