I have an RNA-seq dataset of 40K genes across about 800 individuals. For many genes (around 30% of the total), the expression value is 0 in almost every individual; i.e., the expression level of gene X is 0 in 90% or 95% of the individuals. How should I deal with these genes? Should I simply remove them from my analysis? I want to use this data to build a predictor (classifier): each individual has a class label (i.e., Control vs. Target), and I use the gene expression values to train my model.
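To make the question concrete, here is a minimal sketch of the kind of removal I am asking about. It assumes the expression values sit in a NumPy array with individuals as rows and genes as columns. The function name `filter_mostly_zero_genes`, the parameter `max_zero_fraction`, and the toy matrix are hypothetical and only for illustration; the 0.9 cutoff is taken from the 90% figure above, not from any established recommendation.

```python
import numpy as np


def filter_mostly_zero_genes(expr, max_zero_fraction=0.9):
    """Drop genes whose expression is 0 in too many individuals.

    expr: array of shape (n_individuals, n_genes).
    Returns the filtered matrix and a boolean mask of the genes kept.
    """
    expr = np.asarray(expr)
    # Fraction of individuals with zero expression, computed per gene (column).
    zero_fraction = (expr == 0).mean(axis=0)
    # Keep a gene only if its zero fraction is strictly below the cutoff.
    keep = zero_fraction < max_zero_fraction
    return expr[:, keep], keep


# Toy data (hypothetical): 10 individuals, 3 genes.
expr = np.zeros((10, 3))
expr[:, 0] = np.arange(1, 11)   # gene 0: non-zero in every individual
# gene 1 stays 0 in all 10 individuals
expr[:5, 2] = 7.0               # gene 2: zero in 5 of 10 individuals

filtered, keep = filter_mostly_zero_genes(expr, max_zero_fraction=0.9)
print(keep)            # which genes survive the filter
print(filtered.shape)  # (n_individuals, n_genes_kept)
```

This filter looks only at the expression values and never at the class labels. Whether such a cutoff is appropriate for building a classifier is exactly what I am unsure about.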