Preprocessing of RNA-Seq data for CMSclassifier



6.8 years ago

mas ▴ 10

I am interested in comparing the Consensus Molecular Subtypes (CMS) labels produced by the random forest and single-sample predictor methods in the CMSclassifier package. I have RNA-Seq data as raw counts output by HTSeq. CMSclassifier::classifyCMS.RF requires as input "log2_scaled Gene Expression Profiles (GEP) data values". Is it sufficient to log2-transform the raw HTSeq counts, or would it be more appropriate to also quantile-normalize the log2 values, as in the CMScaller package? Or is there a more suitable normalization that you would recommend in this setting?

Thanks!

RNA-Seq R normalization • 2.6k views

ADD COMMENT • link updated 6.7 years ago by Biostar 20 • written 6.8 years ago by mas ▴ 10



I would go for either the rlog or the vst transformation from DESeq2. The DESeq2 manual recommends them for downstream analyses such as classification, clustering and machine learning applications.
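As a rough illustration of the vst route, here is a minimal sketch. It assumes the HTSeq output has already been assembled into a genes-by-samples integer matrix; the simulated `counts`, the gene and sample names, and the `col_data` table are hypothetical placeholders, not anything from the original data. The intercept-only design is used only because no experimental design is needed for a blind transformation.

```r
# Sketch only: simulated counts stand in for a real HTSeq gene-by-sample matrix.
library(DESeq2)

set.seed(1)
n_genes <- 2000
n_samples <- 8

# Hypothetical raw count matrix: genes in rows, samples in columns.
gene_means <- runif(n_genes, min = 20, max = 2000)
counts <- matrix(
  rnbinom(n_genes * n_samples, mu = rep(gene_means, times = n_samples), size = 5),
  nrow = n_genes,
  dimnames = list(paste0("gene", seq_len(n_genes)),
                  paste0("sample", seq_len(n_samples)))
)

# Minimal sample table (assumed); its row names must match the count columns.
col_data <- data.frame(sample = colnames(counts), row.names = colnames(counts))

# Intercept-only design, since the transformation is run blind to any grouping.
dds <- DESeqDataSetFromMatrix(countData = counts,
                              colData = col_data,
                              design = ~ 1)

# Variance-stabilizing transformation; values end up on a log2-like scale.
vsd <- vst(dds, blind = TRUE)
expr <- assay(vsd)  # numeric matrix, genes x samples
```

Replacing `vst(dds, blind = TRUE)` with `rlog(dds, blind = TRUE)` gives the rlog variant, which tends to be slower as the number of samples grows. The matrix `expr` is the candidate input for the classifier; its row names would still need to match whatever gene identifiers the classifier expects.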

ADD REPLY • link 6.7 years ago by ATpoint 88k
