How to determine the cutoff threshold when removing low-variance genes from RNA-Seq data?
2.4 years ago

Hi, I am using Python to analyze an RNA-Seq cancer gene expression profile with 20,000 genes (rows) and 800 samples (columns). The dataset has already been normalized, but I don't know what percentage of genes should be removed, or how low a gene's variance must be before it is removed. How do I determine this cutoff threshold? Thank you.

analysis RNA sequence gene • 1.2k views

This question is a bit hard to answer without knowing what you plan to do with the data. What questions will you be asking? When is a gene not interesting? Will any of your questions involve knowing that a particular gene does not vary across the data set? Are you looking to filter the data set from 20,000 genes down to some much smaller number? Have you plotted the distribution of variances to see if there is a natural cutoff for your purposes?
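
As a rough illustration of that last suggestion, here is a minimal Python sketch that computes the per-gene variance, prints a text histogram of it, and reports how many genes would survive a few percentile cutoffs. The matrix `expr` is simulated stand-in data (genes in rows, samples in columns), and the helper `genes_above_percentile` and the percentiles tried are hypothetical choices for illustration, not recommended values.

```python
import numpy as np

# Simulated stand-in for a normalized expression matrix (assumption):
# rows are genes, columns are samples.
rng = np.random.default_rng(0)
n_genes, n_samples = 2000, 80
gene_scale = rng.gamma(shape=2.0, scale=0.5, size=(n_genes, 1))
expr = rng.normal(loc=5.0, scale=gene_scale, size=(n_genes, n_samples))

# Sample variance of each gene across all samples.
gene_var = expr.var(axis=1, ddof=1)


def genes_above_percentile(variances, pct):
    """Boolean mask of genes whose variance exceeds the pct-th percentile."""
    return variances > np.percentile(variances, pct)


# Text histogram of log10(variance); look for a gap or a second mode.
counts, edges = np.histogram(np.log10(gene_var), bins=20)
for count, left, right in zip(counts, edges[:-1], edges[1:]):
    bar = "#" * int(50 * count // counts.max())
    print(f"{left:6.2f} to {right:6.2f} | {bar}")

# How many genes remain at a few candidate cutoffs.
for pct in (25, 50, 75):
    kept = int(genes_above_percentile(gene_var, pct).sum())
    print(f"drop lowest {pct}% by variance -> keep {kept} genes")
```

If the histogram shows no obvious gap, the cutoff is a judgment call that depends on the downstream analysis, which is why the questions above matter.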


How do I select differentially expressed genes by variance in a gene expression profile for clustering purposes?
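
For the clustering use case, one common heuristic is to keep a fixed number of the most variable genes. Note that this is an unsupervised filter and not a differential expression test. The sketch below uses a tiny made-up matrix; the function name `top_variable_genes` and the value of `n_top` are hypothetical.

```python
import numpy as np


def top_variable_genes(expr, n_top):
    """Return row indices of the n_top genes with the largest variance."""
    gene_var = np.var(expr, axis=1, ddof=1)
    order = np.argsort(gene_var)[::-1]  # indices sorted by descending variance
    return order[:n_top]


# Made-up example: 4 genes (rows) by 4 samples (columns).
expr = np.array([
    [5.0, 5.1, 4.9, 5.0],  # nearly flat
    [1.0, 9.0, 2.0, 8.0],  # highly variable
    [3.0, 3.0, 3.0, 3.0],  # constant
    [2.0, 4.0, 6.0, 8.0],  # moderately variable
])

selected = top_variable_genes(expr, n_top=2)
subset = expr[selected]  # reduced matrix to pass on to clustering
print(selected)  # [1 3]
```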


Perform a differential expression analysis. To get started, read the manual of a tool such as DESeq2: https://bioconductor.org/packages/release/bioc/html/DESeq2.html
