Question: How can I select highly variable genes from an RNA-seq dataset?
29 days ago, naseerkhan861 wrote:

How can I select only the genes whose variance exceeds a certain threshold? I know of techniques such as Z and t statistics, followed by calculating p-values and correcting for FWER or FDR. Is there a modern, easy-to-use technique that is available in R, with solid references, that I can apply to my data? To be specific, I have an RNA-seq FPKM dataset.

Regards

rna-seq significant • 138 views
modified 29 days ago by Michael Dondrup • written 29 days ago by naseerkhan861

What is wrong with the typical Z and t statistics with FDR correction?

written 29 days ago by hafiz.talhamalik210

See the great, well-explained answer below.

written 29 days ago by naseerkhan861

29 days ago, Michael Dondrup (Bergen, Norway) wrote:

There is no need to cite simple statistics such as variance or Z statistics. However, you might consider the median absolute deviation (MAD) as a robust alternative to variance; as far as I know, it is commonly used as a filtering step in network analysis. If you do the filtering in R or other software, you can add a sentence like "all statistical analyses were done in R (R Core Team (2018))". You might also use more advanced differential expression methods such as limma or DESeq2. If you use a particular package, cite that package.
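To illustrate why MAD is called robust, here is a minimal sketch in base R. The matrix `expr` and its values are made up for illustration (genes in rows, samples in columns). Note that R's `mad()` multiplies by 1.4826 by default so that it is comparable to the standard deviation for normal data; `constant = 1` returns the raw median absolute deviation.

```r
# Hypothetical toy matrix: 3 genes (rows) x 5 samples (columns); values are made up.
expr <- rbind(
  geneA = c(5, 5, 5, 5, 5),    # constant across samples
  geneB = c(1, 3, 5, 7, 9),    # genuinely spread across samples
  geneC = c(5, 5, 5, 5, 50)    # constant except for one outlier sample
)

# Per-gene variance: strongly inflated by the single outlier in geneC.
gene_var <- apply(expr, 1, var)

# Per-gene raw MAD (constant = 1 disables the default 1.4826 scaling).
gene_mad <- apply(expr, 1, mad, constant = 1)

print(cbind(variance = gene_var, MAD = gene_mad))
```

In this toy example geneC has by far the largest variance, yet its MAD is 0 because only one sample deviates from the median. Whether that behaviour is desirable depends on the downstream analysis.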

In general, there is no single best or authoritative way of filtering prior to downstream analyses.

In particular, I do not know what your intended downstream analysis is, so there is not much more I can recommend at this stage.

R Core Team (2018). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

Caveat: do not use FPKM, use CPM or TPM instead (this has been discussed here many times).
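If only FPKM values are available, one commonly used rescaling is to divide each sample's FPKM values by that sample's FPKM total and multiply by one million, which gives TPM. The sketch below assumes a hypothetical matrix `fpkm` (genes in rows, samples in columns, all genes of each sample present); the helper name `fpkm_to_tpm` and the values are made up for illustration.

```r
# Hypothetical FPKM matrix: 3 genes (rows) x 2 samples (columns); values are made up.
fpkm <- cbind(
  sample1 = c(geneA = 10, geneB = 30, geneC = 60),
  sample2 = c(geneA = 5,  geneB = 5,  geneC = 40)
)

# Rescale each column so that it sums to one million (TPM).
# fpkm_to_tpm is an illustrative helper, not a function from any package.
fpkm_to_tpm <- function(fpkm) {
  sweep(fpkm, 2, colSums(fpkm), "/") * 1e6
}

tpm <- fpkm_to_tpm(fpkm)
print(colSums(tpm))  # every sample now sums to 1e6
```

Because every sample sums to the same total after this step, values are easier to compare across samples than raw FPKM.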

modified 29 days ago • written 29 days ago by Michael Dondrup

My objective is to do co-expression analysis after filtering out genes with low variance. Is it clearer now what I want to do?

written 29 days ago by naseerkhan861

For co-expression analysis, you might use MAD with a certain threshold (e.g. MAD > 2) but remember that such thresholds are necessarily arbitrary.
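A minimal sketch of such a threshold filter in base R follows. It assumes a hypothetical matrix `log_expr` of log-scale expression values (genes in rows, samples in columns); the values, the variable names, and the choice of the raw MAD (`constant = 1`) are assumptions for illustration, and the cutoff of 2 is as arbitrary here as in the text above.

```r
# Hypothetical log-scale expression matrix: 4 genes x 6 samples; values are made up.
log_expr <- rbind(
  geneA = c(2.0, 2.1, 1.9, 2.0, 2.2, 2.0),
  geneB = c(0.5, 4.0, 8.0, 1.0, 6.5, 3.0),
  geneC = c(7.0, 7.2, 6.9, 7.1, 7.0, 7.3),
  geneD = c(1.0, 9.0, 2.0, 8.0, 3.0, 7.0)
)

mad_threshold <- 2  # arbitrary cutoff; tune for your own data

# Raw MAD per gene (constant = 1 disables R's default 1.4826 scaling).
gene_mad <- apply(log_expr, 1, mad, constant = 1)

# Keep only genes whose MAD exceeds the threshold.
keep <- gene_mad > mad_threshold
filtered <- log_expr[keep, , drop = FALSE]

print(rownames(filtered))
```

Since the MAD depends on the scale of the data and on the `constant` argument, a fixed cutoff is not transferable between datasets; keeping a fixed number or fraction of the top-ranked genes is a possible alternative.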

written 29 days ago by Michael Dondrup