Question: removing outliers from RNA-seq data
Asked 3.9 years ago by jfertaj (United Kingdom):

Hi all,

I have a data.frame from an RNA-seq experiment, and I would like to remove some outlier samples. The data set is large: 350 samples and 32291 genes. The values are log2 RPKM (I applied the log2 transformation because I am planning a WGCNA analysis and the authors recommend it).

I am using the `PcaHubert` function from the rrcov package to find outliers. Here is the code:

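    library(rrcov)  # provides PcaHubert

    df <- read.table("/path/to/file/rpkm.txt")
    dim(df)  # 32291   352
    df <- df[, -c(1, 2)]  # drop the first 2 columns (accessory data)
    pcaHub <- PcaHubert(t(df))  # robust PCA with samples as rows
    outliers <- which(!pcaHub@flag)  # indices of samples whose flag is FALSE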

The outliers would be the samples whose flag is `FALSE` after the robust PCA. Do you think it is appropriate to remove outliers using this method?

Any comments would be greatly appreciated.


Tags: wgcna, rna-seq, outliers, R
Answer, 3.7 years ago, by Deepak Tanwar (ETH Zürich, Switzerland):

If you are going to use the WGCNA package for network analysis, then you will have the option to remove outlier samples as part of that workflow. Follow the WGCNA tutorials.
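
As a rough illustration of that kind of sample-level check, here is a minimal sketch that clusters samples and cuts the tree at a chosen height. It uses only base R (`hclust`, `cutree`) rather than WGCNA's own helpers so that it runs without the package. The data are simulated, and the names `datExpr`, `cutHeight` and the value 40 are assumptions made for this example, not recommendations.

```r
# Simulated stand-in for log2 expression: 20 samples (rows) x 200 genes (columns).
# All names and values here are hypothetical.
set.seed(1)
datExpr <- matrix(rnorm(20 * 200), nrow = 20,
                  dimnames = list(paste0("S", 1:20), paste0("g", 1:200)))
datExpr["S20", ] <- datExpr["S20", ] + 5   # make one sample an obvious outlier

# Average-linkage clustering of samples on Euclidean distance
sampleTree <- hclust(dist(datExpr), method = "average")
plot(sampleTree, main = "Sample clustering to detect outliers")

# The cut height is picked by eye from the dendrogram, not by a rule
cutHeight <- 40
abline(h = cutHeight, col = "red")
clust <- cutree(sampleTree, h = cutHeight)

# Keep the largest cluster and drop the samples outside it
keep <- clust == as.integer(names(which.max(table(clust))))
datExprClean <- datExpr[keep, , drop = FALSE]
rownames(datExpr)[!keep]   # names of the removed samples
```

On real data the cut height has to be read off the dendrogram for that data set, and it is worth checking which samples are dropped before continuing.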

Answer, 3.7 years ago, by Manvendra Singh (Berlin, Germany):

Yes, I think PCA is also a good choice for removing outliers.

You can also hierarchically cluster the samples on the Spearman correlation of their gene expression. Outliers are then easy to spot on the dendrogram and remove.
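
The clustering suggestion above could look roughly like the following sketch, which uses base R only. The data are simulated, with genes in rows and samples in columns as in the question; every name (`expr`, `sampleCor`, `meanCor`) and the choice of average linkage are assumptions made for illustration.

```r
set.seed(2)
n_genes <- 500
n_samples <- 12

# Eleven samples share one expression profile plus noise (hypothetical data)
gene_means <- rnorm(n_genes, mean = 5, sd = 2)
expr <- sapply(seq_len(n_samples),
               function(i) gene_means + rnorm(n_genes, sd = 0.5))
colnames(expr) <- paste0("S", seq_len(n_samples))
expr[, "S12"] <- rnorm(n_genes, mean = 5, sd = 2)  # one unrelated sample

# Sample-by-sample Spearman correlation, converted to a distance
sampleCor <- cor(expr, method = "spearman")
sampleDist <- as.dist(1 - sampleCor)

# Outlying samples join the tree last, at a large height
tree <- hclust(sampleDist, method = "average")
plot(tree, main = "Samples clustered on 1 - Spearman correlation")

# A numeric companion to the plot: mean correlation with all other samples
meanCor <- (rowSums(sampleCor) - 1) / (n_samples - 1)
names(which.min(meanCor))   # sample least correlated with the rest
```

The mean correlation per sample is only a convenience for ranking candidates; the decision to remove a sample still rests on inspecting the dendrogram.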


Hello. The `PcaHubert` function has a parameter called `crit.pca.distances`. What does this parameter do, and what value should it be given other than the default?

(Comment written 2.6 years ago by rajeshkumar_vinod)