Question: Filter out lowly expressed RNA-seq genes using ANOVA
4.5 years ago by
jfertaj90
United Kingdom
jfertaj90 wrote:

Hi all,

I am analysing RNA-seq data: 37223 genes and 150 samples (RPKM values). I would like to do some co-expression analysis using WGCNA, but first I want to filter out genes with low expression values using some kind of statistical test. I think Partek uses ANOVA and I would like to do the same. I found some code on the internet, but I don't understand the meaning of dpa.

Code:

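# Grouping factor: 4 levels (10, 20, 30, 40), 3 replicates each = 12 samples
dpa <- as.factor(rep(seq(10, 40, by = 10), each = 3))

# One-way ANOVA p-value for a single gene's expression vector
fanova <- function(x) anova(aov(x ~ dpa))$"Pr(>F)"[1]

# Apply to every column of dataRNA (samples in rows, genes in columns)
result <- apply(dataRNA, 2, fanova)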

The example code is for 12 RNA-seq samples (also RPKM values).

Thanks in advance

 

rna-seq R • 2.4k views
modified 4.5 years ago by mark.ziemann1.2k • written 4.5 years ago by jfertaj90

Can you post the link to the source of this code?

written 4.5 years ago by komal.rathi3.4k
4.5 years ago by
mark.ziemann1.2k
Australia/Melbourne/Geelong/Deakin
mark.ziemann1.2k wrote:

You don't need to run ANOVA to do non-differential filtering for RNA-seq data. There are generally two approaches:

  1. Discard a set proportion of genes (say 30%), or
  2. Even better, discard genes with fewer than a certain number of reads per sample (across the whole experiment). I use 10 reads per sample as a rule of thumb.
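
Both options can be sketched in a few lines of base R. This is only an illustration under assumptions that are not part of the answer above: `counts` is a hypothetical genes-by-samples matrix of raw read counts (simulated here), the 30% is taken from genes ranked by mean count, and "10 reads per sample" is read as a mean of at least 10 reads per sample across the experiment.

```r
set.seed(1)
# Hypothetical data: 1000 genes (rows) x 12 samples (columns) of raw counts
counts <- matrix(rnbinom(1000 * 12, mu = 20, size = 0.5), nrow = 1000,
                 dimnames = list(paste0("gene", 1:1000), paste0("s", 1:12)))

# Approach 1: discard the 30% of genes with the lowest mean count
mean_count <- rowMeans(counts)
cutoff <- quantile(mean_count, probs = 0.30)
keep_prop <- mean_count > cutoff
filtered_prop <- counts[keep_prop, , drop = FALSE]

# Approach 2: keep genes averaging at least 10 reads per sample
keep_reads <- mean_count >= 10
filtered_reads <- counts[keep_reads, , drop = FALSE]

# Number of genes and samples remaining under each approach
dim(filtered_prop)
dim(filtered_reads)
```

Other readings of the rule of thumb are possible, for example requiring 10 reads in every sample, which would replace `rowMeans(counts) >= 10` with a per-sample check.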

Also, there is no need to use FPKM. The standard tools for RNA-seq DGE, i.e. edgeR and DESeq, work on count (integer) data.

written 4.5 years ago by mark.ziemann1.2k

Thanks Mark. However, I don't have the count data, only RPKM. I am also not interested in doing DGE: I have data from different tissues but only one condition (normal tissue). What I would like to do is create co-expression networks, one for each tissue, and I think it would be better to remove lowly expressed genes first. The data are log2(RPKM); some people suggest removing genes with log2(RPKM) < 2, but I would like to use a "statistical approach" to do it.
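
For reference, the log2(RPKM) < 2 rule could be applied roughly as follows. This is a minimal sketch of the plain threshold, not a statistical test: `log_rpkm` is a hypothetical genes-by-samples matrix (simulated here), and requiring a gene to reach the cutoff in at least half of the samples is an assumption added for illustration, since the suggestion does not say how the cutoff should be combined across samples.

```r
set.seed(1)
# Hypothetical data: 1000 genes (rows) x 150 samples (columns) of log2(RPKM),
# with a different average expression level for each gene
gene_mean <- rnorm(1000, mean = 2, sd = 2)
log_rpkm <- matrix(rnorm(1000 * 150, mean = gene_mean, sd = 1), nrow = 1000,
                   dimnames = list(paste0("gene", 1:1000), NULL))

threshold <- 2    # log2(RPKM) cutoff quoted above
min_frac  <- 0.5  # assumed: gene must reach the cutoff in >= 50% of samples

# Fraction of samples in which each gene reaches the cutoff
frac_expressed <- rowMeans(log_rpkm >= threshold)
keep <- frac_expressed >= min_frac
filtered <- log_rpkm[keep, , drop = FALSE]

# WGCNA expects samples in rows and genes in columns, so transpose
datExpr <- t(filtered)
dim(datExpr)
```

With one network per tissue, the same filter would be run separately on each tissue's subset of samples.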

 

written 4.5 years ago by jfertaj90