Question: DEG detection with datasets generated with different sequencing protocols
Vanilla80 wrote:

Hi all!

I want to detect differentially expressed genes (DEGs) from RNA-seq data obtained from "GFP positive cells" and "GFP negative cells". However, the cDNA was sequenced with two different methods: "normal" RNA-seq and low-input RNA-seq (which requires only a small amount of starting material). Here is a summary of the number of datasets I have for each cell type and method:

GFP positive cells * normal RNA-seq : 1

GFP negative cells * normal RNA-seq : 1

GFP positive cells * low-input RNA-seq : 3

GFP negative cells * low-input RNA-seq : 2

In such a case, what kind of statistics/tools can be applied to detect DEGs in GFP positive cells VS GFP negative cells?

Thanks all!

statistics rna-seq degs • 782 views
modified 2.4 years ago by Santosh Anand5.1k • written 2.4 years ago by Vanilla80
Santosh Anand5.1k wrote:

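You need a two-factor design, with GFP status (positive/negative) and sequencing protocol (low-input/normal) as the two factors. Including the protocol as a factor lets the model account for protocol differences when it estimates the GFP effect. See http://www.bioconductor.org/packages/3.7/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#multi-factor-designs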
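As a rough illustration of what the two-factor design looks like for the sample layout in the question, here is a minimal sketch in Python with NumPy. It only builds the additive design matrix (intercept + protocol + GFP status) and checks that it is full rank; it does not run DESeq2. The level names and the choice of reference levels ("normal", "negative") are assumptions made for the example. In DESeq2 the same idea is expressed through the design formula, with the variable of interest listed last.

```python
import numpy as np

# Sample layout taken from the question: (GFP status, protocol).
samples = [
    ("positive", "normal"),
    ("negative", "normal"),
    ("positive", "low_input"),
    ("positive", "low_input"),
    ("positive", "low_input"),
    ("negative", "low_input"),
    ("negative", "low_input"),
]


def design_matrix(samples):
    """Build an additive design: intercept + protocol + GFP status.

    Reference levels (coded 0) are 'normal' and 'negative'; this is an
    assumption made for the illustration.
    """
    rows = []
    for gfp, protocol in samples:
        rows.append([
            1.0,                                      # intercept
            1.0 if protocol == "low_input" else 0.0,  # protocol effect
            1.0 if gfp == "positive" else 0.0,        # GFP effect
        ])
    return np.array(rows)


X = design_matrix(samples)
rank = int(np.linalg.matrix_rank(X))
residual_df = X.shape[0] - rank

print(X)
print("rank:", rank, "residual degrees of freedom:", residual_df)
```

With 7 samples and 3 coefficients the design is estimable, but only a few degrees of freedom are left for estimating dispersion, so results should be interpreted with care.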

modified 2.4 years ago • written 2.4 years ago by Santosh Anand5.1k

Thanks Santosh! Would DESeq2 accept FPKM values as input, or only raw read counts?

written 2.4 years ago by Vanilla80

Almost all software for differential expression, including DESeq2, requires raw read counts for the statistical model to work correctly.
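A quick sanity check before running a count-based tool is to confirm that the matrix really holds raw counts, i.e. non-negative whole numbers. The sketch below is only an illustration: `looks_like_raw_counts` is a hypothetical helper, and both small matrices are made-up values, not data from this thread.

```python
import numpy as np


def looks_like_raw_counts(matrix):
    """Return True if every value is a non-negative whole number."""
    values = np.asarray(matrix, dtype=float)
    non_negative = np.all(values >= 0)
    whole_numbers = np.all(values == np.floor(values))
    return bool(non_negative and whole_numbers)


# Made-up example values (genes in columns, samples in rows).
raw = [[0, 12, 305], [7, 0, 48]]
fpkm = [[0.0, 3.72, 81.4], [1.95, 0.0, 12.6]]

print(looks_like_raw_counts(raw))
print(looks_like_raw_counts(fpkm))
```

Passing this check does not prove the values are raw counts (rounded normalized values would also pass), but failing it is a clear sign that the input has already been normalized.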

written 2.4 years ago by Santosh Anand5.1k

What if I only have FPKM values? Could I take the log of them and remove lowly expressed genes (so that the distribution is approximately normal)?

written 2.4 years ago by Vanilla80

You can't simply un-normalize the data. Some tools do differential expression with FPKM values, but you need the original library depths. Also, RNA-seq count data are not Gaussian; they are usually modelled with a negative binomial distribution. For a detailed discussion, see this Reddit thread:

https://www.reddit.com/r/bioinformatics/comments/3bx3em/fpkm_vs_raw_read_count_for_differential/
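To see why FPKM cannot be turned back into counts without the library depth, here is a small sketch using the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). The gene length, FPKM value, and library sizes are made-up numbers chosen only for illustration, and the function names are hypothetical.

```python
def fpkm_from_count(count, gene_length_bp, library_size):
    """FPKM = count * 1e9 / (gene length in bp * total mapped fragments)."""
    return count * 1e9 / (gene_length_bp * library_size)


def count_from_fpkm(fpkm, gene_length_bp, library_size):
    """Invert the FPKM formula; only possible if library_size is known."""
    return fpkm * gene_length_bp * library_size / 1e9


gene_length_bp = 2000  # illustrative gene length
fpkm = 5.0             # illustrative FPKM value

# The same FPKM corresponds to different counts at different depths.
shallow = count_from_fpkm(fpkm, gene_length_bp, 10_000_000)
deep = count_from_fpkm(fpkm, gene_length_bp, 40_000_000)

print("count at 10M fragments:", shallow)
print("count at 40M fragments:", deep)
```

Because a count-based model relies on the actual number of reads to judge how precise each measurement is, two samples with identical FPKM but different depths carry different amounts of evidence, and that information is lost once only FPKM is kept.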

modified 2.4 years ago • written 2.4 years ago by Santosh Anand5.1k

Got it. Thanks Santosh!

written 2.4 years ago by Vanilla80

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

modified 2.4 years ago • written 2.4 years ago by WouterDeCoster43k