Question: DEG detection with datasets generated with different sequencing protocol
Vanilla50 wrote:

Hi all!

I'm trying to detect differentially expressed genes using RNA-seq data from "GFP positive cells" and "GFP negative cells". However, the cDNA was sequenced with two different methods: "normal" RNA-seq and low-input RNA-seq (which requires only a small amount of starting material). Here's a summary of the number of datasets I have for each cell type and method:

GFP positive cells * normal RNA-seq : 1

GFP negative cells * normal RNA-seq : 1

GFP positive cells * low-input RNA-seq : 3

GFP negative cells * low-input RNA-seq : 2

In such a case, what kind of statistics/tools can be applied to detect DEGs in GFP positive cells VS GFP negative cells?

Thanks all!

statistics rna-seq degs • 177 views
modified 4 weeks ago by Santosh Anand3.4k • written 5 weeks ago by Vanilla50
Santosh Anand3.4k wrote:

You need to run a two-factor design, with the factors being GFP +/- status and sequencing protocol (low-input vs. normal). See the multi-factor designs section of the DESeq2 vignette: http://www.bioconductor.org/packages/3.7/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#multi-factor-designs
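As a rough sketch of what that two-factor layout looks like for the sample numbers in the question, the Python snippet below builds the additive design matrix (intercept, protocol, GFP status) and checks that it is full rank, so the GFP effect can be estimated separately from the protocol effect. The sample names and the choice of reference levels are made up for illustration; in DESeq2 itself this would be expressed as a design formula along the lines of `~ protocol + gfp`, using whatever column names your sample table has.

```python
from fractions import Fraction

# Sample layout from the question; the sample names are hypothetical.
samples = [
    ("pos_normal_1", "positive", "normal"),
    ("neg_normal_1", "negative", "normal"),
    ("pos_low_1", "positive", "low_input"),
    ("pos_low_2", "positive", "low_input"),
    ("pos_low_3", "positive", "low_input"),
    ("neg_low_1", "negative", "low_input"),
    ("neg_low_2", "negative", "low_input"),
]


def design_row(gfp, protocol):
    """One row of the additive design: intercept, protocol, GFP status.

    Reference levels (an assumption for this sketch) are 'normal' and 'negative'.
    """
    return [1, int(protocol == "low_input"), int(gfp == "positive")]


design = [design_row(gfp, protocol) for _, gfp, protocol in samples]


def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination on exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from all rows below the pivot row.
        for r in range(rank + 1, len(m)):
            factor = m[r][col] / m[rank][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


for (name, _, _), row in zip(samples, design):
    print(f"{name:14s} {row}")
print("rank:", matrix_rank(design), "of", len(design[0]), "columns")
```

With 7 samples and 3 coefficients there are 4 residual degrees of freedom, and most of the information about variability comes from the low-input replicates, since the normal protocol has only one sample per group.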

modified 4 weeks ago • written 4 weeks ago by Santosh Anand3.4k

Thanks Santosh! Would DESeq2 accept FPKM values as input, or only raw read counts?

written 4 weeks ago by Vanilla50

Almost all differential expression software, including DESeq2, requires raw read counts for the statistical model to work correctly.

written 4 weeks ago by Santosh Anand3.4k

What if I only have FPKM values? Could I take the log of them and remove lowly expressed genes (so that the distribution is approximately normal)?

written 4 weeks ago by Vanilla50

You can't simply un-normalize the data. Some tools do differential expression with FPKM, but you need to have the original library depths. Also, RNA-seq count data don't follow a Gaussian distribution; they are usually modelled as negative binomial. For a detailed discussion, see this Reddit thread:

https://www.reddit.com/r/bioinformatics/comments/3bx3em/fpkm_vs_raw_read_count_for_differential/
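To illustrate why FPKM can't be turned back into counts without the library depth, here is a small Python sketch using the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). The gene length, counts and depths are made-up numbers, and the function names are just for this example.

```python
def fpkm(count, gene_length_bp, total_mapped_fragments):
    """FPKM = count * 1e9 / (gene length in bp * total mapped fragments)."""
    return count * 1e9 / (gene_length_bp * total_mapped_fragments)


def count_from_fpkm(value, gene_length_bp, total_mapped_fragments):
    """Invert the formula above; only possible when the library depth is known."""
    return value * gene_length_bp * total_mapped_fragments / 1e9


# Two hypothetical libraries: same gene, tenfold difference in depth and count.
shallow = fpkm(count=50, gene_length_bp=2000, total_mapped_fragments=5_000_000)
deep = fpkm(count=500, gene_length_bp=2000, total_mapped_fragments=50_000_000)

# Both libraries give the same FPKM, even though 50 fragments carry far more
# sampling noise than 500.
print(shallow, deep)

# Recovering the count requires the depth of the specific library.
print(count_from_fpkm(shallow, 2000, 5_000_000))
print(count_from_fpkm(deep, 2000, 50_000_000))
```

The FPKM value alone hides how many fragments were actually observed, and that is the information count-based models rely on to estimate variance.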

modified 4 weeks ago • written 4 weeks ago by Santosh Anand3.4k

Got it. Thanks Santosh!

written 4 weeks ago by Vanilla50

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

modified 4 weeks ago • written 4 weeks ago by WouterDeCoster26k