Question: DEG detection with datasets generated with different sequencing protocols
10 months ago, Vanilla (Hong Kong) wrote:

Hi all!

I want to detect differentially expressed genes (DEGs) using RNA-seq data obtained from "GFP positive cells" and "GFP negative cells". However, the cDNA was sequenced with two different methods: one is "normal" RNA-seq and the other is low-input RNA-seq (which requires only a small amount of starting material). Here is a summary of the number of datasets I have for each cell type and method:

GFP positive cells * normal RNA-seq : 1

GFP negative cells * normal RNA-seq : 1

GFP positive cells * low-input RNA-seq : 3

GFP negative cells * low-input RNA-seq : 2

In this case, which statistical methods or tools can be used to detect DEGs between GFP positive cells and GFP negative cells?

Thanks all!

statistics rna-seq degs • 400 views
10 months ago, Santosh Anand wrote:

You need a two-factor design, the factors being GFP status (positive/negative) and sequencing protocol (low-input/normal). See the multi-factor designs section of the DESeq2 vignette: http://www.bioconductor.org/packages/3.7/bioc/vignettes/DESeq2/inst/doc/DESeq2.html#multi-factor-designs
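
To make the two-factor idea concrete, here is a minimal sketch (plain Python, not DESeq2 itself) that builds the additive design matrix for the sample layout given in the question and checks that it is estimable. The level names (`pos`, `neg`, `normal`, `low_input`) and the helper functions are hypothetical and chosen only for illustration; in DESeq2 the same idea would be expressed as a design formula with the protocol term first and the GFP term last.

```python
from fractions import Fraction

# Sample layout from the question: (GFP status, sequencing protocol).
# Level names are illustrative assumptions, not taken from any real sample sheet.
samples = (
    [("pos", "normal")] * 1
    + [("neg", "normal")] * 1
    + [("pos", "low_input")] * 3
    + [("neg", "low_input")] * 2
)

def design_row(gfp, protocol):
    """Row of the additive design: intercept, protocol effect, GFP effect.

    Reference levels are protocol == "normal" and gfp == "neg".
    """
    return [1, int(protocol == "low_input"), int(gfp == "pos")]

def matrix_rank(rows):
    """Rank of a small matrix via Gaussian elimination with exact fractions."""
    m = [[Fraction(v) for v in row] for row in rows]
    rank = 0
    for col in range(len(m[0])):
        # Find a row at or below the current rank with a non-zero entry.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        # Eliminate this column from every other row.
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank

design = [design_row(gfp, protocol) for gfp, protocol in samples]
rank = matrix_rank(design)
residual_df = len(samples) - rank

print("coefficients estimable:", rank)
print("residual degrees of freedom:", residual_df)
```

With 7 samples and 3 coefficients the design is full rank, so the GFP effect can be estimated while adjusting for protocol, although the residual degrees of freedom are few.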


Thanks Santosh! Would DESeq2 accept FPKM values as input, or only raw read counts?

Reply written 10 months ago by Vanilla

Almost all software for differential expression, including DESeq2, requires raw read counts for the statistical model to work correctly.

Reply written 10 months ago by Santosh Anand
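
One way to see why the count scale matters: count-based tools such as DESeq2 model each gene with a negative binomial distribution, whose variance is tied to the mean count as variance = mean + dispersion × mean². The sketch below only evaluates that formula; the dispersion value and the example means are hypothetical, and the function names are made up for illustration.

```python
def nb_variance(mean, dispersion):
    """Negative binomial variance: mean + dispersion * mean**2.

    With dispersion == 0 this reduces to the Poisson case (variance == mean).
    """
    return mean + dispersion * mean ** 2

def coefficient_of_variation(mean, dispersion):
    """Relative noise (standard deviation divided by mean) at a given mean count."""
    return nb_variance(mean, dispersion) ** 0.5 / mean

dispersion = 0.1  # hypothetical value, for illustration only

for mean_count in (5, 50, 500, 5000):
    variance = nb_variance(mean_count, dispersion)
    cv = coefficient_of_variation(mean_count, dispersion)
    print(f"mean={mean_count:>5}  variance={variance:>12.1f}  CV={cv:.3f}")
```

The relative noise is much larger at low counts than at high counts. Once values are rescaled to FPKM, the model can no longer tell whether a given value came from a handful of reads or from thousands.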

What if I only have FPKM values? Could I log-transform them and remove lowly expressed genes (so that the distribution is approximately normal)?

Reply written 10 months ago by Vanilla

You can't simply un-normalize the data. Some tools do differential expression with FPKM, but you still need the original library depths. Also, RNA-seq count data do not follow a Gaussian distribution; they are usually modelled as negative binomial. For a detailed discussion, see this Reddit thread:

https://www.reddit.com/r/bioinformatics/comments/3bx3em/fpkm_vs_raw_read_count_for_differential/

Reply written 10 months ago by Santosh Anand
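
The dependence on library depth can be sketched with the usual FPKM definition (fragments per kilobase of transcript per million mapped fragments). This is an illustrative sketch only: the gene length, FPKM value and depths are hypothetical, the function names are made up, and real pipelines differ in details such as effective length and which fragments count towards the total.

```python
def fpkm_from_count(count, gene_length_bp, total_mapped_fragments):
    """FPKM = count * 1e9 / (gene length in bp * total mapped fragments)."""
    return count * 1e9 / (gene_length_bp * total_mapped_fragments)

def approx_count_from_fpkm(fpkm, gene_length_bp, total_mapped_fragments):
    """Invert the FPKM formula; only possible if length and depth are known."""
    return fpkm * gene_length_bp * total_mapped_fragments / 1e9

gene_length_bp = 2_000  # hypothetical gene length
fpkm = 10.0             # hypothetical FPKM value

# The same FPKM corresponds to very different counts at different depths.
for depth in (1_000_000, 50_000_000):
    count = approx_count_from_fpkm(fpkm, gene_length_bp, depth)
    print(f"depth={depth:>11,}  FPKM={fpkm}  approximate count={count:.0f}")
```

Without the per-library depth (and gene lengths) the conversion cannot be undone. Even with them, the result is only an approximation of the original integer counts.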

Got it. Thanks Santosh!

Reply written 10 months ago by Vanilla

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted.

Reply written 10 months ago by WouterDeCoster