Question

Differential expression analysis suggestions needed

5.7 years ago

sierraallinone ▴ 20

We have a Trinity transcriptome assembly for 6 muscle types, 3 of which have triplicate data. We generated a supertranscript, but now I'm reading conflicting things on how to approach the differential analysis portion of this experiment.

We have no reference genome, so if we go with HISAT2 -> Cufflinks -> Cuffdiff -> CummeRbund, we'd of course use the supertranscript as the reference that the reads align back to and go from there.

We also toyed with the idea of using DESeq/edgeR.

Then it was suggested to do HISAT2 -> StringTie -> Ballgown, but then someone said that those programs are good for genome-assisted analyses, while Salmon/kallisto -> sleuth is transcript-level analysis. I've looked into this type of pseudoalignment, but at this point I just really don't know which pipeline to use. Furthermore, some people think transcript-level analysis is better than gene-level analysis, but it seems that gene-level analysis is more reliable.
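
To make the transcript-level versus gene-level distinction concrete, here is a minimal Python sketch of what gene-level summarization amounts to: transcript-level estimates are summed per gene using a transcript-to-gene map. The function name, the IDs, and the counts are all hypothetical and only for illustration. In practice this step is commonly done in R with tximport on Salmon or kallisto output, which also handles details this sketch ignores, such as transcript length.

```python
from collections import defaultdict


def sum_to_gene_level(transcript_counts, transcript_to_gene):
    """Sum transcript-level counts into gene-level counts.

    Both arguments are plain dicts; the names are hypothetical,
    not part of any library.
    """
    gene_counts = defaultdict(float)
    for tx, count in transcript_counts.items():
        # Transcripts missing from the map keep their own ID as the "gene".
        gene = transcript_to_gene.get(tx, tx)
        gene_counts[gene] += count
    return dict(gene_counts)


# Made-up isoform IDs and estimated counts for a single sample.
tx2gene = {"g1_i1": "g1", "g1_i2": "g1", "g2_i1": "g2"}
counts = {"g1_i1": 120.0, "g1_i2": 30.0, "g2_i1": 75.0}

gene_level = sum_to_gene_level(counts, tx2gene)
print(gene_level)  # {'g1': 150.0, 'g2': 75.0}
```

Reads that cannot be assigned confidently between two similar isoforms of one gene still count toward that gene's total, which is one reason gene-level results are often considered more stable.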

RNA-Seq • 1.8k views

ADD COMMENT • link 5.7 years ago by sierraallinone ▴ 20

https://cgatoxford.wordpress.com/2016/08/17/why-you-should-stop-using-featurecounts-htseq-or-cufflinks2-and-start-using-kallisto-salmon-or-sailfish/

ADD REPLY • link 5.7 years ago by Arup Ghosh 3.2k

Hmm, interesting... a hybrid approach. I noticed they mention that the alignment-independent method works really well for unique transcripts. Just how unique? My transcripts are most likely going to be pretty similar, with a few different alternative-splicing regions. Of course, it's hard to tell, since there's no reference genome and we really only have sequences from one muscle type from one clone as our initial RNA reference. So does that mean I should maybe stick to the Cufflinks pipeline?

ADD REPLY • link 5.7 years ago by sierraallinone ▴ 20

You should avoid using Cufflinks and TopHat; any other analysis workflow will be fine.

Please stop using Tophat https://t.co/Es4ohxOEyx Cole and I developed the method in *2008*. It was greatly improved in TopHat2 then HISAT & HISAT2. There is no reason to use it anymore. I have been saying this for years yet it has more citations this year than last #methodsmatter
— Lior Pachter (@lpachter) December 2, 2017

ADD REPLY • link 5.7 years ago by Arup Ghosh 3.2k

1. Generate the reference transcriptome from the data using HISAT2 and StringTie (or Trinity).
2. Use this reference transcriptome with your original FASTQs to determine read count abundances over your identified transcripts using Kallisto / Salmon.
3. Conduct differential expression analysis using EdgeR / DESeq2.

...or, from #1, use prepDE.py (bundled with StringTie, I believe) to immediately generate count data suitable for #3
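
As a rough sketch of the quantification step with Salmon, the Python below only assembles the command lines for indexing the assembled transcripts and quantifying each sample; it does not run anything. The sample names, file paths, and helper function names are hypothetical, and paired-end reads are assumed. The flags used (`salmon index -t/-i`, `salmon quant -i/-l/-1/-2/-o`) are standard Salmon options, but check them against the version you have installed.

```python
from pathlib import Path


def salmon_index_cmd(transcripts_fasta, index_dir):
    """Build the command that indexes the assembled transcripts."""
    return ["salmon", "index", "-t", str(transcripts_fasta), "-i", str(index_dir)]


def salmon_quant_cmd(index_dir, read1, read2, out_dir):
    """Build the command that quantifies one paired-end sample."""
    return [
        "salmon", "quant",
        "-i", str(index_dir),
        "-l", "A",          # let Salmon infer the library type
        "-1", str(read1),
        "-2", str(read2),
        "-o", str(out_dir),
    ]


# Hypothetical sample sheet: sample name -> (R1, R2) FASTQ paths.
samples = {
    "muscleA_rep1": ("fastq/muscleA_rep1_R1.fq.gz", "fastq/muscleA_rep1_R2.fq.gz"),
    "muscleA_rep2": ("fastq/muscleA_rep2_R1.fq.gz", "fastq/muscleA_rep2_R2.fq.gz"),
}

# "Trinity.fasta" and "salmon_index" are placeholder paths.
commands = [salmon_index_cmd("Trinity.fasta", "salmon_index")]
for name, (r1, r2) in samples.items():
    commands.append(salmon_quant_cmd("salmon_index", r1, r2, Path("quant") / name))

for cmd in commands:
    print(" ".join(cmd))
    # To actually run a command: subprocess.run(cmd, check=True)
```

Each sample's output directory would then contain a `quant.sf` file, which can be imported into R for the EdgeR / DESeq2 step.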

ADD REPLY • link 5.7 years ago by Kevin Blighe 87k