Question

Reducing error bars on TopHat RNA expression analysis


7.6 years ago

waterspring • 0

Using TopHat/Bowtie, I'm producing an RNA expression (FPKM) plot of a paralog in three different samples. The problem is that the first specimen (blue bar in the image) has a huge error bar (over 12 FPKM). I'm trying to figure out why it is so large and how, if possible, to reduce it.

I've been trying many different TopHat parameters to see if anything changes, e.g. --b2-sensitive, --transcriptome-only, --no-novel-juncs, etc. Does anyone have suggestions for parameters that might help (scoring or alignment options)?

Here is my pipeline. I then use the R library cummeRbund to produce the expression bar plot.

bowtie2-build quiver.fa quiver

tophat -p 8 -G quiver.gtf -o tophat_out_1 quiver PK_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_2 quiver PK_flower_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_3 quiver PK_root_RNA.fq

cufflinks -p 8 -o cufflinks_out_1 tophat_out_1/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_2 tophat_out_2/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_3 tophat_out_3/accepted_hits.bam

cuffmerge -g quiver.gtf -s quiver.fa -p 8 assemblies.txt

cuffdiff -o diff_out -b quiver.fa -p 8 -L pk_1,pk_2,pk_3 -u merged_asm/merged.gtf tophat_out_1/accepted_hits.bam tophat_out_2/accepted_hits.bam tophat_out_3/accepted_hits.bam

Thanks for your help!

RNA-Seq R tophat bowtie • 1.6k views

updated 7.3 years ago by Biostar • written 7.6 years ago by waterspring


Not the answer you are looking for, but you should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the advisable tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon.
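For orientation, here is a rough sketch of what the question's three-sample run could look like with HISAT2 and StringTie. It assumes single-end reads, the file names from the question, and an output layout (ballgown/SAMPLE/) chosen only for illustration; the run and pipeline helpers are hypothetical names, not part of any tool. By default it only prints the commands (DRY_RUN=1), so it can be inspected without the tools installed.

```shell
#!/bin/sh
# Sketch only: file names come from the question, everything else is an assumption.
# DRY_RUN=1 (default) prints each command; set DRY_RUN=0 to actually execute them.
set -eu
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pipeline() {
    # Build the HISAT2 index from the same reference used for bowtie2-build
    run hisat2-build quiver.fa quiver
    for sample in PK_RNA PK_flower_RNA PK_root_RNA; do
        # --dta reports alignments tailored for transcript assemblers such as StringTie
        run hisat2 -p 8 --dta -x quiver -U "${sample}.fq" -S "${sample}.sam"
        # StringTie expects coordinate-sorted BAM input
        run samtools sort -@ 8 -o "${sample}.bam" "${sample}.sam"
        # -e: quantify only annotated transcripts; -B: also write Ballgown input tables
        run stringtie "${sample}.bam" -p 8 -G quiver.gtf -e -B -o "ballgown/${sample}/${sample}.gtf"
    done
}

pipeline
```

The -e flag skips novel transcript assembly and quantifies against quiver.gtf only; drop it (and add a merge step) if new isoforms are of interest.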

7.6 years ago by WouterDeCoster


In Cufflinks, you may be able to control this with the --max-bundle-frags command-line parameter. However, as Wouter mentions, TopHat/Cufflinks are in the past and one should move to HISAT2/StringTie.

In addition, FPKM does not deal very well with high counts, generally speaking, and produces extraordinary fold-change values as a result. Nor does it normalise across samples, so it is unreliable to compare samples statistically using FPKM-normalised counts. Unless you're analysing just a single sample, I would recommend geometric normalisation, if that's available in StringTie.
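To make the cross-sample point concrete, here is a small worked example using the standard FPKM definition (fragments × 10^9 / (gene length in bp × total mapped fragments)). The numbers are invented for illustration and the fpkm helper is a hypothetical name: the same gene with the same fragment count gets half the FPKM once the library total doubles, for instance because a few unrelated genes are very highly expressed in that sample.

```shell
#!/bin/sh
# fpkm FRAGMENTS GENE_LENGTH_BP TOTAL_MAPPED_FRAGMENTS
# Illustrative helper (not part of any tool); all values below are made up.
fpkm() {
    awk -v n="$1" -v len="$2" -v total="$3" \
        'BEGIN { printf "%.2f\n", n * 1e9 / (len * total) }'
}

# Same gene (2000 bp), same 500 fragments, two libraries of different size
fpkm 500 2000 20000000   # prints 12.50
fpkm 500 2000 40000000   # prints 6.25
```

Because the denominator is each sample's own library total, a shift in a handful of abundant transcripts moves every other gene's FPKM in that sample, which is why between-sample comparisons need a separate normalisation step.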