Reducing error bars on TopHat RNA expression analysis
6.4 years ago

Using TopHat/Bowtie, I'm producing an RNA expression (FPKM) plot of a paralog in three different samples. The problem is that the first sample (blue bar in the image) has a huge error bar (over 12 FPKM). I'm trying to figure out why it is so large and how, if possible, to reduce it.

I've been trying many different TopHat parameters to see if anything changes, e.g. --b2-sensitive, --transcriptome-only, --no-novel-juncs. Does anyone have suggestions for parameters that might help (scoring or alignment options)?

Here is my pipeline. I'm then using the R library cummeRbund to produce the expression bar plot.

bowtie2-build quiver.fa quiver

tophat -p 8 -G quiver.gtf -o tophat_out_1 quiver PK_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_2 quiver PK_flower_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_3 quiver PK_root_RNA.fq

cufflinks -p 8 -o cufflinks_out_1 tophat_out_1/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_2 tophat_out_2/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_3 tophat_out_3/accepted_hits.bam

cuffmerge -g quiver.gtf -s quiver.fa -p 8 assemblies.txt

cuffdiff -o diff_out -b quiver.fa -p 8 -L pk_1,pk_2,pk_3 -u merged_asm/merged.gtf tophat_out_1/accepted_hits.bam tophat_out_2/accepted_hits.bam tophat_out_3/accepted_hits.bam

Thanks for your help!

RNA-Seq R tophat bowtie • 1.5k views

Not the answer you are looking for, but you should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the recommended tool set for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon.


In Cufflinks, you may be able to control this with the --max-bundle-frags command-line parameter. However, as Wouter mentions, TopHat/Cufflinks are outdated and one should move to HISAT2/StringTie.

In addition, FPKM generally does not deal well with high counts and can produce extreme fold-change values as a result. It also does not normalise across samples, so it is unreliable to statistically compare samples using FPKM-normalised counts. Unless you're analysing just a single sample, I would recommend geometric normalisation, if that's available in StringTie.
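To illustrate the point about cross-sample comparison, here is a minimal Python sketch, not what Cufflinks or StringTie actually run internally. It computes FPKM per sample and, separately, median-of-ratios size factors (one common form of "geometric" normalisation). The function names, gene lengths and counts are all hypothetical and chosen only for illustration.

```python
import math
from statistics import median

def fpkm(counts, lengths_bp):
    """FPKM for ONE sample: reads per kilobase of transcript per million mapped reads."""
    total = sum(counts)
    return [c * 1e9 / (length * total) for c, length in zip(counts, lengths_bp)]

def median_of_ratios_size_factors(count_matrix):
    """count_matrix[g][s] is the raw count of gene g in sample s.

    Returns one size factor per sample: the median, over genes, of the ratio
    between the gene's count and its geometric mean across samples.
    """
    n_samples = len(count_matrix[0])
    # Keep only genes observed in every sample so the geometric mean is defined.
    usable = [row for row in count_matrix if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for s in range(n_samples):
        log_ratios = [math.log(row[s]) - lgm for row, lgm in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors

# Hypothetical data: 4 genes x 3 samples. Sample 2 is sequenced twice as deeply
# as sample 1; sample 3 matches sample 1 except that the last gene is 9x higher.
lengths = [1000, 2000, 500, 1500]
matrix = [
    [100, 200, 100],
    [50, 100, 50],
    [10, 20, 10],
    [1000, 2000, 9000],
]

size_factors = median_of_ratios_size_factors(matrix)
fpkm_sample1 = fpkm([row[0] for row in matrix], lengths)
fpkm_sample3 = fpkm([row[2] for row in matrix], lengths)

print("size factors:", size_factors)
print("gene 0 FPKM, sample 1 vs sample 3:", fpkm_sample1[0], fpkm_sample3[0])
```

In this toy data, gene 0 has the same raw count in samples 1 and 3, yet its FPKM drops in sample 3 because one highly expressed gene inflates that sample's total. The median-based size factors for samples 1 and 3 stay equal, because the median is not moved by a single extreme gene.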


Did you look at the extreme expression values before analysis?
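One rough way to check for extreme values, sketched in Python under the assumption that you can export per-gene read counts for each sample: compute what fraction of a sample's reads falls on its most highly expressed genes. The function name and the counts below are hypothetical.

```python
def top_fraction(counts_by_gene, k=1):
    """Fraction of all reads in one sample that map to the k most highly expressed genes."""
    total = sum(counts_by_gene.values())
    if total == 0:
        return 0.0
    top = sorted(counts_by_gene.values(), reverse=True)[:k]
    return sum(top) / total

# Hypothetical counts for one sample; "geneA" dominates the library.
sample_counts = {"geneA": 9000, "geneB": 500, "geneC": 400, "geneD": 100}

share_top1 = top_fraction(sample_counts, k=1)
share_top2 = top_fraction(sample_counts, k=2)
print(f"top gene takes {share_top1:.1%} of reads; top two take {share_top2:.1%}")
```

If one or two loci take up most of a library, depth-based measures such as FPKM for every other gene in that sample shift accordingly, so it is worth checking before comparing samples.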
