Question: Reducing error bars on TopHat RNA expression analysis
3.1 years ago, waterspring wrote:

Using TopHat/Bowtie, I'm producing an RNA expression (FPKM) plot of a paralog in three different samples. The problem is that the first specimen (blue bar in the image) has a huge error bar (over 12 FPKM). I'm trying to figure out why it is so large and how, if possible, to reduce it.

I've been trying many different TopHat parameters to see if anything changes, e.g. --b2-sensitive, --transcriptome-only, --no-novel-juncs, etc. Does anyone have suggestions for parameters that might help (scoring or alignment options)?

Here is my pipeline. Afterwards I use the R library cummeRbund to produce the expression bar plot.

bowtie2-build quiver.fa quiver

tophat -p 8 -G quiver.gtf -o tophat_out_1 quiver PK_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_2 quiver PK_flower_RNA.fq

tophat -p 8 -G quiver.gtf -o tophat_out_3 quiver PK_root_RNA.fq

cufflinks -p 8 -o cufflinks_out_1 tophat_out_1/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_2 tophat_out_2/accepted_hits.bam

cufflinks -p 8 -o cufflinks_out_3 tophat_out_3/accepted_hits.bam

cuffmerge -g quiver.gtf -s quiver.fa -p 8 assemblies.txt

cuffdiff -o diff_out -b quiver.fa -p 8 -L pk_1,pk_2,pk_3 -u merged_asm/merged.gtf tophat_out_1/accepted_hits.bam tophat_out_2/accepted_hits.bam tophat_out_3/accepted_hits.bam
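A note on the cuffmerge step: assemblies.txt is not shown in the pipeline. cuffmerge expects it to list one assembly GTF per line. Assuming Cufflinks wrote its default transcripts.gtf into each -o directory used above, the file could be created along these lines (a sketch, not necessarily how the original file was made):

```shell
# Assumption: each cufflinks run wrote its assembly to <output dir>/transcripts.gtf
printf '%s\n' \
  cufflinks_out_1/transcripts.gtf \
  cufflinks_out_2/transcripts.gtf \
  cufflinks_out_3/transcripts.gtf > assemblies.txt

# Show what cuffmerge will read
cat assemblies.txt
```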

Thanks for your help!

[Image: expression (FPKM) bar plot]

Tags: bowtie, rna-seq, tophat, R

Not the answer you are looking for, but you should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the advisable toolset for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: "Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown". (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment with Salmon.
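As a rough illustration of what the switch might look like for the three samples in the question, here is a sketch of the HISAT2/StringTie steps. Assumptions on my part: single-end reads (one .fq per sample, as in the original commands), samtools is installed, and all output file names are made up for the example. The `run` helper is only there so the script prints the commands by default instead of executing them.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Dry run by default: print each command. Set RUN=1 to actually execute.
run() {
  if [ "${RUN:-0}" = "1" ]; then "$@"; else printf '%s\n' "$*"; fi
}

samples="PK_RNA PK_flower_RNA PK_root_RNA"

# Build the HISAT2 index from the same reference used for bowtie2-build
run hisat2-build quiver.fa quiver

for s in $samples; do
  # --dta tailors alignments for downstream transcript assemblers
  run hisat2 -p 8 --dta -x quiver -U "$s.fq" -S "$s.sam"
  run samtools sort -@ 8 -o "$s.bam" "$s.sam"
  # Per-sample assembly guided by the reference annotation
  run stringtie -p 8 -G quiver.gtf -o "$s.gtf" "$s.bam"
done

# List of per-sample assemblies for the merge step (one path per line)
printf '%s\n' PK_RNA.gtf PK_flower_RNA.gtf PK_root_RNA.gtf > mergelist.txt
run stringtie --merge -p 8 -G quiver.gtf -o merged.gtf mergelist.txt

for s in $samples; do
  # -e: estimate only the merged transcripts; -B: write Ballgown input tables
  run stringtie -e -B -p 8 -G merged.gtf -o "ballgown/$s/$s.gtf" "$s.bam"
done
```

Run it once as-is to inspect the printed commands, then with RUN=1 once the paths match your data.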

Written 3.1 years ago by WouterDeCoster (45k)

In Cufflinks, you may be able to control this with the --max-bundle-frags command-line parameter. However, as Wouter mentions, TopHat/Cufflinks are a thing of the past and one should move to HISAT2/StringTie.
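For example, applied to the first Cufflinks command from the question, it might look like the sketch below. The value 5000000 is purely illustrative, not a recommendation. As far as I recall from the Cufflinks manual, the default is 1000000 fragments per locus, and loci above the limit are skipped.

```shell
# MAX_FRAGS is an illustrative value chosen for this example only
MAX_FRAGS=5000000

cmd="cufflinks -p 8 --max-bundle-frags $MAX_FRAGS -o cufflinks_out_1 tophat_out_1/accepted_hits.bam"

# Print the command for inspection; run it with: sh -c "$cmd"
echo "$cmd"
```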

In addition, FPKM generally does not handle very high counts well and can produce extreme fold-change values as a result. Nor does it normalise across samples, so statistical comparisons between samples based on FPKM-normalised counts are unreliable. Unless you're analysing just a single sample, I would recommend geometric normalisation, if that's available in StringTie.
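For reference, the usual definition of FPKM for a transcript i within one sample can be written as follows. The symbols are my own notation: X_i is the number of fragments assigned to transcript i, \ell_i is its length in bases, and N is the total number of mapped fragments in that sample.

```latex
\mathrm{FPKM}_i \;=\; \frac{10^{9}\, X_i}{\ell_i \, N}
```

Because N is computed separately for each sample, a handful of very highly expressed transcripts can inflate it and shift every other FPKM in that sample. That is one reason values from different samples are not directly comparable without a cross-sample normalisation step.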

Written 3.1 years ago by Kevin Blighe (69k)

Did you look at the extreme expression values before analysis?

Written 2.8 years ago by cpad0112 (14k)