Tophat fusions no results
7.9 years ago
thomas.smith2 ▴ 120

Hi all,

I'm having what seems to be a common problem: tophat-fusion-post reports no fusions. There are a few threads discussing this, but none seem relevant to my problem.

Looking For Reasons Of Why The Results Of A Tophat Fusion Post Is Empty

From the other threads, it's clear that the directory structure matters: the sample directories must be named "tophat_[sample_name#1]", and the blast directory should be called "blast", not "blast_human" as indicated in the manual.
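A quick way to sanity-check that layout is a small script along these lines. This is only a sketch, not part of TopHat: `check_layout` is a made-up helper, and it assumes only what is described in this post (sample directories prefixed with `tophat_`, each holding the `fusions.out` that TopHat2 writes, and a directory named `blast`).

```python
import os


def check_layout(run_dir="."):
    """Return a list of problems found in run_dir (hypothetical helper)."""
    problems = []
    entries = sorted(os.listdir(run_dir))

    # Sample directories are expected to be named tophat_<sample_name>.
    samples = [
        e for e in entries
        if e.startswith("tophat_") and os.path.isdir(os.path.join(run_dir, e))
    ]
    if not samples:
        problems.append("no sample directories named tophat_<sample_name> found")

    # Each sample directory should contain the fusions.out written by TopHat2.
    for sample in samples:
        if not os.path.isfile(os.path.join(run_dir, sample, "fusions.out")):
            problems.append(f"{sample}: fusions.out is missing")

    # The blast directory must be called exactly "blast".
    if not os.path.isdir(os.path.join(run_dir, "blast")):
        problems.append("blast directory is missing (it must be named 'blast')")

    return problems


if __name__ == "__main__":
    for problem in check_layout():
        print("WARNING:", problem)
```

Run from the directory where tophat-fusion-post is launched, it prints one warning per problem and nothing if the basic layout looks right. It does not check the contents of the blast directory.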

Looking through the code, I see these lines under the read_fusion_genes function:

for sample_name in sample_names:
        sample_isoform_filename = "tophat_" + sample_name + "/transfuse.txt"
        if not os.path.exists(sample_isoform_filename):

So tophat-fusion-post seems to be expecting a "transfuse.txt" file in the sample directory. Here's my sample directory which I copied over from running Tophat2 --fusion-search ... :

-rw-rw-r-- 1 toms projects 332K May 14 19:06 insertions.bed
-rw-rw-r-- 1 toms projects  13M May 14 19:06 junctions.bed
-rw-rw-r-- 1 toms projects 142M May 14 19:06 unmapped.bam
-rw-rw-r-- 1 toms projects  569 May 14 19:06 align_summary.txt
-rw-rw-r-- 1 toms projects 6.0G May 14 19:07 accepted_hits.bam
-rw-rw-r-- 1 toms projects  184 May 14 19:07
-rw-rw-r-- 1 toms projects 353K May 14 19:07 deletions.bed
drwxrwsr-x 2 toms projects  894 May 14 19:07 logs/
-rw-rw-r-- 1 toms projects  48M May 14 19:07 fusions.out

No transfuse.txt file!

Has anyone run tophat-fusion-post successfully after Tophat2 (v. 2.0.13)? Could you show me the contents of your sample directory?

Thanks in advance.


tophat-fusion-post -p 2 --num-fusion-reads 1 --num-fusion-pairs 2 --num-fusion-both 5 bowtie_indexes/hg38


[Fri May 15 11:27:04 2015] Beginning TopHat-Fusion post-processing run (v2.0.13)
[Fri May 15 11:27:04 2015] Extracting 23-mer around fusions and mapping them using Bowtie
[Fri May 15 11:27:51 2015] Filtering fusions
    Processing: tophat_sample/fusions.out
    0 fusions are output in ./tophatfusion_out/potential_fusion.txt
[Fri May 15 11:27:57 2015] Blasting 50-mers around fusions
[Fri May 15 11:27:57 2015] Generating read distributions around fusions
[Fri May 15 11:27:57 2015] Reporting final fusion candidates in html format
    num of fusions: 0
[Fri May 15 11:27:57 2015] Run complete [00:00:53 elapsed]


drwxrwsr-x 3 toms projects  278 May 14 19:07 tophat_sample/
-rwxrwxr-x 1 toms projects  38M May 14 19:11 ensGene.txt*
-rwxrwxr-x 1 toms projects 7.2M May 14 19:12 ensGtp.txt*
-rwxrwxr-x 1 toms projects 398K May 14 19:12 mcl*
-rwxrwxr-x 1 toms projects  11M May 14 19:22 refGene_sorted.txt*
drwxrwsr-x 2 toms projects 1.2K May 14 20:35 blast/
drwxrwsr-x 2 toms projects  262 May 15 11:08 bowtie_indexes/
drwxrwsr-x 7 toms projects  359 May 15 11:16 tophatfusion_out/

For anyone who's interested, it looks like the problem stemmed from not having all the necessary blast databases in the blast directory - it was nothing to do with the "transfuse.txt" file. I'd failed to follow the instructions in the manual to include three blast databases:

As an aside, the tophat-fusion-post script could really do with a rewrite so that it fails gracefully. There are a lot of "if ...: continue" statements that can cause the script to output no fusions without the user receiving any error message to say that a file is missing, etc.
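To illustrate what I mean by failing loudly, here is a rough sketch of the alternative to a silent `continue`. It is not code from tophat-fusion-post: `require_files` is a hypothetical helper, and the paths to check (for example the blast database files named in the manual) are passed in by the user rather than hard-coded.

```python
import os
import sys


def require_files(paths):
    """Raise FileNotFoundError naming every missing path instead of skipping it."""
    missing = [p for p in paths if not os.path.exists(p)]
    if missing:
        raise FileNotFoundError("missing required input(s): " + ", ".join(missing))


if __name__ == "__main__":
    # Paths to check are given on the command line; nothing is assumed
    # here about what the required files are called.
    try:
        require_files(sys.argv[1:])
    except FileNotFoundError as err:
        sys.exit(f"ERROR: {err}")
    print("all required inputs found")
```

Collecting all missing paths before raising means a single run reports every absent file, rather than stopping at the first one.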

