Question: Trouble using TopHat (bowtie index genome.*.bt2l)
3.0 years ago by
manuelmendoza50 wrote:


I'm just starting to use TopHat and I have a small problem that I am not able to solve. I want to align several human transcriptomes, so I have downloaded the human reference genome and now I want to use the Tuxedo protocol.

Executing the following commands:

1st. Uncompress the genome:

tar xvfz Homo_sapiens_NCBI_GRCh38.tar.gz

2nd. Make a working directory:

mkdir Alignments

3rd. Create symbolic links to annotation files and bowtie index (inside the working directory):

ln -s /path_to/Hsa38/Annotation/Archives/archive-2015-08-11-09-31-31/Genes/genes.gtf
ln -s /path_to/Hsa38/Sequence/Bowtie2Index/genome.*.

4th. Try to run tophat (inside the working directory):

tophat -p 8 -G genes.gtf -o sample_output --library-type=fr-firststrand genome sample.fq

The output message was the following one:

[2017-11-07 00:29:47] Beginning TopHat run (v2.1.1)
[2017-11-07 00:29:47] Checking for Bowtie
          Bowtie version:
[2017-11-07 00:29:47] Checking for Bowtie index files (genome)..
Error: Could not find Bowtie 2 index files (genome.*.bt2l)

After that I tried to find any file with the extension .bt2l: find path_to/Hsa38/ -iname *bt2l but I had no success. Does anyone know where the index is, or how I can solve this problem?

Thanks in advance.

modified 3.0 years ago by Kevin Blighe66k • written 3.0 years ago by manuelmendoza50

You should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the advisable toolset for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR and bbmap, or pseudo-alignment using salmon.

written 3.0 years ago by WouterDeCoster44k

Thanks so much! I've just got the publication =D

written 3.0 years ago by manuelmendoza50

Just look in the actual folder and see what the files are named so you can adjust the symlinks as necessary. Should genome.*. be genome.*? It's not going to recognize the wildcard as is.
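
To illustrate the wildcard point, here is a minimal sketch that uses empty dummy files in a temporary directory (the directory and the helper count_matches are stand-ins made up for this example, not the real iGenomes paths). It assumes the index files end in .bt2, and compares what the two patterns match:

```shell
# Hypothetical stand-in for .../Sequence/Bowtie2Index, filled with empty files.
index_dir="$(mktemp -d)"
for part in 1 2 3 4 rev.1 rev.2; do
    : > "$index_dir/genome.$part.bt2"
done

# Count how many of the arguments exist as files. An unmatched glob is passed
# through literally by the shell, does not exist, and so is not counted.
count_matches() {
    n=0
    for f in "$@"; do
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

with_dot=$(count_matches "$index_dir"/genome.*.)   # pattern with trailing dot
without_dot=$(count_matches "$index_dir"/genome.*) # pattern without it

echo "genome.*. matches $with_dot file(s)"
echo "genome.*  matches $without_dot file(s)"
```

With these dummy files the first pattern matches nothing, because no file name ends in a dot, while the second matches all six index files.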

written 3.0 years ago by jared.andrews077.5k

I have just tried it, but it does not work:

ln -s /path_to/Hsa38/Sequence/Bowtie2Index/genome.*
ln: target '/path_to/Hsa38/Sequence/Bowtie2Index/genome.rev.2.bt2' is not a directory
written 3.0 years ago by manuelmendoza50

Create the symbolic link to the directory, not the file prefix:

ln -s /path_to/Hsa38/Sequence/Bowtie2Index/ symlinkGenome

Then, execute tophat with:

tophat -p 8 -G genes.gtf -o sample_output --library-type=fr-firststrand symlinkGenome/genome sample.fq
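
For context on the ln error reported above: the shell expands genome.* into several file names before ln runs, and when ln receives more than two arguments it treats the last one as the destination directory, which here is the file genome.rev.2.bt2. The sketch below uses empty dummy files in temporary directories (stand-ins for the real paths, not the actual iGenomes layout) to show two ways that should avoid this: linking the directory, or giving an explicit destination after the wildcard.

```shell
# Dummy stand-ins for the real locations (hypothetical, empty files).
index_dir="$(mktemp -d)"   # plays the role of .../Sequence/Bowtie2Index
work_dir="$(mktemp -d)"    # plays the role of the Alignments directory
for part in 1 2 3 4 rev.1 rev.2; do
    : > "$index_dir/genome.$part.bt2"
done
cd "$work_dir"

# Option 1: link the whole index directory, then pass symlinkGenome/genome
# as the index prefix.
ln -s "$index_dir" symlinkGenome

# Option 2: link every index file, with "." as the explicit destination
# directory so ln does not mistake the last expanded file name for it.
ln -s "$index_dir"/genome.* .

ls -l
```

With option 2 the index prefix stays plain genome, as in the original tophat command; with option 1 it becomes symlinkGenome/genome.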

Also, as my colleague Wouter has stated, TopHat/TopHat2 is 'retired' and HISAT/HISAT2 is the upgraded version.


written 3.0 years ago by Kevin Blighe66k