Low mapping rate for Tophat2
6.5 years ago

Hi,

I'm somewhat new to bioinformatics, so please bear with me. I'm running tophat2 on some fastq files using hg38 as the reference. This is the command that I ran:

tophat2 --b2-sensitive -G /home/fadhil/hg38_ref/lib/hg38.refGene.gtf -p 16 -o /home/data/mcf10_tophat_output /home/fadhil/Bowtie2Index/genome ./SRR925720_mcf10a.fastq

It takes about 8 hours, but in the end the mapping rate is almost 0%: it maps 3997 out of 31898079 reads. I'm not sure I understand why this is happening, although tophat repeatedly emitted the following warning as it was running:

Warning: Encountered reference sequence with only gaps

Ignoring any potential errors with the fastq files themselves, what could possibly be the problem here?

RNA-Seq software error genome • 1.9k views

You should know that the old 'Tuxedo' pipeline of Tophat and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon.

6.5 years ago
Warning: Encountered reference sequence with only gaps

Have you checked whether your reference genome is correct? Locate the fasta file in

ls /home/fadhil/Bowtie2Index/genome

Let's say it's called "genome.fa". Then check if it looks good:

head /home/fadhil/Bowtie2Index/genome.fa
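Since `head` only shows the first few lines, a fuller check may be worth running. The sketch below lists every FASTA record whose sequence is empty or made up only of `N`, `n` or `-` characters, on the assumption that this is what the "only gaps" warning refers to. The function name `find_gap_only_records` is hypothetical and not part of tophat or bowtie2; it uses plain awk.

```shell
# Hypothetical helper: print the name of each FASTA record that contains
# no real bases (only N, n, "-", or nothing at all).
find_gap_only_records() {
    awk '
        function report() {
            # Flag the record just finished if no real base was seen in it.
            if (in_rec && has_base == 0) print name
        }
        /^>/ {
            report()
            name = substr($1, 2)   # header up to the first space, without ">"
            in_rec = 1
            has_base = 0
            next
        }
        {
            # Any character other than N, n or "-" counts as a real base.
            if ($0 ~ /[^Nn-]/) has_base = 1
        }
        END { report() }
    ' "$1"
}
```

It could be run as `find_gap_only_records /home/fadhil/Bowtie2Index/genome.fa`, where "genome.fa" is the placeholder name used above. A few flagged records may not explain a near-zero mapping rate on their own, but a long list, or a file where every record is flagged, would point to a bad reference.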

If it is OK, you can try to rebuild the index:

bowtie2-build /home/fadhil/Bowtie2Index/genome.fa /home/fadhil/Bowtie2Index/genome

Hope this helps.


The files in /home/fadhil/Bowtie2Index/genome are all .bt2 files and not fasta. I'm not too sure if this makes any difference.


Was the bowtie index built from fasta files? If not, where did you get the index?


I would suggest that you download the reference in fasta format and rebuild the index with the above command. Your reference and indexes seem to be corrupted.


In the end, rebuilding the indices helped a bit, but I was still getting a low mapping rate. It turns out the fastq files were corrupted, and re-downloading them helped. Thanks for everyone's help!
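For anyone hitting the same problem, here is a minimal sketch of a structural check that might catch a truncated or damaged fastq file before an 8-hour run. It assumes plain, uncompressed fastq with exactly four lines per read (no wrapped sequences); `check_fastq` is a hypothetical helper name, and it stops at the first problem it finds.

```shell
# Hypothetical helper: verify the 4-line record structure of a fastq file.
# Prints "OK: <n> reads" and returns 0, or prints the first problem and returns 1.
check_fastq() {
    awk '
        # Line 1 of each record: header must start with "@".
        NR % 4 == 1 && !/^@/ { printf "line %d: header does not start with @\n", NR; bad = 1; exit }
        # Line 2: remember the sequence length.
        NR % 4 == 2 { seq_len = length($0) }
        # Line 3: separator must start with "+".
        NR % 4 == 3 && !/^\+/ { printf "line %d: separator does not start with +\n", NR; bad = 1; exit }
        # Line 4: quality string must be as long as the sequence.
        NR % 4 == 0 && length($0) != seq_len { printf "line %d: quality length differs from sequence length\n", NR; bad = 1; exit }
        END {
            if (bad) exit 1
            if (NR % 4 != 0) { printf "file ends mid-record after %d lines\n", NR; exit 1 }
            printf "OK: %d reads\n", NR / 4
        }
    ' "$1"
}
```

Usage would be `check_fastq ./SRR925720_mcf10a.fastq`. Passing this check only means the file is well-formed; it does not prove the download is complete, so comparing a checksum or the read count against the source is still worthwhile where one is published.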
