Question

Different mapping results between HISAT and TopHat

0

3.8 years ago

concetta ▴ 10

Hi all!

I am performing a genome-guided transcriptome assembly with StringTie.

Before the assembly, I mapped the paired-end reads of the same sample with HISAT2 and with TopHat2, and I ran the transcriptome assembly separately on each mapping.

In my transcriptome assemblies I noticed that in one region StringTie assembled only one transcript from the HISAT-mapped reads, but two transcripts from the TopHat-mapped reads. When I checked this region, I saw that HISAT mapped both mates of the fragments, and some mates are spliced alignments that support the reconstructed isoform. In the same region TopHat mapped only one mate of the fragment and left the other mate unmapped, so the spliced mate is missing and I obtained two different isoforms instead of the single isoform I get with HISAT.

I then repeated the analysis with a subset of the reads from the same sample. This time TopHat aligned both mates of the same fragment in that region, and I obtained one isoform, as in the first mapping with HISAT.

I was wondering why TopHat did not map one mate of the fragment when HISAT is able to map both mates. In addition, how can it be explained that TopHat maps both mates, as HISAT does, when I run the analysis on a subset of the reads?

Thank you,

Concetta

RNA-Seq alignment • 1.2k views

3.8 years ago by concetta ▴ 10

1

Look, these kinds of comparisons are not trivial, and I actually recommend not racking your brain over deprecated tools like TopHat. TopHat is old, and so are TopHat2 and HISAT. HISAT2 and STAR are currently considered the go-to tools for RNA-seq mapping. BBMap and Rsubread are probably fine as well. Don't waste your time on old stuff.

3.8 years ago by ATpoint 81k

0

Thank you for your answer. I am trying to understand the TopHat results because some years ago I performed the transcriptome assembly with TopHat + StringTie. Now I have analysed the same data with HISAT + StringTie and I noticed that the results change.

However, the strangest thing I noticed, as I mentioned in the previous post, is that TopHat gives different mapping results with the entire FASTQ file than with a subset of reads from that file. Do you have any explanation for this TopHat behavior?

3.8 years ago by concetta ▴ 10
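
For anyone wanting to put numbers on the observation described in this thread (one mate mapped and the other unmapped in one run, both mates mapped in another), here is a minimal sketch. It is not taken from the thread. It tallies pair status from the SAM FLAG field and counts spliced alignments from the CIGAR string. The flag bits and the `N` CIGAR operation come from the SAM specification; the function names `classify` and `summarize` and the four example records are made up for illustration.

```python
from collections import Counter

# SAM FLAG bits, as defined in the SAM specification
PAIRED = 0x1
UNMAPPED = 0x4
MATE_UNMAPPED = 0x8
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify(sam_line):
    """Return (pair_status, is_spliced) for one SAM alignment line."""
    fields = sam_line.rstrip("\n").split("\t")
    flag, cigar = int(fields[1]), fields[5]
    if not flag & PAIRED:
        status = "unpaired"
    elif flag & UNMAPPED:
        status = "unmapped"
    elif flag & MATE_UNMAPPED:
        status = "mate_unmapped"
    else:
        status = "both_mapped"
    # An 'N' operation in the CIGAR is a skipped reference region (an intron)
    return status, "N" in cigar


def summarize(sam_lines):
    """Count primary records per pair status, plus spliced alignments."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        flag = int(line.split("\t")[1])
        if flag & (SECONDARY | SUPPLEMENTARY):  # keep primary records only
            continue
        status, spliced = classify(line)
        counts[status] += 1
        if spliced:
            counts["spliced"] += 1
    return counts


# Made-up records: r1 has both mates mapped (one of them spliced),
# r2 has one mapped mate whose partner is unmapped.
EXAMPLE = [
    "r1\t99\tchr1\t100\t60\t50M200N50M\t=\t400\t400\t*\t*",
    "r1\t147\tchr1\t400\t60\t100M\t=\t100\t-400\t*\t*",
    "r2\t73\tchr1\t150\t60\t100M\t=\t150\t0\t*\t*",
    "r2\t133\tchr1\t150\t0\t*\t=\t150\t0\t*\t*",
]

if __name__ == "__main__":
    print(dict(summarize(EXAMPLE)))
```

In practice the lines would come from the region of interest in each alignment, for example the text output of `samtools view` on an indexed BAM with a region argument. Running the same tally on the full-file mapping and on the subset mapping would show whether the `mate_unmapped` count really differs between the two runs in that region.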