TopHat alignment on Trinity Transcripts giving a lot of unmapped results.
8.2 years ago
kanika.151 ▴ 130

I ran a de novo assembly on my data using Trinity, then aligned the reads back to the assembled .fasta file with TopHat, without supplying an annotation file. The unmapped.bam files are larger than I expected, ranging from 100 MB to 600 MB across conditions. I have 6 different conditions, and the data are paired-end.

My question: is it normal to get such a high number of unmapped reads?

Here is one of the align_summary.txt files:

Left reads:
          Input     :  12382431
           Mapped   :  11331326 (91.5% of input)
            of these:  10276265 (90.7%) have multiple alignments (312919 have >20)
Right reads:
          Input     :  12382431
           Mapped   :  11346906 (91.6% of input)
            of these:  10290863 (90.7%) have multiple alignments (312928 have >20)
91.6% overall read mapping rate.

Aligned pairs:  11146003
     of these:  10125490 (90.8%) have multiple alignments
                   65963 ( 0.6%) are discordant alignments
89.5% concordant pair alignment rate.

Should I be concerned?

Unmapped tophat Trinity • 2.1k views

You should not look at the file size. Check what percentage of reads are unmapped instead. According to your align_summary.txt, about 91% of the reads mapped back to the assembled transcriptome, so only about 8-9% are unmapped.

Note: you are aligning the data to a transcriptome, which may contain multiple assembled transcripts for the same gene (redundancy), so you get more multi-mapped reads.
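
To make the "look at percentages, not file size" point concrete, here is a minimal Python sketch that pulls the per-mate counts out of an align_summary.txt and reports the unmapped and multi-mapped percentages. The function name `summarize_alignment` and the parsing approach are my own assumptions based on the layout of the summary pasted in the question; it is not part of TopHat, and other TopHat versions may format the file differently.

```python
import re

def summarize_alignment(summary_text):
    """Return unmapped and multi-mapped percentages for left/right reads.

    Assumes the align_summary.txt layout shown in the question.
    """
    stats = {}
    mate = None
    for line in summary_text.splitlines():
        line = line.strip()
        if line.startswith("Left reads"):
            mate = "left"
        elif line.startswith("Right reads"):
            mate = "right"
        elif line.startswith("Aligned pairs"):
            mate = None  # the pair section is ignored in this sketch
        if mate is None:
            continue
        m = re.match(r"(Input|Mapped|of these)\s*:\s*(\d+)", line)
        if m:
            stats.setdefault(mate, {})[m.group(1)] = int(m.group(2))

    report = {}
    for mate, s in stats.items():
        report[mate] = {
            # reads that did not align, as a share of all input reads
            "unmapped_pct": 100.0 * (s["Input"] - s["Mapped"]) / s["Input"],
            # mapped reads with more than one alignment
            "multi_pct": 100.0 * s["of these"] / s["Mapped"],
        }
    return report

# Counts copied from the summary in the question.
EXAMPLE = """Left reads:
          Input     :  12382431
           Mapped   :  11331326 (91.5% of input)
            of these:  10276265 (90.7%) have multiple alignments (312919 have >20)
Right reads:
          Input     :  12382431
           Mapped   :  11346906 (91.6% of input)
            of these:  10290863 (90.7%) have multiple alignments (312928 have >20)
91.6% overall read mapping rate.
"""

report = summarize_alignment(EXAMPLE)
for mate, values in report.items():
    print(mate, round(values["unmapped_pct"], 1), round(values["multi_pct"], 1))
```

Running this over the summary from each of the 6 conditions gives numbers that can be compared directly, which the size of unmapped.bam cannot.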


The Trinity assembly results in a factor of 3 in my case. I was expecting some of the reads to be unmapped, but 15-20% of the data not aligning raised some flags.


There could be better ways, but I would just BLAST a few of the unmapped reads and see what they are.
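
One possible way to get a few reads out for BLAST is sketched below in Python. It assumes you have first converted the BAM to SAM text, for example with `samtools view unmapped.bam > unmapped.sam`; the file name `unmapped.sam` and the function name `sam_to_fasta_sample` are hypothetical, chosen for illustration. The function picks a small random sample with reservoir sampling, so the whole file never has to sit in memory, and writes it as FASTA that can be pasted into BLAST.

```python
import random

def sam_to_fasta_sample(sam_lines, n=5, seed=0):
    """Pick up to n random reads from SAM text lines and return them as FASTA."""
    rng = random.Random(seed)
    sample = []
    seen = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if seq == "*":
            continue  # no sequence stored for this record
        # Tag mates so both reads of a pair get distinct FASTA names.
        if flag & 0x40:
            name += "/1"
        elif flag & 0x80:
            name += "/2"
        seen += 1
        if len(sample) < n:
            sample.append((name, seq))
        else:
            # Reservoir sampling: keep each read with probability n / seen.
            j = rng.randrange(seen)
            if j < n:
                sample[j] = (name, seq)
    return "".join(f">{name}\n{seq}\n" for name, seq in sample)
```

With the assumed file in place, something like `print(sam_to_fasta_sample(open("unmapped.sam"), n=10))` would produce ten sequences to BLAST. If most hits turn out to be rRNA, adapters, or a contaminant species, that would explain the unmapped fraction.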
