Question

Bowtie2 low mapping percentage

8.7 years ago

prp291 ▴ 70

I know this type of question has been asked before in this and other forums, but I am facing a slightly different situation.

I have 50 bp single-end (SE) reads from a recently sequenced plant genome and want to use them for differential expression analysis. I am using Bowtie2 for mapping.

I used this command for the Bowtie2 mapping:

bowtie2 --local -D 15 -R 2 -L 22 -N 1 -x ~/bin/bowtie-index/bowtie2/plant_genome input1.fq -S genome.sam

First, I mapped my reads to the genome and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1653410 (13.78%) aligned 0 times
    3778924 (31.50%) aligned exactly 1 time
    6564702 (54.72%) aligned >1 times
86.22% overall alignment rate

Then I used the transcript sequences provided by the group that recently sequenced this plant genome and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    7322107 (61.03%) aligned 0 times
    3739726 (31.17%) aligned exactly 1 time
    935203 (7.80%) aligned >1 times
38.97% overall alignment rate

Finally, I used the previously reported contigs (assembled from RNA-seq data) and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1386832 (11.56%) aligned 0 times
    3364705 (28.05%) aligned exactly 1 time
    7245499 (60.39%) aligned >1 times
88.44% overall alignment rate
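
To make the three runs easier to compare side by side, here is a minimal Python sketch that recomputes the unique, multi-mapped, and overall rates from the counts in the summaries above. The counts are copied from the Bowtie2 output; the dictionary labels and the function name `alignment_rates` are made up for illustration.

```python
# Counts copied from the three Bowtie2 summaries above, as
# (aligned 0 times, aligned exactly 1 time, aligned >1 times).
# The labels are illustrative names, not Bowtie2 output.
SUMMARIES = {
    "genome": (1653410, 3778924, 6564702),
    "predicted transcripts": (7322107, 3739726, 935203),
    "RNA-seq contigs": (1386832, 3364705, 7245499),
}


def alignment_rates(unaligned, unique, multi):
    """Return (unique %, multi-mapped %, overall %) for one unpaired-read run."""
    total = unaligned + unique + multi
    return (
        100 * unique / total,
        100 * multi / total,
        100 * (unique + multi) / total,  # overall = everything that aligned
    )


for name, counts in SUMMARIES.items():
    uniq, multi, overall = alignment_rates(*counts)
    print(f"{name:22s} unique {uniq:6.2f}%  multi {multi:6.2f}%  overall {overall:6.2f}%")
```

Laid out this way, the uniquely aligned fraction is similar across all three references (roughly 28 to 32%), while the multi-mapped fraction drops from 54.72% on the genome to 7.80% on the predicted transcripts, which accounts for most of the difference in the overall rate.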

Can anyone suggest the probable reason why the alignment percentage is so low when I use the transcripts originally annotated from the genome? How can I increase the alignment percentage?

Note:

Reads are demultiplexed and have been filtered for adapters and low-quality reads
FastQC report can be found HERE

next-gen bowtie RNA-seq • 3.8k views

ADD COMMENT • link updated 19 months ago by Ram 43k • written 8.7 years ago by prp291 ▴ 70

score 0 · Answer 1 · 2015-08-21

8.7 years ago

JC 13k

A simple explanation: the transcripts predicted from the genome are poor predictions, so the annotation is missing a lot of genes/transcripts, and reads from those cannot be detected in your mapping. Compare how much sequence is shared between the predicted transcripts and the RNA-seq assembled contigs.
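
One rough way to do that comparison is to check what fraction of the k-mers in one FASTA file also occur in the other. The sketch below is only an illustration under stated assumptions: the file names in the usage comment are hypothetical, `k=25` is an arbitrary choice, and every helper name is made up here. It holds all k-mers in memory, so for a full plant transcriptome a dedicated tool or an all-vs-all alignment would likely be more practical.

```python
# Translation table for reverse-complementing A/C/G/T sequences.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file, sequences upper-cased."""
    name, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:].strip(), []
            elif line:
                chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def canonical_kmers(seq, k):
    """Return the set of strand-independent k-mers in seq (ambiguous bases skipped)."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            revcomp = kmer.translate(_COMPLEMENT)[::-1]
            kmers.add(min(kmer, revcomp))  # same key for both strands
    return kmers


def shared_fraction(seqs_a, seqs_b, k=25):
    """Fraction of the distinct k-mers in seqs_a that also occur in seqs_b."""
    kmers_a = set().union(*(canonical_kmers(s, k) for s in seqs_a))
    kmers_b = set().union(*(canonical_kmers(s, k) for s in seqs_b))
    if not kmers_a:
        return 0.0
    return len(kmers_a & kmers_b) / len(kmers_a)


# Usage with hypothetical file names:
# transcripts = [seq for _, seq in read_fasta("predicted_transcripts.fa")]
# contigs = [seq for _, seq in read_fasta("rnaseq_contigs.fa")]
# print(shared_fraction(contigs, transcripts))  # contig k-mers covered by the annotation
```

A low value for contigs against predicted transcripts would be consistent with the annotation missing expressed sequence, though it would not by itself show that the predictions are wrong.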

ADD COMMENT • link 8.7 years ago by JC 13k

Thanks for your response. Yes, I assume so, but when I map other libraries downloaded from NCBI SRA, the mapping percentage is above 60%. That is my concern.

ADD REPLY • link 8.7 years ago by prp291 ▴ 70