Bowtie2 low mapping percentage
8.7 years ago
prp291 ▴ 70

I know this type of question has been asked before in this and other forums, but my situation is a little different.

I have 50 bp single-end (SE) reads from a plant whose genome was recently sequenced, and I want to use them for differential expression analysis. I am using bowtie2 for mapping.

I used this command for the bowtie2 mapping:

bowtie2 --local -D 15 -R 2 -L 22 -N 1 -x ~/bin/bowtie-index/bowtie2/plant_genome input1.fq -S genome.sam

First, I mapped my reads to the genome and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1653410 (13.78%) aligned 0 times
    3778924 (31.50%) aligned exactly 1 time
    6564702 (54.72%) aligned >1 times
86.22% overall alignment rate

Then I used the transcript sequences provided by the group that recently sequenced this plant genome, and I got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    7322107 (61.03%) aligned 0 times
    3739726 (31.17%) aligned exactly 1 time
    935203 (7.80%) aligned >1 times
38.97% overall alignment rate

Finally, I used previously reported contigs (assembled from RNA-seq data) and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1386832 (11.56%) aligned 0 times
    3364705 (28.05%) aligned exactly 1 time
    7245499 (60.39%) aligned >1 times
88.44% overall alignment rate

Can anyone suggest the probable reason why the alignment percentage is so low when I use the transcripts annotated from the genome? How can I increase the alignment percentage?
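
To make the comparison between the three runs easier to see, here is a minimal Python sketch that parses the bowtie2 summaries pasted above and recomputes the overall rate as (unique + multi) / total. The function names `parse_bowtie2_summary` and `overall_rate` are made up for illustration and are not part of bowtie2; the regular expressions assume the single-end summary layout shown in this post.

```python
import re

def parse_bowtie2_summary(text):
    """Pull read counts out of a bowtie2 single-end alignment summary."""
    def grab(pattern):
        return int(re.search(pattern, text).group(1))
    return {
        "total": grab(r"(\d+) reads; of these:"),
        "unaligned": grab(r"(\d+) \([\d.]+%\) aligned 0 times"),
        "unique": grab(r"(\d+) \([\d.]+%\) aligned exactly 1 time"),
        "multi": grab(r"(\d+) \([\d.]+%\) aligned >1 times"),
    }

def overall_rate(counts):
    """Overall alignment rate in percent: reads aligned at least once / total."""
    return 100.0 * (counts["unique"] + counts["multi"]) / counts["total"]

# Summaries copied from the three runs in this post.
summaries = {
    "genome": """11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1653410 (13.78%) aligned 0 times
    3778924 (31.50%) aligned exactly 1 time
    6564702 (54.72%) aligned >1 times
86.22% overall alignment rate""",
    "transcripts": """11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    7322107 (61.03%) aligned 0 times
    3739726 (31.17%) aligned exactly 1 time
    935203 (7.80%) aligned >1 times
38.97% overall alignment rate""",
    "contigs": """11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1386832 (11.56%) aligned 0 times
    3364705 (28.05%) aligned exactly 1 time
    7245499 (60.39%) aligned >1 times
88.44% overall alignment rate""",
}

counts = {name: parse_bowtie2_summary(text) for name, text in summaries.items()}
for name, c in counts.items():
    print(f"{name}: {overall_rate(c):.2f}% aligned, "
          f"{c['unique']} unique, {c['multi']} multi-mapped")

# How the transcript run differs from the genome run, in read counts.
extra_unaligned = counts["transcripts"]["unaligned"] - counts["genome"]["unaligned"]
lost_multi = counts["genome"]["multi"] - counts["transcripts"]["multi"]
print("extra unaligned vs genome:", extra_unaligned)
print("fewer multi-mapped vs genome:", lost_multi)
```

The uniquely aligned count is nearly the same for the genome and the transcripts (about 3.7 million reads), while the increase in unaligned reads is close in size to the decrease in multi-mapped reads. The summaries alone cannot show that these are the same reads, so this is only a hint about where to look.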

Note:

  • Reads are demultiplexed and filtered to remove adapters and low-quality reads
  • The FastQC report can be found HERE
next-gen bowtie RNA-seq • 3.8k views
8.7 years ago
JC 13k

A simple explanation: the transcripts predicted from the genome are poor predictions, so the annotation is missing a lot of genes/transcripts, and reads from those fail to map. Compare how many sequences are shared between the predicted transcripts and the RNA-seq assembled contigs.
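
One rough way to do that comparison, sketched in Python under stated assumptions: this is a crude k-mer containment check, not an alignment-based comparison (a real analysis would more likely align the contigs to the transcripts). All function names, the toy sequences, `k=5`, and the 50% threshold are hypothetical choices for illustration; real data would need a much larger k and sequences read from the actual FASTA files.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a {name: sequence} dict."""
    records, name = {}, None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line.upper())
    return {n: "".join(parts) for n, parts in records.items()}

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement, since contigs may be on either strand."""
    return seq.translate(COMPLEMENT)[::-1]

def kmers(seq, k):
    """Set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_index(seqs, k):
    """All k-mers (both strands) present in a set of sequences."""
    index = set()
    for seq in seqs.values():
        index |= kmers(seq, k)
        index |= kmers(revcomp(seq), k)
    return index

def covered_fraction(query_seqs, index, k, min_fraction=0.5):
    """Fraction of query sequences with >= min_fraction of their k-mers in index."""
    if not query_seqs:
        return 0.0
    hits = 0
    for seq in query_seqs.values():
        ks = kmers(seq, k)
        if ks and len(ks & index) / len(ks) >= min_fraction:
            hits += 1
    return hits / len(query_seqs)

# Toy data (made up): c1 matches t1, c2 is its reverse complement, c3 is unrelated.
transcripts = read_fasta(">t1\nACGTTGCAAGGCTA\n")
contigs = read_fasta(">c1\nACGTTGCAAGGCTA\n>c2\nTAGCCTTGCAACGT\n>c3\nGGGGGCCCCCTTTTT\n")

k = 5
index = kmer_index(transcripts, k)
fraction = covered_fraction(contigs, index, k)
print(f"{fraction:.2f} of contigs are represented in the predicted transcripts")
```

If only a small fraction of the RNA-seq contigs is represented in the predicted transcripts, that would be consistent with an incomplete annotation.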

Thanks for your response. Yes, I assume so, but when I map other libraries downloaded from NCBI SRA, the mapping percentage is above 60%. That is my concern.
