Question: Bowtie2 low mapping percentage
prp291 (United States) wrote, 4.0 years ago:

I know this type of question has been asked before in this and other forums, but I am facing a slightly different situation here.

I have 50 bp single-end (SE) reads from a plant whose genome was recently sequenced, and I want to use them for differential expression analysis. I am using Bowtie2 for mapping.

I used this command for the Bowtie2 mapping:

bowtie2 --local -D 15 -R 2 -L 22 -N 1 -x ~/bin/bowtie-index/bowtie2/plant_genome input1.fq -S genome.sam

 

First, I mapped my reads to the genome and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1653410 (13.78%) aligned 0 times
    3778924 (31.50%) aligned exactly 1 time
    6564702 (54.72%) aligned >1 times
86.22% overall alignment rate

Then I mapped to the transcript sequences provided by the groups that recently sequenced this plant genome, and I got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    7322107 (61.03%) aligned 0 times
    3739726 (31.17%) aligned exactly 1 time
    935203 (7.80%) aligned >1 times
38.97% overall alignment rate

Finally, I mapped to the previously reported contigs (assembled from RNA-seq data) and got this:

11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1386832 (11.56%) aligned 0 times
    3364705 (28.05%) aligned exactly 1 time
    7245499 (60.39%) aligned >1 times
88.44% overall alignment rate
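
The overall rate in each summary is simply (reads aligned exactly once + reads aligned more than once) / total reads, so the three runs can be compared side by side with a short script. Below is a minimal Python sketch, assuming the summary text was saved from Bowtie2's stderr; the helper names `parse_bowtie2_summary` and `rates` are hypothetical and not part of Bowtie2.

```python
import re

def parse_bowtie2_summary(text):
    """Extract read counts from a Bowtie2 single-end alignment summary."""
    patterns = {
        "total": r"^\s*(\d+) reads; of these:",
        "unaligned": r"^\s*(\d+) \([\d.]+%\) aligned 0 times",
        "unique": r"^\s*(\d+) \([\d.]+%\) aligned exactly 1 time",
        "multi": r"^\s*(\d+) \([\d.]+%\) aligned >1 times",
    }
    counts = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text, flags=re.MULTILINE)
        if match is None:
            raise ValueError(f"could not find the '{key}' line in the summary")
        counts[key] = int(match.group(1))
    return counts

def rates(counts):
    """Return overall, unique and multi-mapping rates as percentages of all reads."""
    total = counts["total"]
    return {
        "overall": 100.0 * (counts["unique"] + counts["multi"]) / total,
        "unique": 100.0 * counts["unique"] / total,
        "multi": 100.0 * counts["multi"] / total,
    }

# Summary of the genome run, copied from the post.
genome_summary = """11997036 reads; of these:
  11997036 (100.00%) were unpaired; of these:
    1653410 (13.78%) aligned 0 times
    3778924 (31.50%) aligned exactly 1 time
    6564702 (54.72%) aligned >1 times
86.22% overall alignment rate"""

counts = parse_bowtie2_summary(genome_summary)
for name, value in rates(counts).items():
    print(f"{name}: {value:.2f}%")
```

Splitting the overall rate into unique and multi-mapping parts makes the contrast visible: the unique fraction is close to 30% in all three runs, while the multi-mapping fraction is what collapses for the annotated transcripts.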

 

Can anyone suggest why the alignment percentage is so low when I map to the transcripts originally annotated from the genome? How can I increase the alignment percentage?

Note:

  • Reads were demultiplexed and filtered to remove adapters and low-quality reads
  • The FastQC report can be found HERE

 

 

 

 

JC (Mexico) wrote, 4.0 years ago:

A simple explanation: the transcripts predicted from the genome are poor predictions, so many genes/transcripts are missing from the annotation and the reads that come from them fail to map. Compare how many sequences are shared between the predicted transcripts and the RNA-seq assembled contigs.
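
One rough way to run that comparison is to measure, for each predicted transcript, what fraction of its k-mers also occur in the assembled contigs. The Python sketch below is only a crude proxy under stated assumptions: an aligner such as BLAST would be the more usual tool, the function names are hypothetical, and the toy sequences stand in for the real FASTA files, which would be opened and passed to `read_fasta` instead. A realistic k would be much larger than the 5 used here.

```python
import io

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def read_fasta(handle):
    """Yield (name, sequence) pairs from a FASTA file handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        elif line:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)

def canonical_kmers(seq, k):
    """Return the set of strand-independent k-mers in seq."""
    kmers = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        revcomp = kmer.translate(COMPLEMENT)[::-1]
        kmers.add(min(kmer, revcomp))  # same key for both strands
    return kmers

def shared_fraction(transcripts, contig_kmers, k):
    """Map each transcript name to the fraction of its k-mers found in the contigs."""
    result = {}
    for name, seq in transcripts:
        kmers = canonical_kmers(seq, k)
        result[name] = len(kmers & contig_kmers) / len(kmers) if kmers else 0.0
    return result

# Toy data (hypothetical); replace with open("contigs.fa") etc. for real files.
K = 5
contigs_fa = io.StringIO(">c1\nACGTACGGTTCA\n")
transcripts_fa = io.StringIO(
    ">t1\nACGTACGGTT\n"      # contained in c1
    ">t2\nGGGGGCCCCCAA\n"    # unrelated to c1
    ">t3\nAACCGTACGT\n"      # reverse complement of t1
)

contig_kmers = set()
for _, seq in read_fasta(contigs_fa):
    contig_kmers |= canonical_kmers(seq, K)

fractions = shared_fraction(read_fasta(transcripts_fa), contig_kmers, K)
for name, frac in fractions.items():
    print(f"{name}\t{frac:.2f}")
```

If a large share of the contigs' sequence content is absent from the predicted transcripts, that would support the idea that the annotation is incomplete.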


prp291 replied, 4.0 years ago:

Thanks for your response. Yes, I assumed so, but when I map other libraries downloaded from NCBI SRA, the mapping percentage is above 60%. That is my concern.