I know this type of question has been asked before in this and other forums, but my situation is a little different.
I have 50 bp single-end (SE) reads from a plant whose genome was recently sequenced, and I want to use them for differential expression analysis. I am using bowtie2 for mapping.
I used this command for the bowtie2 mapping:
bowtie2 --local -D 15 -R 2 -L 22 -N 1 -x ~/bin/bowtie-index/bowtie2/plant_genome input1.fq -S genome.sam
First, I mapped my reads to the genome and got this:
11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
1653410 (13.78%) aligned 0 times
3778924 (31.50%) aligned exactly 1 time
6564702 (54.72%) aligned >1 times
86.22% overall alignment rate
Then I used the transcript sequences provided by the groups that recently sequenced this plant genome and got this:
11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
7322107 (61.03%) aligned 0 times
3739726 (31.17%) aligned exactly 1 time
935203 (7.80%) aligned >1 times
38.97% overall alignment rate
Finally, I used the previously reported contigs (assembled from RNA-seq data) and got this:
11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
1386832 (11.56%) aligned 0 times
3364705 (28.05%) aligned exactly 1 time
7245499 (60.39%) aligned >1 times
88.44% overall alignment rate
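To make the three runs easier to compare, here is a minimal Python sketch that parses the summaries above and recomputes the overall rate as (aligned exactly 1 time + aligned >1 times) / total reads. The function and variable names (`parse_summary`, `overall_rate`, `SUMMARIES`) are my own and not part of bowtie2, and the regular expressions assume the single-end summary layout shown in this post.

```python
import re

# Bowtie2 single-end summaries copied verbatim from this post, one per reference.
SUMMARIES = {
    "genome": """11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
1653410 (13.78%) aligned 0 times
3778924 (31.50%) aligned exactly 1 time
6564702 (54.72%) aligned >1 times
86.22% overall alignment rate""",
    "transcripts": """11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
7322107 (61.03%) aligned 0 times
3739726 (31.17%) aligned exactly 1 time
935203 (7.80%) aligned >1 times
38.97% overall alignment rate""",
    "contigs": """11997036 reads; of these:
11997036 (100.00%) were unpaired; of these:
1386832 (11.56%) aligned 0 times
3364705 (28.05%) aligned exactly 1 time
7245499 (60.39%) aligned >1 times
88.44% overall alignment rate""",
}


def parse_summary(text):
    """Extract read counts from one single-end alignment summary."""
    def count(pattern):
        return int(re.search(pattern, text).group(1))

    return {
        "total": count(r"(\d+) reads; of these"),
        "unaligned": count(r"(\d+) \([\d.]+%\) aligned 0 times"),
        "unique": count(r"(\d+) \([\d.]+%\) aligned exactly 1 time"),
        "multi": count(r"(\d+) \([\d.]+%\) aligned >1 times"),
    }


def overall_rate(c):
    """Percentage of reads that aligned at least once."""
    return 100.0 * (c["unique"] + c["multi"]) / c["total"]


counts = {name: parse_summary(text) for name, text in SUMMARIES.items()}

for name, c in counts.items():
    print(f"{name:12s} unique={c['unique']:>9d} multi={c['multi']:>9d} "
          f"overall={overall_rate(c):.2f}%")

# Split the genome-vs-transcripts drop into uniquely and multiply aligned reads.
lost_unique = counts["genome"]["unique"] - counts["transcripts"]["unique"]
lost_multi = counts["genome"]["multi"] - counts["transcripts"]["multi"]
multi_share = lost_multi / (lost_unique + lost_multi)
print(f"lost unique={lost_unique} lost multi={lost_multi} "
      f"share from multi={multi_share:.1%}")
```

Looking at the raw counts this way, the "aligned exactly 1 time" numbers for genome and transcripts differ by only about 39 thousand reads, while the "aligned >1 times" category drops by about 5.6 million, so nearly all of the difference between those two runs is in the multi-mapped reads.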
Can anyone suggest the probable reason why the alignment percentage is so low when I use the transcripts originally annotated from the genome? How can I increase the alignment percentage?
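One check that might help narrow this down is to tally, per reference sequence, where the multi-mapped reads land in genome.sam. The sketch below is only an illustration under two assumptions: that bowtie2 was run in its default mode (one reported alignment per read), and that an aligned record carrying an XS:i tag is one for which more than one alignment was found, which is how I read the bowtie2 manual. The function name `tally_sam` and the demo records (`chr1`, `chrC`, `r1` to `r4`) are hypothetical.

```python
from collections import Counter


def tally_sam(lines):
    """Count unaligned reads, and unique / multi-mapped reads per reference."""
    unaligned = 0
    unique = Counter()
    multi = Counter()
    for line in lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag, rname = int(fields[1]), fields[2]
        if flag & 0x4:                    # SAM flag bit 0x4: read is unmapped
            unaligned += 1
        elif any(tag.startswith("XS:i:") for tag in fields[11:]):
            multi[rname] += 1             # assumed: XS:i means >1 alignment found
        else:
            unique[rname] += 1
    return unaligned, unique, multi


# Tiny made-up SAM records, only to show the expected input shape.
demo = [
    "@HD\tVN:1.0\tSO:unsorted",
    "\t".join(["r1", "0", "chr1", "100", "42", "50M", "*", "0", "0",
               "ACGT", "IIII", "AS:i:100"]),
    "\t".join(["r2", "0", "chrC", "200", "1", "50M", "*", "0", "0",
               "ACGT", "IIII", "AS:i:100", "XS:i:100"]),
    "\t".join(["r3", "4", "*", "0", "0", "*", "*", "0", "0",
               "ACGT", "IIII", "YT:Z:UU"]),
    "\t".join(["r4", "16", "chrC", "300", "1", "50M", "*", "0", "0",
               "ACGT", "IIII", "AS:i:98", "XS:i:98"]),
]

demo_unaligned, demo_unique, demo_multi = tally_sam(demo)
print(demo_unaligned, dict(demo_unique), demo_multi.most_common(5))
```

For the real file, the same function can be given an open file handle, for example `with open("genome.sam") as fh: tally_sam(fh)`, and `multi.most_common(10)` then lists the references that attract the most multi-mapped reads.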
Note:
- Reads were demultiplexed and filtered to remove adapters and low-quality reads
- The FastQC report can be found HERE
Thanks for your response. Yes, I assume so, but when I map other libraries downloaded from NCBI SRA, the mapping percentage is above 60%. That is my concern.