Question

alignment rate in bowtie2

8.2 years ago

zizigolu ★ 4.3k

Hi,

I used bowtie2 to map the GSE69414 data sets, but the alignment rate is 0% or at most 0.02%.

How is that possible? Am I doing something wrong?

[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046311-ribo_trimmed.fastq -S SRR2046311-ribo_trimmed.sam
46386847 reads; of these:
  46386847 (100.00%) were unpaired; of these:
    46375320 (99.98%) aligned 0 times
    10406 (0.02%) aligned exactly 1 time
    1121 (0.00%) aligned >1 times
0.02% overall alignment rate
[izadi@lbox161 bowtie2-2.2.5]$
[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046322-mRNA_trimmed.fastq -S SRR2046322-mRNA_trimmed.fastq
0 reads
0.00% overall alignment rate
[izadi@lbox161 bowtie2-2.2.5]$

What is the reason please?

Thank you

sequencing RNA-Seq • 5.0k views

ADD COMMENT • link updated 23 months ago by Ram 43k • written 8.2 years ago by zizigolu ★ 4.3k

hi,

It seems you are aligning paired-end RNA-seq using bowtie. Maybe use a splice-aware aligner like TopHat or STAR. mRNA-seq reads can require gapped alignment (because of intervening introns), which bowtie fails to recognize in most cases. Still, you should have some percentage of reads aligning. I'm not sure what exactly the reason for 0% is. Try a splice-aware aligner and see.

ADD REPLY • link 8.2 years ago by Amitm ★ 2.2k

Thank you, but I think the data sets are not paired-end.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by zizigolu ★ 4.3k

Yea, my bad. The bowtie report already says so.

ADD REPLY • link 8.2 years ago by Amitm ★ 2.2k

How many reads are in your fastq file? (cat SRR2046322-mRNA_trimmed.fastq | wc -l)

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by andrew.j.skelton73 6.5k

thank you,

[izadi@lbox161 bowtie2-2.2.5]$ cat SRR2046322-mRNA_trimmed.fastq | wc -l
6694

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by zizigolu ★ 4.3k

Is this Illumina sequencing? At 4 lines per read, that is 6694/4 = 1673.5 reads (allowing for an off-by-one error in wc). About 1.7K reads in a raw fastq seems suspicious.
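
A quick sanity check along these lines can be scripted. The sketch below is only an illustration: fastq_check is a hypothetical helper name, and it assumes an uncompressed FASTQ with exactly 4 lines per record. It counts lines, reports the read count, and flags a file whose line count is not a multiple of 4 or whose first line looks like a SAM header. The header check is there because the second bowtie2 command in the question passes the same path to both -U and -S, so the FASTQ may have been overwritten with SAM output. That is a guess from the command line shown, not something confirmed in the thread.

```shell
# Hypothetical helper: report how many reads a FASTQ file holds and flag
# files that do not look like intact 4-line-per-record FASTQ.
# Usage: fastq_check SRR2046322-mRNA_trimmed.fastq
fastq_check() {
    file="$1"
    # wc -l counts newline characters; strip the padding some wc builds add.
    lines=$(wc -l < "$file" | tr -d ' ')
    # A SAM header line starts with @HD, @SQ or @PG followed by a tab.
    if head -n 1 "$file" | grep -Eq "^@(HD|SQ|PG)$(printf '\t')"; then
        echo "$file: starts with a SAM header, not a FASTQ record"
        return 1
    fi
    # Each FASTQ record is 4 lines, so any remainder means a damaged file
    # (or a missing final newline, the off-by-one case).
    if [ $((lines % 4)) -ne 0 ]; then
        echo "$file: $lines lines is not a multiple of 4"
        return 1
    fi
    echo "$file: $((lines / 4)) reads"
}
```

If the file turns out to contain SAM header lines, re-extracting the FASTQ and writing the alignment to a different .sam path would be the next thing to try.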

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by andrew.j.skelton73 6.5k

This. Either sequencing coverage was extremely low (too many libraries per effort or something) or something is screwed up in preprocessing. It is hard to diagnose anything else with this few reads.

EDIT: Actually, something else is wrong. Bowtie reports 46386847 reads above. Why the discrepancy? A previous comment was correct in that you should be using a splice-aware aligner.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by Brice Sarver ★ 3.8k

Actually, I am mapping the reads to the yeast coding sequences. That is, I first indexed the CDS fasta and then mapped the reads to it, but the alignment rate was zero. Might this be the reason?

My supervisor believes that, because yeast is a eukaryote, we can use bowtie as long as we map to the CDS instead of the whole-genome fasta, and that this is equivalent to using TopHat. I don't know whether he is right, but the alignment rate is 0, although everything was normal when I tried other data sets. That is, for some data sets the alignment rate is 0, and for others it is the same as when I mapped to the whole-genome fasta.

ADD REPLY • link updated 4.3 years ago by Ram 43k • written 8.2 years ago by zizigolu ★ 4.3k

You want to map to the Saccharomyces genome using TopHat (which uses Bowtie under the hood anyway). Depending on how your 'CDS reference' is set up (exons? transcripts? UTRs?), you are almost certainly losing a lot of reads due to mismatches or a failure of global alignment. If you wanted to mess around with this, you can try mapping using --local or --end-to-end to see if reads are being discarded because of mismatches, but you still ought to process your RNAseq data following a more standard series of steps.
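
For anyone who wants to try the --local versus --end-to-end comparison, here is a minimal sketch. bowtie2_cmd is a hypothetical helper that only prints the command lines so they can be inspected before anything runs. The index and FASTQ names are the ones from the question, and the output naming scheme is an assumption of mine. --end-to-end is bowtie2's default mode; --local lets the aligner soft-clip the ends of a read.

```shell
# Hypothetical helper: print the bowtie2 command for one alignment mode.
# Nothing is executed here; the caller decides whether to run the output.
bowtie2_cmd() {
    mode="$1"     # "end-to-end" (bowtie2 default) or "local"
    index="$2"    # basename of the bowtie2 index
    reads="$3"    # unpaired FASTQ file
    # Assumed naming scheme: derive a separate .sam path per mode so the
    # input FASTQ can never be used as the output file.
    out="${reads%.fastq}.${mode}.sam"
    printf 'bowtie2 --%s -x %s -U %s -S %s\n' "$mode" "$index" "$reads" "$out"
}

# Print one command per mode for the first sample in the thread.
for mode in end-to-end local; do
    bowtie2_cmd "$mode" Saccharomyces_cerevisiae SRR2046311-ribo_trimmed.fastq
done
```

Piping the printed lines to sh runs them; comparing the "overall alignment rate" line of the two reports should show whether soft-clipping rescues a meaningful number of reads.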