Question: alignment rate in bowtie2
3.3 years ago, F3.4k (Iran) wrote:

Hi,

I used bowtie2 to map the GSE69414 data sets, but the alignment rate is 0% or at most 0.02%.

How is this possible? Am I doing something wrong?

[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046311-ribo_trimmed.fastq -S SRR2046311-ribo_trimmed.sam
46386847 reads; of these:
  46386847 (100.00%) were unpaired; of these:
    46375320 (99.98%) aligned 0 times
    10406 (0.02%) aligned exactly 1 time
    1121 (0.00%) aligned >1 times
0.02% overall alignment rate

[izadi@lbox161 bowtie2-2.2.5]$ bowtie2 -x Saccharomyces_cerevisiae -U SRR2046322-mRNA_trimmed.fastq -S SRR2046322-mRNA_trimmed.fastq
0 reads
0.00% overall alignment rate
[izadi@lbox161 bowtie2-2.2.5]$ 

What is the reason, please?

Thank you.

sequencing rna-seq myposts • 2.2k views
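
In the second command as posted, the -U input and the -S output are the same path (SRR2046322-mRNA_trimmed.fastq). bowtie2 may therefore have overwritten the FASTQ before reading it, which would fit the "0 reads" report, although nobody in the thread confirms this. A minimal sketch of a guard against that mistake follows; safe_bowtie2_paths is a hypothetical helper, not part of bowtie2, and the .sam output name in the usage comment is an assumed correction.

```shell
# Hypothetical helper (not part of bowtie2): refuse to proceed when the
# input FASTQ and the output SAM are the same file.
safe_bowtie2_paths() {
    local reads="$1" sam="$2"
    # Same string, or two names for the same existing file (-ef).
    if [ "$reads" = "$sam" ] || [ "$reads" -ef "$sam" ]; then
        echo "refusing: -U and -S point to the same file: $reads" >&2
        return 1
    fi
    return 0
}

# Usage (index and FASTQ names from the post; the .sam name is assumed):
# safe_bowtie2_paths SRR2046322-mRNA_trimmed.fastq SRR2046322-mRNA_trimmed.sam &&
#   bowtie2 -x Saccharomyces_cerevisiae \
#           -U SRR2046322-mRNA_trimmed.fastq \
#           -S SRR2046322-mRNA_trimmed.sam
```

If the FASTQ was overwritten, it would need to be regenerated from the original download before re-running the alignment.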

Hi,

It seems you are aligning paired-end RNA-seq reads with Bowtie. Consider a splice-aware aligner such as TopHat or STAR: mRNA-seq reads can require gapped alignments (because of intervening introns), which Bowtie fails to recognize in most cases. Even so, some percentage of reads should still align, so I am not sure what exactly causes the 0%. Try a splice-aware aligner and see.

written 3.3 years ago by Amitm1.6k

Thank you, but I think the data set is not paired-end.

written 3.3 years ago by F3.4k

Yeah, my bad. The bowtie report already says so.

written 3.3 years ago by Amitm1.6k

How many reads are in your fastq file? (cat SRR2046322-mRNA_trimmed.fastq | wc -l)

written 3.3 years ago by andrew.j.skelton735.7k

Thank you,

[izadi@lbox161 bowtie2-2.2.5]$ cat SRR2046322-mRNA_trimmed.fastq | wc -l
6694

written 3.3 years ago by F3.4k

Is this Illumina sequencing? At 4 lines per read, that is 6694/4 = 1673.5 reads (allowing for an off-by-one error in wc). About 1.7K reads in a raw fastq file seems suspicious.

written 3.3 years ago by andrew.j.skelton735.7k
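
The arithmetic above can be sketched as a small shell function, assuming an uncompressed FASTQ with exactly 4 lines per record. count_fastq_reads is a hypothetical helper name. Note that 6694 leaves a remainder of 2 when divided by 4, so an off-by-one in wc would not make it a whole number of records.

```shell
# Count reads in an uncompressed FASTQ file with 4 lines per record.
# Prints the number of complete records; returns 1 (with a warning on
# stderr) if the line count is not a multiple of 4.
count_fastq_reads() {
    local lines
    lines=$(wc -l < "$1")
    lines=$((lines))            # drop padding some wc implementations add
    echo $((lines / 4))
    if [ $((lines % 4)) -ne 0 ]; then
        echo "warning: $lines lines is not a multiple of 4;" \
             "file may be truncated or not FASTQ" >&2
        return 1
    fi
}

# Usage (file name from the thread):
# count_fastq_reads SRR2046322-mRNA_trimmed.fastq
```

A non-zero return is a hint to inspect the file itself (for example with head) before trusting any read count.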

This. Either sequencing coverage was extremely low (too many libraries per effort or something) or something is screwed up in preprocessing. It is hard to diagnose anything else with so few reads.

EDIT: Actually, something else is wrong. Bowtie reports 46386847 reads above. Why the discrepancy? A previous comment was correct in that you should be using a splice-aware aligner.

modified 3.3 years ago • written 3.3 years ago by Brice Sarver2.6k

Actually, I am mapping the reads to the yeast coding sequences. That is, I first built an index from the CDS fasta and then mapped the reads to it, but the alignment rate was zero. Might this be the reason?

My supervisor believes that, although yeast is a eukaryote, we can use bowtie as long as we map to the CDS instead of the whole-genome fasta, and that this is equivalent to using TopHat. I don't know whether he is right, but the alignment rate is 0, although when I tried other data sets everything was normal. I mean that for some data sets the alignment rate is 0, and for others it is the same as when I mapped to the whole-genome fasta.

modified 3.3 years ago • written 3.3 years ago by F3.4k

You want to map to the Saccharomyces genome using TopHat (which uses Bowtie under the hood anyway). Depending on how your 'CDS reference' is set up (exons? transcripts? UTRs?), you are almost certainly losing a lot of reads due to mismatches or a failure of global alignment. If you wanted to mess around with this, you can try mapping using --local or --end-to-end to see if reads are being discarded because of mismatches, but you still ought to process your RNAseq data following a more standard series of steps.

written 3.3 years ago by Brice Sarver2.6k
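
To compare the two modes suggested above, one option is to save each run's summary and pull out the overall alignment rate. bowtie2 prints that summary to stderr, and --end-to-end is its default mode, so the runs in the question were already end-to-end. This is only a sketch: alignment_rate is a hypothetical helper, and the log and SAM file names in the usage comments are made up for illustration.

```shell
# Print the overall alignment rate (a percentage, without the % sign)
# from a saved bowtie2 summary.
alignment_rate() {
    sed -n 's/^\([0-9.]*\)% overall alignment rate$/\1/p' "$1"
}

# Usage sketch (index and FASTQ names from the thread; e2e.* and local.*
# are hypothetical output names):
# bowtie2 --end-to-end -x Saccharomyces_cerevisiae \
#         -U SRR2046311-ribo_trimmed.fastq -S e2e.sam 2> e2e.log
# bowtie2 --local -x Saccharomyces_cerevisiae \
#         -U SRR2046311-ribo_trimmed.fastq -S local.sam 2> local.log
# echo "end-to-end: $(alignment_rate e2e.log)%  local: $(alignment_rate local.log)%"
```

A much higher rate with --local would suggest that reads are being lost at their ends (for example to leftover adapter sequence) rather than failing to match the reference at all.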

Yes, it is Illumina.

written 3.3 years ago by F3.4k