Question: Extracting the best reads that align multiple times
3.4 years ago, ioannis wrote:

Hello community,

I am using Bowtie2 to align sequences to a reference genome. The results are quite disappointing: 48% of the reads align exactly once and 44% align more than once.

I have single-end reads 55-70bp long. The reference genome is the OreoNil2 (Oreochromis niloticus).

I am not sure about this, but I guess that each alignment of a multi-mapping read has a different score, depending on how well the read matches the reference genome at that position. I would like to extract into a new SAM file the reads that align only once (48%), plus the best-scoring alignment of each read that aligns multiple times.

Does anybody know whether this is possible, and how to do something like that? Would I introduce any bias by picking those reads?

Thanks in advance!


It might be worth trying BWA and comparing the results. If these were paired-end reads, I would expect a smaller proportion of multiple mappings.

written 3.4 years ago by abascalfederico

Once you pick a subset of reads with higher scores, yes, you will introduce bias. What is your ultimate goal?

written 3.4 years ago by Brian Bushnell
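
For the filtering asked about in the question, here is a minimal Python sketch. It assumes Bowtie2 was run in its default mode (no -k or -a), where each read gets at most one SAM record: the best alignment found, with ties broken at random. In that mode the AS:i tag holds the score of the reported alignment, and XS:i is present only when a second alignment was found for the read. The function names and the category labels ("unique", "best", "tied") are made up for illustration and are not part of Bowtie2 or the SAM specification.

```python
def parse_tags(fields):
    """Collect the optional SAM fields (column 12 onward) into a dict."""
    tags = {}
    for item in fields[11:]:
        name, _type, value = item.split(":", 2)
        tags[name] = value
    return tags


def classify(line):
    """Label one SAM line as header, unaligned, unique, best, or tied."""
    if line.startswith("@"):
        return "header"
    fields = line.rstrip("\n").split("\t")
    if int(fields[1]) & 0x4:          # FLAG bit 0x4: read is unmapped
        return "unaligned"
    tags = parse_tags(fields)
    if "XS" not in tags:              # no second alignment was found
        return "unique"
    if int(tags["AS"]) > int(tags["XS"]):
        return "best"                 # reported hit beats the runner-up
    return "tied"                     # equally good hit exists elsewhere


def filter_sam(lines):
    """Yield header lines, unique hits, and strictly best multi-hits."""
    keep = {"header", "unique", "best"}
    for line in lines:
        if classify(line) in keep:
            yield line
```

One way to use it would be to read the SAM from standard input and write the kept records with sys.stdout.writelines(filter_sam(sys.stdin)). Reads labelled "tied" are dropped here because their reported position was chosen at random among equally good candidates; whether to keep them depends on the downstream analysis.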

My goal is to get as many alignments as I can, but as it stands I can use less than 50% of my total reads. I have hydroxymethylation data and I need as much coverage as I can get. I will try different aligners just to see if I get better results. However, I think that Bowtie2 is quite a good aligner, so I am not hoping for a miracle. Thank you for your input!

written 3.4 years ago by ioannis

You'll typically get a much higher alignment rate with BBMap than with Bowtie2 when the data has low identity to the reference. In particular, you can add the flag "slow" or "vslow", and use a shorter k-mer length such as 11, to increase the alignment rate even more.

written 3.4 years ago by Brian Bushnell
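
As a rough illustration of the BBMap settings suggested above, the following Python sketch assembles a bbmap.sh command line with a short k-mer length and the vslow mode. The file names (genome.fa, reads.fq, mapped.sam) and the helper name bbmap_command are placeholders invented for this example; the sketch only builds the argument list and does not run BBMap.

```python
import shlex


def bbmap_command(ref, reads, out_sam, k=11, vslow=True):
    """Build an argument list for bbmap.sh (hypothetical helper)."""
    args = [
        "bbmap.sh",
        f"ref={ref}",       # reference FASTA to index and map against
        f"in={reads}",      # single-end reads
        f"out={out_sam}",   # SAM output
        f"k={k}",           # shorter k-mers: slower but more sensitive
    ]
    # Pick one of the two sensitivity modes mentioned in the reply.
    args.append("vslow=t" if vslow else "slow=t")
    return args


# Placeholder file names, for illustration only.
cmd = bbmap_command("genome.fa", "reads.fq", "mapped.sam")
print(shlex.join(cmd))
```

The resulting list could be passed to subprocess.run(cmd, check=True) on a machine where bbmap.sh is on the PATH.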