Question

How to choose the best alignment from mapping with bowtie2?

1

Entering edit mode

4.6 years ago

anikcropscience ▴ 270

Hello, I am doing a K-mer base genome wide association mapping. I have some significant K-mers of which I want to find the position in the genome. For that, I have extracted the reads from which each K-mer originated. For example, for one significant K-mer, I have ~30 paired end short reads extracted from one sample fastq file (R1 and R2). Then I used the following code to map those reads against the reference genome with bowtie2:

bowtie2 --very-sensitive-local -p 2 -x ref.genome -1 R1.fastq -2 R2.fastq -S output.sam

Then I performed the necessary steps with samtools to convert and sort it and finally with sambamba remove the unmapped reads. Then I load the bam file in IGV with the reference genome.

My question is, as there are multiple reads for One K-mer, which read should I use to get the location? If there are multiple alignments of the same reads in different locations how should I choose the best one? Is there any mapping score in .bam file that I should look for?

Looking forward to getting some suggestions.

Happy Easter!

alignment bowtie2 sequence • 1.4k views

ADD COMMENT • link 4.6 years ago by anikcropscience ▴ 270

0

Entering edit mode

My question is, as there are multiple reads for One K-mer, which read should I use to get the location? If there are multiple alignments of the same reads in different locations how should I choose the best one? Is there any mapping score in .bam file that I should look for?

k-mers by their characteristic are going to present in more than one read. You did not start with k-mers that were unique to your dataset so how can you expect them to have only one hit. All multi-mapping reads are going to be given a poor mapping score so there is no help in that direction.

I am not sure exactly what you are trying to do here but if you were using bbmap.sh you would have the option of randomly selecting one of the best sites by using ambig=random option.

ambiguous=best          (ambig) Set behavior on ambiguously-mapped reads (with
                        multiple top-scoring mapping locations).
                            best    (use the first best site)
                            toss    (consider unmapped)
                            random  (select one top-scoring site randomly)

ADD REPLY • link 4.6 years ago by GenoMax 154k

0

Entering edit mode

Actually, in the paper that I am following, there they mentioned that they choose the best alignment according to the mapping score. But I really did not understand. I can imagine that there will be multiple reads for one K-mer. That was the case in the reference paper as well. But they chose the location of the K-mer according to the best alignment. So, I was wondering how could I do that. Any idea?

ADD REPLY • link 4.6 years ago by anikcropscience ▴ 270