Question: Finding alignment scores for multi-reads from a bam/sam file
2.9 years ago by
EVR540
Earth
EVR540 wrote:

Hi,

I have mapped my RNA-seq reads to the genome using tophat2. Though the mapping rate was 80%, ~35% of the reads are reported as having multiple alignments. By default, for a multi-read (a read aligning to multiple locations), the best alignment is selected based on the alignment score. For multi-reads whose alignments have the same alignment score, tophat will report a random alignment.

I would like to find out, from the bam/sam file, how many multi-reads have alignments with the same alignment score.

Also, is there any way to assign multi-reads with the same alignment score to the best location?

Kindly guide me

rna-seq sam bam tophat multi-read • 1.0k views
ADD COMMENTlink modified 2.9 years ago • written 2.9 years ago by EVR540

You did not "assemble", you "mapped". And it is impossible to assign multi-mapping reads with the same alignment score to the best location, because the locations have the same alignment score. You can either assign them to all, assign them to one at random, or ignore them. For RNA-seq, I think random is typically best, though it depends on how you plan to do post-processing.

ADD REPLYlink written 2.9 years ago by Brian Bushnell16k

Thank you for your reply. How can I find how many reads in the bam/sam file have mapped at multiple positions with the same score? Is there any field in the BAM/SAM file which denotes multi-reads and their alignment score?

ADD REPLYlink written 2.9 years ago by EVR540

If a read maps to multiple positions with the same score, it should be assigned a mapq of 3 or less. So, filtering by mapq should do the trick.
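As a minimal sketch of the MAPQ filter described above: the snippet below counts distinct read names with at least one mapped record whose MAPQ is at or below a threshold. It assumes plain-text SAM lines (for example, as printed by `samtools view`); the function name `count_low_mapq_reads` and the example records are made up for illustration, and the threshold of 3 is taken from the comment above rather than from the tophat documentation.

```python
def count_low_mapq_reads(sam_lines, max_mapq=3):
    """Count distinct read names with a mapped record of MAPQ <= max_mapq.

    Note: mates of a pair share a read name, so a pair counts once here.
    """
    low = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name = fields[0]
        flag = int(fields[1])
        mapq = int(fields[4])             # column 5 of a SAM record is MAPQ
        if flag & 0x4:                    # 0x4 = read unmapped
            continue
        if mapq <= max_mapq:
            low.add(name)
    return len(low)


# Hypothetical records: r1 is unique, r2 has two reported hits, r3 is unmapped.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "r1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:1",
    "r2\t0\tchr1\t200\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "r2\t256\tchr2\t300\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]
print(count_low_mapq_reads(example))
```

The optional `NH:i:` tag, where the aligner writes it, gives the number of reported alignments for the read and can serve as a cross-check on the MAPQ-based count.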

ADD REPLYlink written 2.9 years ago by Brian Bushnell16k

Hi Brian,

it is impossible to assign multi-mapping reads with the same alignment score to the best location

I agree with your comment. However, I think you can have the same alignment score and still assign a "best" location. This could happen since typically the alignment score accounts only for the number of matches and mismatches while the probability of incorrect mapping (i.e. the mapq) accounts for match/mismatch but also for base qualities. (Did I get it right?)
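A rough way to see how often the top alignment score is actually tied is to group records by read and compare their `AS:i:` tags. The sketch below assumes that the file contains one record per reported alignment, that the aligner writes `AS:i:`, and that a higher score is better (as in Bowtie2-style scoring); the helper names and example records are hypothetical.

```python
from collections import defaultdict


def get_int_tag(fields, tag):
    """Return the integer value of an optional SAM tag such as AS, or None."""
    prefix = tag + ":i:"
    for field in fields[11:]:             # optional tags start at column 12
        if field.startswith(prefix):
            return int(field[len(prefix):])
    return None


def count_tied_multireads(sam_lines):
    """Return (multi, tied): reads with >1 alignment, and those whose best AS is shared."""
    scores = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue
        flag = int(fields[1])
        if flag & 0x4:                    # skip unmapped reads
            continue
        score = get_int_tag(fields, "AS")
        if score is None:                 # aligner did not report a score
            continue
        # Keep mates apart: 0x40 = first in pair, 0x80 = second in pair.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        scores[(fields[0], mate)].append(score)

    multi = tied = 0
    for values in scores.values():
        if len(values) < 2:
            continue
        multi += 1
        if values.count(max(values)) >= 2:   # best score occurs more than once
            tied += 1
    return multi, tied


# Hypothetical records: rA has two equal-scoring hits, rB has a clear best hit.
records = [
    "rA\t0\tchr1\t100\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "rA\t256\tchr2\t500\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "rB\t0\tchr1\t900\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:2",
    "rB\t256\tchr3\t700\t3\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:-6\tNH:i:2",
    "rC\t0\tchr1\t50\t50\t4M\t*\t0\t0\tACGT\tIIII\tAS:i:0\tNH:i:1",
]
multi, tied = count_tied_multireads(records)
print(multi, tied)
```

If the aligner only reports the best-scoring alignments for each read, every multi-read will appear tied under this check, so the result is only informative when lower-scoring alignments are also written to the file.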

ADD REPLYlink written 2.9 years ago by dariober10k