Question: SAM flags for Illumina paired-end reads mapped on a transcriptome
Macspider (Vienna - BOKU) wrote 10 weeks ago:

I am mapping Illumina paired-end reads against a transcriptome with HISAT2, and I'm confused about something that is probably simple:

Is it normal to have SAM flags that place both reads of a pair on the same strand? When mapping against a genome, the technology should produce one read that maps to the forward strand and one that maps to the reverse strand, but I'm not sure what I should observe when mapping against a transcriptome.

I guess the program maps one of the two mates and reverse-complements the other, but does the output flag (and the mapping result in general) reflect the strand of the reverse-complemented read or of the original read? Is it normal to have many pairs with both mates on the same strand?

Please help me out of this "theoretical" quicksand.

Devon Ryan (Freiburg, Germany) wrote 10 weeks ago:

It doesn't matter what you're aligning against: you expect PE mates to align with opposite orientations. The strandedness of the underlying data plays absolutely no role in this (your reads don't have strands, they have orientations). Unless a very atypical library prep was used, the flags you should commonly see in the resulting BAM file are 99, 147, 83, and 163. If those don't constitute the overwhelming majority of the flags, then something likely went amiss.
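For anyone who wants to see why those four values are the expected ones, here is a minimal sketch that breaks a flag into its bits. The bit values come from the SAM specification; the short names and the `decode_flag` helper are just illustrative, not part of any library.

```python
# SAM flag bits as defined in the SAM specification.
# The short names are illustrative labels, not an official API.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


for flag in (99, 147, 83, 163):
    print(flag, decode_flag(flag))
```

In each of the two pairs (99 with 147, 83 with 163) exactly one mate carries the reverse bit, and the other mate's mate_reverse bit agrees with it.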


8 weeks later, the problem arises again in a different form: would you keep 83 and 163? They are the ONLY flags I have after filtering. I find it suspicious because my reads should face each other; they shouldn't map as mate pairs.

written 16 days ago by Macspider

83 and 163 are facing each other (as are 99 and 147). Having only this combination makes sense if you're aligning against a transcriptome and have strand-specific data.
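A rough way to check this from the flags alone is sketched below. `describe_pair` is a hypothetical helper that takes the two flags of one pair and reports whether the mates sit on opposite strands, plus which strand read 1 is on. Flags say nothing about positions, so whether the mates actually point inward is left to the aligner's proper-pair bit (0x2).

```python
REVERSE = 0x10       # this read aligned to the reverse strand
MATE_REVERSE = 0x20  # its mate aligned to the reverse strand
READ1 = 0x40         # this record is the first read of the pair


def describe_pair(flag_a, flag_b):
    """Return (opposite_strands, read1_strand) for the two flags of one pair.

    Hypothetical helper: it inspects strand bits only, not coordinates.
    """
    a_rev = bool(flag_a & REVERSE)
    b_rev = bool(flag_b & REVERSE)
    # Each record's mate-reverse bit should agree with the other record's strand.
    consistent = (
        bool(flag_a & MATE_REVERSE) == b_rev
        and bool(flag_b & MATE_REVERSE) == a_rev
    )
    read1_flag = flag_a if flag_a & READ1 else flag_b
    read1_strand = "-" if read1_flag & REVERSE else "+"
    return (a_rev != b_rev and consistent, read1_strand)


print(describe_pair(83, 163))  # read 1 on the reverse strand of the transcript
print(describe_pair(99, 147))  # read 1 on the forward strand of the transcript
```

With strand-specific data aligned to transcript sequences, read 1 would be expected to land on the same strand every time, which is why seeing only one of the two combinations is plausible.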

written 16 days ago by Devon Ryan