TLEN value is negative for both paired reads
4.0 years ago
nmargu • 0

Hello all,

I am aligning paired-end reads from a 2x150 Illumina run, but most of my DNA fragments are shorter than the read length. Judging by the flags, I assume the pairs are correctly mapped. After trimming and aligning with Bowtie2, I noticed that in the majority of my pairs both mates have the same negative TLEN value. In addition, the sequence column (SEQ, column 10) is exactly the same for both mates.

As I understand TLEN, the leftmost segment receives the + sign and the rightmost segment receives the - sign, but according to the SAM manual, "If segments cover the same coordinates then the choice of which is leftmost and rightmost is arbitrary, but the two ends must still have differing signs". Assuming my fragments fall into this scenario, they have the same sequence, yet both mates always receive the negative sign. Is this normal? And why is the sequence the same in those cases?

I need to retrieve from the SAM file the exact sequence that was aligned from each read (mate 1 and mate 2), and because of this problem I am losing information from one side. Does anyone have a suggestion for what I could do?

Here is a proper pair with different TLEN signs:

MN00409:35:000H2KJ2J:1:11102:12030:17923        99      CP047231        3573289 255     151M    =       3573330 **192**     **AACTTTTCCGGCTTCCCGTTCGTCAGTACCTCGGGAAGCCGCCAACCAGGATAAAATGTCAGCCCTAATCAGCGTTGCAGGATAAAGCACCGCTCACTCTTCAACAGACCGATTTGCACCCCAGCAAATGTAGCGTTATTGTTACCTTCCT** FFFFFFFFFFFFFF/F/FFFFFFFFFF/FFAFFFFFFFFFFF/AFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFAFFFF6F6/6F=FAF/FFFFFFFFFFF=F=FF=FFFFFFAFFFFFFFFFFF=/FFFFAFFFFF AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YS:i:0  YT:Z:CP
MN00409:35:000H2KJ2J:1:11102:12030:17923        147     CP047231        3573330 255     151M    =       3573289 **-192**    **CCAACCAGGATAAAATGTCAGCCCTAATCAGCGTTGCAGGATAAAGCACCGCTCACTCTTCAACAGACCGATTTGCACCCCAGCAAATGTAGCGTTATTGTTACCTTCCTTGCTACAGAGTTCGACAGATATCCCGCTATGACATTCTCCC** AA=F/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/AFFFFFAAFAFFF=FFFFFFFFFFFFFFFF=FF/6/FFFFFF/FF=FFAFFFFFF=FFAFFFFF6F/AFFFFFF6FFFFFFFF6FFF/F/6A=F6 AS:i:0  XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:151        YS:i:0  YT:Z:CP

Here is a "problematic" pair where both mates have the same TLEN sign:

MN00409:35:000H2KJ2J:1:11102:12474:20162        99      CP047231        322941  255     112M    =       322941  **-112**    **GGTGATTAAACGTGTGGCGAAGCAGCTCTCGCAGGAAGGCGGCTCGCTGAAGATGTACAACATCGCCGATCGCCTGGAAACGGTGATGTGGGAGAGCAAAAAGATGTTCCCC**        AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFF/AFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF        AS:i:-10        XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:6C5C99     YS:i:-10        YT:Z:CP
MN00409:35:000H2KJ2J:1:11102:12474:20162        147     CP047231        322941  255     112M    =       322941  **-112**    **GGTGATTAAACGTGTGGCGAAGCAGCTCTCGCAGGAAGGCGGCTCGCTGAAGATGTACAACATCGCCGATCGCCTGGAAACGGTGATGTGGGAGAGCAAAAAGATGTTCCCC**        =FFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFAFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF/FFFAFF        AS:i:-10        XN:i:0  XM:i:2  XO:i:0  XG:i:0  NM:i:2  MD:Z:6C5C99     YS:i:-10        YT:Z:CP

Thank you so much :)

alignment • 1.8k views

Both reads are mapped to the very same position (CP047231:322941), so the distance between end and start is always -112.


Still, isn't the OP correct that the signs should not be the same? The SAM spec seems to state that when we cannot decide which read of the pair is leftmost, one should be declared leftmost: its TLEN should be 112 and the other's should be -112.
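To make the sign rule concrete, here is a minimal sketch of how the expected TLEN values for a pair could be derived from POS and CIGAR. The function names are hypothetical, and treating mate 1 as leftmost on a tie is an arbitrary choice made for illustration (the spec only requires that the signs differ).

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
# Operations that consume reference bases according to the SAM spec.
REF_CONSUMING = set("MDN=X")


def reference_span(cigar):
    """Number of reference bases covered by an alignment with this CIGAR."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)


def expected_tlens(pos1, cigar1, pos2, cigar2):
    """Return (TLEN for mate 1, TLEN for mate 2) for mates on the same reference.

    Assumption for this sketch: when both mates start at the same position,
    mate 1 is declared leftmost.
    """
    end1 = pos1 + reference_span(cigar1) - 1
    end2 = pos2 + reference_span(cigar2) - 1
    length = max(end1, end2) - min(pos1, pos2) + 1
    if pos1 <= pos2:
        return length, -length
    return -length, length


# The two pairs from the question.
print(expected_tlens(3573289, "151M", 3573330, "151M"))
print(expected_tlens(322941, "112M", 322941, "112M"))
```

For the "problematic" pair this gives one positive and one negative value of the same magnitude, which is what the reply above argues the aligner should have written.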

The sequence is the same in both cases because the SAM file reports aligned sequences on the forward strand (even if the alignment is on the reverse strand), so it reverse-complements the corresponding sequences. In the case you show, the fragment is exactly the same length as the read, and each read fully contains the fragment.
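If the goal is to get each mate back in its original sequencing orientation, one possible approach is to undo that reverse-complementing whenever the reverse-strand bit (0x10) is set in FLAG. The helper below is a hypothetical sketch, not part of any aligner or library; it only handles the bases A, C, G, T and N, and it cannot restore bases that were removed by trimming before alignment.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def original_read(flag, seq, qual):
    """Return (sequence, qualities) as they were in the FASTQ input.

    SAM stores SEQ on the forward reference strand, so a record with the
    0x10 bit set holds the reverse complement of the read, with its
    quality string reversed.
    """
    if flag & 0x10:
        return seq.translate(_COMPLEMENT)[::-1], qual[::-1]
    return seq, qual


# Mate 1 (flag 99) is stored as sequenced; mate 2 (flag 147) is not.
print(original_read(99, "GGTGATTAAA", "AFFFFFFFFF"))
print(original_read(147, "GGTGATTAAA", "=FFFFFFFFF"))
```

Applied to the two records of the "problematic" pair, mate 1 comes back unchanged and mate 2 comes back as the reverse complement, so both reads are recoverable even though SEQ looks identical in the file.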

But I would agree that the SAM field is incorrect. What aligner are you using? Perhaps use a different one if possible.


Thanks so much for your answers. Now I understand why the sequence is the same! I am using Bowtie2.

3.5 years ago
Aspire ▴ 300

This was fixed in a later Bowtie2 release.

From the Bowtie2 release notes:

Version 2.3.5 - March 16, 2019

Fixed issue whereby both ends of a paired-end read could have negative TLEN if they exactly coincided
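For files produced by an older version, a quick way to see how many pairs are affected is to scan the SAM text for read names whose two primary records carry TLEN values of the same sign. This is a rough sketch with a hypothetical function name; it keeps every read name in memory, so it is meant for spot checks rather than very large files.

```python
from collections import defaultdict


def same_sign_pairs(sam_lines):
    """Return read names whose two primary records share a TLEN sign."""
    tlens = defaultdict(list)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        if flag & 0x900:
            continue  # skip secondary (0x100) and supplementary (0x800) records
        tlens[fields[0]].append(int(fields[8]))
    # A product above zero means both values are non-zero with the same sign.
    return [
        name
        for name, values in tlens.items()
        if len(values) == 2 and values[0] * values[1] > 0
    ]


# Shortened stand-ins for the records in the question (SEQ and QUAL as "*").
records = [
    "@HD\tVN:1.0\tSO:unsorted",
    "readA\t99\tCP047231\t3573289\t255\t151M\t=\t3573330\t192\t*\t*",
    "readA\t147\tCP047231\t3573330\t255\t151M\t=\t3573289\t-192\t*\t*",
    "readB\t99\tCP047231\t322941\t255\t112M\t=\t322941\t-112\t*\t*",
    "readB\t147\tCP047231\t322941\t255\t112M\t=\t322941\t-112\t*\t*",
]
print(same_sign_pairs(records))
```

In practice the lines could come from an open SAM file handle, since the function only iterates over them.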
