Hi, I am currently attempting to understand how paired-end sequencing works. I understand the basics: a fragment is sequenced from both ends, and if the two reads overlap they can be joined to create a longer single read.
Based on that understanding, is the example below correct? Please tell me whether it is or isn't.
Example:
So essentially TotalSeq is the whole fragment. It is paired-end sequenced, giving a forward read (seq1) and a reverse read (seq2). However, because seq2 is read from the opposite strand, it comes out as the reverse complement of the far end of the fragment.
50 A, 50 C, 50 G
TotalSeq = ' AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG'
So seq1 will be sequenced:
----------------------------------------------------------------------------------------------------->
and seq2 will be sequenced:
<-----------------------------------------------------------------------------------------------------
This provides an area of overlap which will be the site of matching and merging.
50 A, 50 C, 1 G
seq1 = ' AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCG'
50 C, 50 G, 1 T
seq2 = 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGT'
Any help would be greatly appreciated. I am currently trying to code a joiner that works fine on the above seq1 and seq2 but fails completely on real data, so I can only assume my understanding of paired-end sequencing is flawed.
Thanks,
Tom
Sorry, yes, I meant that seq2 was just complemented, but are the reads otherwise correct in terms of how the sequencing works?
Reads are always reported 5' to 3', so seq2 needs to be reversed as well as complemented (i.e. reverse-complemented) to match what the sequencer would actually output.
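To make that concrete, here is a minimal Python sketch of how the two reads relate to the fragment. The helper names `reverse_complement` and `simulate_pair` are made up for illustration, and the model is idealised (no sequencing errors, no adapters, a fixed read length of 101 taken from the example):

```python
from typing import Tuple

# Translation table for complementing bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_pair(fragment: str, read_len: int) -> Tuple[str, str]:
    """Idealised paired-end reads from one fragment (hypothetical helper).

    Read 1 is the first read_len bases of the fragment.
    Read 2 is the first read_len bases of the opposite strand,
    which is the reverse complement of the fragment.
    """
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment)[:read_len]
    return read1, read2


# The fragment from the question: 50 A, 50 C, 50 G.
total_seq = "A" * 50 + "C" * 50 + "G" * 50
seq1, seq2 = simulate_pair(total_seq, 101)

print(seq1 == "A" * 50 + "C" * 50 + "G")  # 50 A, 50 C, 1 G
print(seq2 == "C" * 50 + "G" * 50 + "T")  # 50 C, 50 G, 1 T
```

Both comparisons should hold for the edited example, since seq2 starts at the far end of the fragment and is written 5' to 3' on the opposite strand.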
OK, thanks. I have edited the example so it is correct now.
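For the joiner itself, a rough sketch of the matching and merging step might look like the following. This is not how any particular tool does it: `merge_pair`, `min_overlap` and `max_mismatch_frac` are names invented here, base qualities are ignored, and the case where the fragment is shorter than the read length is not handled. One plausible reason a joiner works on the toy example but not on real data is that real reads contain sequencing errors, so an exact-match overlap test can fail; the mismatch tolerance below is one simple way to allow for that.

```python
# Translation table for complementing bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.0):
    """Merge two paired-end reads on their overlap (illustrative sketch).

    read2 is first reverse-complemented so both reads are on the same
    strand. The longest overlap between the end of read1 and the start
    of the reverse-complemented read2 is tried first. Returns the merged
    sequence, or None if no acceptable overlap is found.
    """
    read2_rc = reverse_complement(read2)
    longest = min(len(read1), len(read2_rc))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]    # end of the forward read
        head = read2_rc[:overlap]  # start of the flipped reverse read
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            # Keep read1 in full and append the part of read2_rc
            # that extends beyond the overlap.
            return read1 + read2_rc[overlap:]
    return None


# The reads from the question.
seq1 = "A" * 50 + "C" * 50 + "G"
seq2 = "C" * 50 + "G" * 50 + "T"
merged = merge_pair(seq1, seq2)
print(merged == "A" * 50 + "C" * 50 + "G" * 50)
```

On the example reads the overlap found is 52 bases (1 A, 50 C, 1 G), and the merged result should equal TotalSeq. On real data you would probably want a non-zero `max_mismatch_frac` and some use of the quality scores when the two reads disagree.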