Getting fragment from alignment file
8 weeks ago
derealme • 0

Beginner question

Using pysam, is there a simple way to reconstruct a fragment, by which I mean the full reference for the fragment (padded), read1 and read2?

Thanks

alignment fragment • 263 views
8 weeks ago

Strictly speaking, I would say the answer is no, since you can't possibly know what is in the middle, the region that has not been sequenced.

You can, of course, reconstruct a likely fragment by taking the reference sequence from the end of the leftmost read to the start of the rightmost read and assuming that to be the missing region.

REFERENCE   -------------------------------------
READ1 and 2    ---->         <----
                    ^^^^^^^^^

FRAGMENT      |------------------|
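As a rough illustration of the gap-filling idea, here is a minimal sketch in plain Python. The `Read` tuple and `fill_gap_fragment` are hypothetical names made up for this example, not part of pysam. The coordinates are assumed to be 0-based and half-open, which is how pysam reports `reference_start` and `reference_end`, and the reads are assumed to have no soft-clipped bases.

```python
from collections import namedtuple

# Hypothetical container for this sketch. With pysam you would fill it from
# AlignedSegment.reference_start, .reference_end and .query_sequence.
Read = namedtuple("Read", ["start", "end", "seq"])


def fill_gap_fragment(ref_seq, left, right):
    """Build a likely fragment: left read + reference gap + right read.

    ref_seq is the reference sequence as a string indexed from 0.
    Assumes no soft clipping, so each read's sequence covers exactly
    the reference interval [start, end).
    """
    if right.start < left.end:
        raise ValueError("reads overlap; there is no gap to fill")
    gap = ref_seq[left.end:right.start]  # unsequenced middle, taken from the reference
    return left.seq + gap + right.seq


ref = "ACGTACGTACGTACGTACGT"
read1 = Read(start=2, end=6, seq="GTTC")    # one mismatch against the reference
read2 = Read(start=12, end=16, seq="ACGT")
fragment = fill_gap_fragment(ref, read1, read2)
```

Note that the read bases are kept as sequenced and only the middle comes from the reference, so the result is a guess at the fragment, not an observation of it.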

Sorry, I wasn't clear. Obviously you're right, but I'm only interested in overlapping reads, so that shouldn't be a problem.


In that case the solution is simpler: fuse the reads and you have your fragment.

There are many read-merging programs out there:

https://ccb.jhu.edu/software/FLASH/

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2579-2

I would use those instead of pysam. If you do want to use pysam, there are various strategies to figure out the overlap and then concatenate the reads with that information, but in the end you would be reimplementing read merging yourself, so why bother?

The main challenge in using pysam is that there is already an alignment, and parsing that out correctly can be complicated, depending on the situation.

REFERENCE   -------------------------------------
READ1          --------------->         
READ2                  <-----------------

FRAGMENT      1--------2------3----------

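Pysam will give you the coordinates marked 1 and 2, the alignment starts of the two reads. From the CIGAR string you can compute the coordinate at 3, the alignment end of read 1. From those pieces you can compute how much to "chop" off the second read when concatenating. Some caveats may still apply, though: soft clipping and indels inside the overlap shift the read positions relative to the reference.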
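The arithmetic above can be sketched in plain Python. `merge_overlapping` is a hypothetical helper written for this post, not a pysam function. It assumes 0-based half-open coordinates (as in pysam's `reference_start` and `reference_end`), sequences as stored in the BAM file (both already in reference orientation), and no indels or soft clips, so that one read base corresponds to one reference position.

```python
def merge_overlapping(r1_start, r1_end, r1_seq, r2_start, r2_seq):
    """Concatenate two overlapping aligned reads into one fragment.

    r1_start -> coordinate 1 in the diagram (start of read 1)
    r2_start -> coordinate 2 (start of read 2)
    r1_end   -> coordinate 3 (end of read 1, derived from the CIGAR)

    Assumes gap-free alignments without soft clipping.
    """
    if not (r1_start <= r2_start <= r1_end):
        raise ValueError("read 2 must start inside or right at the end of read 1")
    chop = r1_end - r2_start       # number of read 2 bases already covered by read 1
    return r1_seq + r2_seq[chop:]  # keep read 1 whole, append the rest of read 2


ref = "ACGTACGTACGTACGTACGT"
merged = merge_overlapping(2, 10, ref[2:10], 6, ref[6:16])
```

This keeps the read 1 base calls across the whole overlap. Dedicated read mergers usually compare base qualities in the overlap instead, which is one more reason to prefer them.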

It is best to recreate the fragments from the raw reads before alignment rather than from the alignments afterwards.
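To show what merging from the raw reads means in principle, here is a deliberately naive sketch. It is not how FLASH or any other merger actually works: it only accepts an exact suffix/prefix match and ignores base qualities and mismatches. The names `reverse_complement`, `merge_raw_pair` and `min_overlap` are made up for this example.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_raw_pair(read1, read2, min_overlap=10):
    """Merge a read pair as it comes from the FASTQ files.

    read2 is reverse-complemented so both reads are on the same strand,
    then the longest exact match between the end of read1 and the start
    of read2 is used as the overlap. Returns None if no overlap of at
    least min_overlap bases is found.
    """
    r2 = reverse_complement(read2)
    for k in range(min(len(read1), len(r2)), min_overlap - 1, -1):
        if read1[-k:] == r2[:k]:
            return read1 + r2[k:]
    return None


true_fragment = "ACGTTGCAAGGCTTACCGGATAGCTTAGGC"
fastq_read1 = true_fragment[:20]
fastq_read2 = reverse_complement(true_fragment[10:])
rebuilt = merge_raw_pair(fastq_read1, fastq_read2, min_overlap=5)
```

A single sequencing error in the overlap makes this sketch fail to merge, which is exactly the case the real tools are built to handle.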
