Hello, I am currently working with pysam to calculate the sequencing “depth” of fragments. I have a BAM file of paired-end reads, and I have written a function that extracts each fragment's name, length, start and end. I am using Python and pysam for this project.
My problem has two parts:
The first is to find a better way to compute and store the fragments. The most efficient approach would be to use a BAM file sorted by read name instead of by genomic coordinate, so that each read is always followed by its mate. But with pysam I can't use a BAM file sorted by read name, because samtools cannot create an index for this kind of file and pysam needs an index.
The second is to use the stored data to compute the fragment depth. I will compute the fragment depth for each chromosome using multi-threading.
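On the first point, as far as I know pysam only needs an index for region queries; `fetch(until_eof=True)` reads an unindexed file from start to end, so a name-sorted BAM can be streamed directly. Below is a minimal sketch of both points under that assumption. The function names `iter_fragments`, `group_by_contig` and `fragment_depth` are hypothetical, and the code assumes the fragments of one chromosome fit in memory.

```python
from collections import defaultdict
from itertools import accumulate, groupby
from operator import attrgetter


def iter_fragments(reads):
    """Yield (contig, start, end) per proper pair from reads grouped by name.

    `reads` is any iterable of pysam-like reads in which mates are adjacent,
    as in a name-sorted BAM. Coordinates are 0-based, end exclusive.
    """
    for _, group in groupby(reads, key=attrgetter("query_name")):
        # Keep only primary alignments that belong to a proper pair.
        mates = [r for r in group
                 if r.is_proper_pair
                 and not r.is_secondary
                 and not r.is_supplementary]
        if len(mates) != 2:
            continue
        start = min(r.reference_start for r in mates)
        end = max(r.reference_end for r in mates)
        yield mates[0].reference_name, start, end


def group_by_contig(fragments):
    """Collect (start, end) spans per contig so each contig can be processed alone."""
    spans = defaultdict(list)
    for contig, start, end in fragments:
        spans[contig].append((start, end))
    return spans


def fragment_depth(spans, chrom_length):
    """Per-base depth for one chromosome, using a difference array."""
    diff = [0] * (chrom_length + 1)
    for start, end in spans:
        diff[start] += 1   # a fragment begins covering here
        diff[end] -= 1     # and stops covering here (end is exclusive)
    return list(accumulate(diff))[:chrom_length]
```

With a real file, `reads` would be something like `pysam.AlignmentFile("by_name.bam").fetch(until_eof=True)`, where the file name is a placeholder. Since the depth of each chromosome is independent of the others, each call to `fragment_depth` could be handed to a separate worker.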
Why do you need the mate?
Because with the mate I can get the position where the fragment ends.
Isn't RPOS (the mate position, PNEXT in the SAM specification) enough? Sometimes the mate's CIGAR string is also available in the read's metadata (the MC tag).
Thank you, I searched through pysam and found an option to get the RPOS.
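To illustrate the mate-position suggestion: in a coordinate-sorted file a single read can be enough. For a standard forward/reverse pair, the reverse-strand read ends where the fragment ends, and its mate position is where the fragment starts. This is a minimal sketch that assumes a forward/reverse library layout and pysam's `next_reference_start`, `reference_end`, `is_reverse` and `mate_is_reverse` attributes; the function name `fragment_from_reverse_read` is hypothetical.

```python
def fragment_from_reverse_read(read):
    """Return (contig, start, end) using only the reverse-strand mate, else None.

    Assumes a forward/reverse pair: the forward mate starts the fragment
    and the reverse mate ends it. Coordinates are 0-based, end exclusive.
    """
    if not read.is_proper_pair or read.is_secondary or read.is_supplementary:
        return None
    # Use only the reverse read of a forward/reverse pair,
    # so each fragment is reported once.
    if not read.is_reverse or read.mate_is_reverse:
        return None
    start = read.next_reference_start   # mate position = fragment start
    end = read.reference_end            # end of this read = fragment end
    if end is None or end <= start:
        return None                     # unexpected layout, skip it
    return read.reference_name, start, end
```

Pairs with an unusual orientation are skipped by this sketch, so the counts may differ slightly from a tool that handles every layout.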
bamCoverage from deepTools would directly do this for you.
Yes, it does the job pretty well, but if possible I want to write my own program using pysam to do the same thing.
Hi, I have another question: when I looked at the fragment lengths, some of them have a negative value, such as -54. Why? I use pysam and I only kept reads that are paired and correctly mapped.
Please see the SAM specification.
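For reference, the SAM specification defines TLEN as a signed value: it is positive for the leftmost read of the pair and negative for the rightmost. A value of -54 therefore should just mean that the read is the rightmost mate of a 54 bp fragment. Here is a minimal sketch of handling this with pysam's `template_length` attribute; the helper names `fragment_length` and `lengths_once_per_fragment` are hypothetical.

```python
def fragment_length(read):
    """Unsigned fragment length, or None when TLEN is unavailable (stored as 0)."""
    tlen = read.template_length
    return abs(tlen) if tlen != 0 else None


def lengths_once_per_fragment(reads):
    """Fragment lengths counted once per pair, by keeping the leftmost mate only."""
    return [r.template_length for r in reads
            if r.is_proper_pair and r.template_length > 0]
```

Keeping only the positive values avoids counting every fragment twice when both mates are in the input.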