calculate fragments depth with pysam
1
1
Entering edit mode
6.0 years ago

Hello, I am currently working with pysam to calculate sequencing “depth” of fragments. I use a bam file with Pair End reads and I managed to create a function that extracts the fragment name, length and the start and stop. I use python and the pysam for this project

My problem lies in two points :

the first is to find a better way to calculate and store the fragments. Because the most efficient way is to use a bam file ordered by read names instead of genomic coordinates that way the read is always followed by its mate. But in pysam, I cant use a bam file sorted by read names because it is impossible to create an index (with samtools) for this kind of file and pysam need an index

the second point is to use the data stored to compute the fragment's depth. I ’ll calculate the fragment's depth for each chromosomes using multi-threading.

genome python pysam samtools • 4.0k views
ADD COMMENT
0
Entering edit mode

why do you need the mate ?

ADD REPLY
0
Entering edit mode

because with the mate I can have the position where the fragment ends

ADD REPLY
0
Entering edit mode

isn't RPOS (mate pos) enough ? sometimes one can also find the mate cigar string in the read metadata.

ADD REPLY
0
Entering edit mode

Thank you I search through pysam and found an option to get the rpos

ADD REPLY
0
Entering edit mode

bamCoverage from deepTools would directly do this for you.

ADD REPLY
0
Entering edit mode

yes it does the job pretty well but I want if possible do my own program using pysam to do the same thing

ADD REPLY
0
Entering edit mode

Hi, I a have another question when I observed the fragments length some of them have a negative value like (-54). Why? I use pysam and I only took reads which are paired and correctly mapped.

ADD REPLY
1
Entering edit mode

Please see the SAM specification.

ADD REPLY
1
Entering edit mode
6.0 years ago

For recent versions of pysam and assuming you have PE reads it's simple enough to just use read.template_length to get the full fragment length. If you need to handle SE reads or older pysam versions then you can get slightly more elaborate as shown here. You'll then want to ignore one of the mates (unless they're not properly paired, which you may want to treat specially (in deepTools we would use the median population fragment length in that case)).

ADD COMMENT

Login before adding your answer.

Traffic: 3277 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6