How to iterate through paired-end reads in pysam?
9 weeks ago
zt10122 ▴ 20

I want to process both reads of a paired-end read at the same time, but I don't know how to do this efficiently. If I use `for read in bam_file:` together with `bam_file.mate(read)`, each read is iterated twice. Is there an efficient way to iterate through paired-end reads in pysam?

bam sam samtools pysam bioinformation • 260 views

each read is iterated twice,

each _alignment_

is there an efficient way to iterate through paired-end reads in pysam?

Sort your BAM by read name and remove secondary and supplementary alignments.

9 weeks ago

The cheap answer to this is "no". It's a massive headache for many people who write software that processes BAM files. It's not pysam's fault; there simply isn't a good way of doing it on an arbitrarily sorted or position-sorted BAM file.
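To illustrate why this is a headache, here is a minimal sketch of the usual workaround on a file that is not name-sorted: hold each alignment in memory until its mate turns up. The function name `iter_pairs_buffered` is hypothetical, not part of pysam; the attributes it reads (`query_name`, `is_paired`, `is_secondary`, `is_supplementary`, `is_read1`) are standard `pysam.AlignedSegment` attributes. It assumes at most one primary alignment per mate, and the buffer can grow large when mates lie far apart or are missing from the file.

```python
def iter_pairs_buffered(alignments):
    """Yield (read1, read2) tuples from alignments in any order.

    `alignments` is any iterable of pysam.AlignedSegment-like objects,
    for example an open pysam.AlignmentFile.
    """
    pending = {}  # read name -> first mate seen, waiting for its partner
    for aln in alignments:
        # Only primary alignments of paired reads are considered here.
        if not aln.is_paired or aln.is_secondary or aln.is_supplementary:
            continue
        mate = pending.pop(aln.query_name, None)
        if mate is None:
            # First mate seen: keep it until the other one arrives.
            pending[aln.query_name] = aln
        elif aln.is_read1:
            yield aln, mate
        else:
            yield mate, aln
    # Anything left in `pending` never had its mate show up and is dropped.
```

Each alignment is read once, but memory use depends on how far apart mates are in the file, and secondary or supplementary alignments are simply ignored.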

The slightly more useful answer, as suggested by @Pierre Lindenbaum, is to sort your BAM by name. You can then collect all alignments for a given read by iterating until the read name changes. However, be aware that if you have multiple alignments per read, then it can get difficult to work out which alignment goes with which pair (hence @Pierre's suggestion to remove secondary and supplementary alignments). But sometimes you need the secondary alignments, and sometimes you need the file position-sorted, in which case you are basically stuffed.
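The name-sorted approach can be sketched roughly as follows, assuming the BAM has already been sorted by read name (for example with `samtools sort -n`). The helper names `iter_pairs` and `iter_pairs_from_bam` are hypothetical; secondary and supplementary alignments are skipped while reading rather than removed from the file beforehand.

```python
from itertools import groupby


def iter_pairs(alignments):
    """Yield (read1, read2) tuples from name-grouped alignments.

    `alignments` is any iterable of pysam.AlignedSegment-like objects in
    which all alignments of a read name are adjacent (a name-sorted BAM).
    """
    # Keep primary alignments only, so each mate appears at most once.
    primary = (
        aln for aln in alignments
        if not aln.is_secondary and not aln.is_supplementary
    )
    # Collect alignments until the read name changes.
    for _name, group in groupby(primary, key=lambda aln: aln.query_name):
        read1 = read2 = None
        for aln in group:
            if aln.is_read1:
                read1 = aln
            elif aln.is_read2:
                read2 = aln
        # Reads with only one mate present are skipped.
        if read1 is not None and read2 is not None:
            yield read1, read2


def iter_pairs_from_bam(path):
    """Open a name-sorted BAM with pysam and yield its read pairs."""
    import pysam  # imported here so iter_pairs itself does not require pysam

    with pysam.AlignmentFile(path, "rb") as bam:
        # until_eof=True reads the file in order and needs no index.
        yield from iter_pairs(bam.fetch(until_eof=True))
```

A loop such as `for read1, read2 in iter_pairs_from_bam("name_sorted.bam"):` (the file name is a placeholder) then sees each pair exactly once, with only one pair held in memory at a time.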
