Problem iterating through BAM file using pysam
2
1
Entering edit mode
8.2 years ago
c9r1y ▴ 10

Hello,

I am trying to use a for loop in python to loop through a bam file and take out reads that map to certain chromosomes, for further downstream analysis. I am having problems in the beginning, seemingly easiest part of my code.

samfile = pysam.Samfile("mybam.bam", "rb")

positions=["7:151970856", "7:151970856"]

for p in positions:
    start = p.split(":")
    chr=start[0]
    pos=start[1]
    chromreads=[]
    for read in samfile:
    if int(chr) == int(read.tid):
        chromreads.append(read)
    print(len(chromreads))

Output I'm getting:

2525527
0

Output I expect:

2525527
2525527

It seems like it's not even looping through the second position. What am I doing wrong?

Thank you.

pysam python bam sam forloop • 4.8k views
ADD COMMENT
1
Entering edit mode
8.2 years ago
c9r1y ▴ 10

Answered my own question from StackOverflow

ADD COMMENT
1
Entering edit mode
8.2 years ago

You're iterating through the entire Samfile object, consuming it on the first pass of your inner for loop. When you arrive at the second iteration of your outer for loop, there is nothing left in the Samfile iterator and so it raises StopIteration which causes the inner loop to exit.

ADD COMMENT

Login before adding your answer.

Traffic: 2748 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6