Hi everyone,
I want to count the reads per allele using pysam.The script I have pasted here is only for a variant,but I am doing for many variants in the main script.I want to filter out the reads that overlap,so that I am not double counting them.Basically I want to check the difference in the count of reads per allele with vs without overlap.This can not be done using pysam functionality,so I am trying to do something without it.Below is my script:
#! /usr/bin/python
import sys
import random
import pysam
samfile = pysam.Samfile("f1.bam", "rb")
out = open('outfile', 'w')
f= open("f2.vcf", "r")
ref_counter=0
alt_counter=0
ref='G'
alt='A'
total=0
for pileupcolumn in samfile.pileup( 'chr1',881626,881628):
if int(pileupcolumn.pos) == 881626:
if int(pileupcolumn.n) <= 300:
for pileupread in pileupcolumn.pileups:
if str(pileupread.alignment.seq[pileupread.qpos]) == str(ref):
ref_counter = float(ref_counter) + 1
elif str(pileupread.alignment.seq[pileupread.qpos]) == str(alt):
alt_counter = float(alt_counter) + 1
total=ref_counter+alt_counter
print(str(ref) + '\t' + str(ref_counter) +'\t' + str(alt) + '\t' + str(alt_counter) + '\t' + str(total) + '\n')
Any suggestions?
I reformatted your post so that it should now display correctly. While I think the source code is properly formatted, you might want to double check.