Hi everyone,
I want to count the reads per allele using pysam.The script I have pasted here is only for a variant,but I am doing for many variants in the main script.I want to filter out the reads that overlap,so that I am not double counting them.Basically I want to check the difference in the count of reads per allele with vs without overlap.This can not be done using pysam functionality,so I am trying to do something without it.Below is my script:
#! /usr/bin/python import sys import random import pysam samfile = pysam.Samfile("f1.bam", "rb") out = open('outfile', 'w') f= open("f2.vcf", "r") ref_counter=0 alt_counter=0 ref='G' alt='A' total=0 for pileupcolumn in samfile.pileup( 'chr1',881626,881628): if int(pileupcolumn.pos) == 881626: if int(pileupcolumn.n) <= 300: for pileupread in pileupcolumn.pileups: if str(pileupread.alignment.seq[pileupread.qpos]) == str(ref): ref_counter = float(ref_counter) + 1 elif str(pileupread.alignment.seq[pileupread.qpos]) == str(alt): alt_counter = float(alt_counter) + 1 total=ref_counter+alt_counter print(str(ref) + '\t' + str(ref_counter) +'\t' + str(alt) + '\t' + str(alt_counter) + '\t' + str(total) + '\n')
Any suggestions?
I reformatted your post so that it should now display correctly. While I think the source code is properly formatted, you might want to double check.