I've been attempting to use pysam to find how many reads support a particular SNP vs how many reads do not. I have not been able to exactly pinpoint what pysam pileup is doing. Is it exactly like the samtools function?
For example, to compare what exactly is focused on for each function, I've run this...
for reads in bam.fetch(chrom, start = pos, stop = pos + 1): fetch_reads.append(reads.cigarstring) for refsection in bam.pileup(chrom, start = pos, stop = pos + 1): for reads in refsection.pileups: pileup_reads.append(reads.alignment.cigarstring) if not reads.is_refskip: pileup_reads_norefskip.append(reads.alignment.cigarstring) print(len(fetch_reads)) print(len(pileup_reads)) print(len(pileup_reads_norefskip))
And I get wildly different numbers for what I think are the number of reads at the particular site I'm interested in. 81,157492, and 4856 respectively. Also, when I try to append get_num_aligned(reads) to a list (supposed to return the number of aligned bases at a certain pileup column position), pysam gives me back a list of lists.