I have a number of Human Heart RNASeq samples. I have generated a mpileup file on each of them using a bed file containing a list of positions for which I want the mpileup to be made. I have included the -B option to turn off the BAQ computation in mpileup.
Single sample mpileup:
samtools mpileup -uf hg19.fa -B -l snps_chr17.bed sample1.bam > sample1.mpileup
Multiple sample mpileup (all samples):
samtools mpileup -uf hg19.fa -B -l snps_chr17.bed -b bam_files_list.bam > all_samples.mpileup
Now, I want to output the frequency of nucleotides A, T, G and C at each of those positions. There is a perl script pileup2baseindel which generates exactly the output that I want, only that it is incorrect (inconsistent with what is seen in IGV). This program takes a single/multi-sample mpileup file and produces the following output. But like I said, the output is incorrect.
I have searched a lot but can't seem to find any program that does the same. Does anyone know any tool/script that would give me such an output with/without taking the mpileup file as an input. Suggestions on alternatives/options are welcomed.
[UPDATE] There was a bug in the perl script and I have updated the code, it works fine now. I have contacted the author and as soon as the author responds, I will send it to him so that he can update it. I would still like to get suggestions and try out other programs.
[UPDATE] Thanks everyone who helped me with this problem. Each of your suggestions worked but I accepted Chris Miller's answer recommending the use of bam-readcount because of its easy of use and the elaborate output that it generates. It directly takes, as input, a bam file as well as a list of positions in bed format for which you want the nucleotide frequency. So you don't need to create an intermediate mpileup file - as in pileup2baseindel or run the program against positions that you are not interested in - as in pysamstats.