I currently have variant data per sample and the corresponding RNA seq BAMs per sample. What I'm trying to do is tap into the corresponding bam files, determine the number of reads associated with the variant, and then deduce the %ref and %alt alleles in those reads.
So for instance, I have a variant of interest at position
chr2:86042011, where the ref allele is CCAGA and the alt allele is C, and let's say there are 10 reads that cover that variant position - I want to know how many of those reads have the ref allele and how many have the alt allele.
I'm currently trying to use
samtools mpileup -I [bed file] [bam]. So taking the above example again, I feed a BED file that looks like:
chr2 86042010 86042012
The output I then get from running that command is:
chr2 86042011 N 2 CC sJ chr2 86042012 N 2 CC sJ
One of my confusions is how you know how many reads cover your variant. For instance, should my BED file intervals be greater? Am I doing this right?