Question: Nucleotide distribution at each position in a .sam/.bam file?
I'm trying to extract the nucleotide distribution at each position from a .sam/.bam file.

I'm not looking for the total depth of coverage (that can be done with GATK or samtools), but for the depth of coverage of each nucleotide at each position in my alignment file (bam/sam/...).

How can I do that?

Do you know of a tool that can do this?

Thanks in advance for any answer/suggestion.


Duplicate of "Coverage In Bam File - Bases And Overall Count".

written 3.9 years ago by Pierre Lindenbaum
Have a look at pysamstats, executed as:

pysamstats -f ref.fa --type variation_strand aln.bam > aln.var.txt

It will give the count of A, C, G, T, insertions and deletions at each position in the reference (is this what you are after?).

If you want to parse the 5th column of samtools mpileup yourself, take care: besides the read bases, it also contains mapping qualities (the character following each ^ read-start marker) and the sequences of insertions and deletions. Simply counting the occurrences of A, C, G and T will therefore give slightly incorrect results (I think the answer Pierre links to has this problem).
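If you do go the mpileup route, here is a minimal sketch of how the base string could be parsed so that mapping-quality characters and indel sequences are skipped instead of counted. It assumes single-sample text output from something like samtools mpileup -f ref.fa aln.bam; the function name count_pileup_bases and the 'ins'/'del' keys are made up for illustration, and I have not checked it against every samtools version.

```python
from collections import Counter

def count_pileup_bases(ref_base, bases):
    """Count bases in one mpileup read-bases string (5th column).

    Hypothetical helper, not part of samtools or pysamstats.
    '.' and ',' are counted as the reference base; indel events are
    counted once each under 'ins' / 'del'; '*' (or '#') marks a read
    that has a deletion spanning this position.
    """
    counts = Counter()
    ref = ref_base.upper()
    i, n = 0, len(bases)
    while i < n:
        c = bases[i]
        if c == '^':
            # Read start: the next character encodes the mapping
            # quality and must not be counted as a base.
            i += 2
            continue
        if c == '$':
            # Read end marker.
            i += 1
            continue
        if c in '+-':
            # Indel: +<len><seq> or -<len><seq>; skip the whole sequence.
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            counts['ins' if c == '+' else 'del'] += 1
            i = j + length
            continue
        if c in '.,':
            counts[ref] += 1
        elif c in '*#':
            counts['*'] += 1
        elif c.upper() in 'ACGTN':
            counts[c.upper()] += 1
        # '<' and '>' (reference skips) are ignored.
        i += 1
    return counts

if __name__ == '__main__':
    # Made-up pileup line: chrom, pos, ref, depth, bases, qualities.
    line = 'chr1\t100\tA\t8\t..,^F.Tt$+2AGc-1ag*\tIIIIIIII'
    chrom, pos, ref, depth, bases = line.split('\t')[:5]
    print(chrom, pos, dict(count_pileup_bases(ref, bases)))
```

Note how a read starting with mapping-quality character 'A' (as in ^A.) would add a spurious A under naive counting; here it is skipped.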

