I'm working with coding in c/c++ using the samtools suite.
Starting with the example 'calDepth.c' included in the samtools package, I've tried to make some modifications. Inside the pileup function I can insert these three lines of code:
int dist = (int)pos - (int)pl->b->core.pos; // need to find reference DNA at position // 1 for A, 2 for C, 4 for G, 8 for T and 15 for N uint8_t* seq_ptr = bam1_seq(pl->b); int base = bam1_seqi(seq_ptr,dist);
noting that as defined in the example code, pl is a bam_pileup1_t struct thus pl->b is a bam1_t struct and pl->b->core is a bam1_core_t struct all of which are defined in bam.h.
As expected, base stores the nucleotide. But this is a pileup at a given position with many reads which could have different nucleotides if we were looking at a mutation. So firstly, which read does the 'base' value represent? And more importantly, how can I get the nucleotides for each read?