Hi, this question is somewhat related to this previous question. I'm playing with the C API for BAM and I wrote the following code: https://gist.github.com/736059
I'm now trying to find the genomic positions of the bases covered by the CIGAR string (my final goal is to create a WIG file containing the coverage of the genome).
What would be the correct way to change my code to get the genomic positions of the bases covered by each CIGAR element?
(...)
for (k = 0; k < b->core.n_cigar; ++k)
{
    int cop = cigar[k] & BAM_CIGAR_MASK;  // operation (low 4 bits)
    int cl = cigar[k] >> BAM_CIGAR_SHIFT; // length (remaining high bits)
    switch (cop)
    {
        case BAM_CMATCH:     printf("M"); break;
        case BAM_CINS:       printf("I"); break;
        case BAM_CDEL:       printf("D"); break;
        case BAM_CREF_SKIP:  printf("N"); break;
        case BAM_CSOFT_CLIP: printf("S"); break;
        case BAM_CHARD_CLIP: printf("H"); break; // hard clip is 'H' in SAM, not 'R'
        case BAM_CPAD:       printf("P"); break;
        default:             printf("?"); break;
    }
    printf("%d", cl);
}
(...)
Thanks
Have you looked at bamtools: https://github.com/pezmaster31/bamtools ? It provides a very clean C++ BAM API.
(That noted, it won't help make this any easier.)
Yes, thanks, I saw it, but I want to learn the 'official' API and how the data should be handled.
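For what it's worth, the position tracking asked about in the question could be sketched roughly as follows. This is only a minimal illustration, not the samtools implementation: it relies on the standard BAM CIGAR encoding (operation in the low 4 bits, length in the rest) and on the usual rule that M, =, X, D and N advance along the reference while I, S, H and P do not. Every name prefixed with sk_ or SK_ is hypothetical and made up for this example so that it compiles without bam.h; with the real API you would presumably pass b->core.pos (0-based), the cigar array and b->core.n_cigar. Whether deleted bases (D) should count as covered is a judgment call; this sketch does not count them.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the constants in bam.h, so the sketch
   builds on its own. The numeric values follow the BAM CIGAR encoding. */
#define SK_CIGAR_SHIFT 4
#define SK_CIGAR_MASK  0xf
enum { SK_CMATCH = 0, SK_CINS = 1, SK_CDEL = 2, SK_CREF_SKIP = 3,
       SK_CSOFT_CLIP = 4, SK_CHARD_CLIP = 5, SK_CPAD = 6,
       SK_CEQUAL = 7, SK_CDIFF = 8 };

/* A covered stretch of the reference: 0-based, half-open [start, end). */
typedef struct { int32_t start; int32_t end; } sk_interval;

/* Pack a length and an operation the way BAM stores one CIGAR element. */
uint32_t sk_pack(uint32_t len, uint32_t op)
{
    return (len << SK_CIGAR_SHIFT) | op;
}

/* Walk the CIGAR starting at reference position pos and store one interval
   per aligned block (M, = or X). Returns the number of intervals stored;
   blocks beyond max_out are silently dropped. */
int sk_covered_blocks(int32_t pos, const uint32_t *cigar, int n_cigar,
                      sk_interval *out, int max_out)
{
    int n = 0;
    int32_t ref = pos; /* current position on the reference */
    assert(cigar != NULL && out != NULL);
    for (int k = 0; k < n_cigar; ++k) {
        int op = (int)(cigar[k] & SK_CIGAR_MASK);
        int32_t len = (int32_t)(cigar[k] >> SK_CIGAR_SHIFT);
        switch (op) {
        case SK_CMATCH:
        case SK_CEQUAL:
        case SK_CDIFF:
            /* read bases aligned to the reference: covered */
            if (n < max_out) {
                out[n].start = ref;
                out[n].end = ref + len;
                ++n;
            }
            ref += len;
            break;
        case SK_CDEL:
        case SK_CREF_SKIP:
            /* consumes reference, but no read base sits there */
            ref += len;
            break;
        default:
            /* I, S, H, P: do not move along the reference */
            break;
        }
    }
    return n;
}

/* Print the intervals, one per line, as "start<TAB>end". */
void sk_print_blocks(const sk_interval *blocks, int n)
{
    for (int i = 0; i < n; ++i)
        printf("%d\t%d\n", (int)blocks[i].start, (int)blocks[i].end);
}
```

For a read at 0-based position 100 with CIGAR 5S10M2I3D20M100N7M, this yields the blocks [100,110), [113,133) and [233,240). For a WIG file, each block would then increment a per-base counter, keeping in mind that WIG coordinates are 1-based.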