PF_INDEL_RATE calculation and definition
18 months ago
bitpir ▴ 240

Hello,

I was hoping someone could verify how PF_INDEL_RATE is calculated in AlignmentSummaryMetrics.

The definition says: "PF_INDEL_RATE: The number of insertion and deletion events per 100 aligned bases. Uses the number of events as the numerator, not the number of inserted or deleted bases".

The code on GitHub for calculating PF_INDEL_RATE is this:

metrics.PF_INDEL_RATE = MathUtil.divide(indels, (double) metrics.PF_ALIGNED_BASES);


I can't find anything in the code that suggests the rate is "per 100 aligned bases". I just want to verify which reading is correct: in a sample with PF_ALIGNED_BASES of 115,694,467, does a PF_INDEL_RATE of 0.0002 mean 2 indel events in every 1,000,000 aligned bases (0.0002 per 100 aligned bases), or about 23,139 indel events across the 115,694,467 aligned bases (0.0002 * 115694467)?
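To make the two readings concrete, here is a small Python sketch of the arithmetic. The function name `expected_indel_events` and its `per_bases` parameter are made up for illustration and are not part of Picard; the inputs are the figures quoted above.

```python
def expected_indel_events(rate, aligned_bases, per_bases=1):
    """Indel events implied by a rate expressed per `per_bases` aligned bases.

    Hypothetical helper for illustration only; not a Picard function.
    """
    return rate * aligned_bases / per_bases


aligned = 115_694_467   # PF_ALIGNED_BASES from the question
rate = 0.0002           # PF_INDEL_RATE from the question

# Reading 1: the rate is events per aligned base (what the code line suggests).
per_base = expected_indel_events(rate, aligned)

# Reading 2: the rate is events per 100 aligned bases (what the definition says).
per_100 = expected_indel_events(rate, aligned, per_bases=100)

print(round(per_base), round(per_100))
```

The two readings differ by a factor of 100, so a rough indel count from the BAM file should be enough to tell them apart.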

Thanks!

picard AlignmentSummaryMetrics picardmetrics • 644 views
18 months ago
bitpir ▴ 240

I think I just answered my own question. As I suspected, PF_INDEL_RATE is calculated exactly as the code is written, i.e. per aligned base (not events per 100 aligned bases):

#indel events / PF_ALIGNED_BASES


I did a quick check by counting indels from the CIGAR strings:

samtools view <bam> | cut -f6 | grep -c "[ID]"
# I = insertion; D = deletion
# Counts reads whose CIGAR contains at least one I or D, so it can undercount indel events.


My indel count came to 39,262, and from AlignmentSummaryMetrics my PF_ALIGNED_BASES was 235,464,581, which gives an indel rate of 39,262 / 235,464,581 ≈ 0.00017. That is close to the reported PF_INDEL_RATE (pair) of 0.00018.
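For anyone who wants to count individual indel events rather than reads, here is a rough Python sketch that parses CIGAR strings directly. The helper names are hypothetical, and treating M, = and X operations as "aligned bases" is my assumption; Picard's PF_ALIGNED_BASES also applies filtering (PF reads and so on) that this sketch ignores, so expect only an approximate match.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def count_indel_events(cigar):
    """Number of I or D operations in one CIGAR string (events, not bases)."""
    return sum(1 for _, op in CIGAR_OP.findall(cigar) if op in "ID")


def count_aligned_bases(cigar):
    """Bases in M, = or X operations (assumed here to mean 'aligned')."""
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "M=X")


def indel_rate(cigars):
    """Indel events per aligned base over an iterable of CIGAR strings."""
    cigars = list(cigars)
    events = sum(count_indel_events(c) for c in cigars)
    bases = sum(count_aligned_bases(c) for c in cigars)
    return events / bases if bases else 0.0


# Toy input; "*" is what SAM uses for an unavailable CIGAR.
example = ["50M", "20M1I29M", "10M2D15M1I24M", "*"]
print(indel_rate(example))

# Sanity check of the figures quoted in this post.
print(round(39262 / 235464581, 5))
```

In practice the CIGAR strings could be piped in from `samtools view <bam> | cut -f6` and read line by line from standard input.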

Hope this helps!