PF_INDEL_RATE calculation and definition
8 months ago
bitpir ▴ 230


I was hoping someone could verify how PF_INDEL_RATE is calculated in AlignmentSummaryMetrics.

The definition says: "PF_INDEL_RATE: The number of insertion and deletion events per 100 aligned bases. Uses the number of events as the numerator, not the number of inserted or deleted bases".

The code on GitHub that calculates PF_INDEL_RATE is this:

metrics.PF_INDEL_RATE = MathUtil.divide(indels, (double) metrics.PF_ALIGNED_BASES);

I can't find anything in the code that suggests the rate is "per 100 aligned bases". I just want to verify that, in a sample with PF_ALIGNED_BASES of 115694467, a PF_INDEL_RATE of 0.0002 would mean 2 indel events per 1,000,000 aligned bases (0.0002 per 100 aligned bases), and not 23,139 indel events across all 115694467 aligned bases (0.0002 * 115694467).
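To make the two readings concrete, here is a minimal sketch that turns the reported rate into a total event count under each interpretation. The numbers are the ones from this question; the variable names are just illustrative.

```shell
#!/bin/sh
# Values taken from the question above.
aligned_bases=115694467
rate=0.0002

# Reading A: the rate is "events per 100 aligned bases" (as the definition says).
events_if_per_100=$(awk -v r="$rate" -v n="$aligned_bases" \
    'BEGIN { printf "%.0f", r / 100 * n }')

# Reading B: the rate is "events per aligned base" (as the code suggests).
events_if_per_base=$(awk -v r="$rate" -v n="$aligned_bases" \
    'BEGIN { printf "%.0f", r * n }')

echo "per 100 bases: $events_if_per_100 indel events in total"
echo "per base:      $events_if_per_base indel events in total"
```

The two readings differ by a factor of 100 (roughly 231 versus 23,139 events here), so an independent indel count from the BAM should be enough to tell them apart.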


picard AlignmentSummaryMetrics picardmetrics • 467 views
8 months ago
bitpir ▴ 230

I think I just answered my own question. As I suspected, PF_INDEL_RATE is calculated exactly as the code is written, i.e. as events per aligned base and not events per 100 aligned bases:

PF_INDEL_RATE = (number of indel events) / PF_ALIGNED_BASES

As a quick check, I counted indels from the CIGAR strings:

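samtools view <bam> | cut -f6 | grep -c "[ID]"
# I = insertion; D = deletion
# Note: this counts reads whose CIGAR contains at least one I or D,
# so a read with several indels is only counted once.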

My indel count came to 39262, and the PF_ALIGNED_BASES reported in AlignmentSummaryMetrics was 235464581, which gives an indel rate of ~0.00017 (39262 / 235464581). That is close to the reported PF_INDEL_RATE (pair) of 0.00018.
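One caveat on the count: the one-liner above counts reads that contain an indel, while the metric definition talks about indel events. Below is a rough sketch of counting individual I/D operations instead. The function names are my own (hypothetical helpers, not part of samtools or Picard), and `grep -o` is assumed to be available, which it is in GNU and BSD grep.

```shell
#!/bin/sh
# Hypothetical helper: reads CIGAR strings on stdin (one per line) and
# prints the number of I/D operations, i.e. individual indel events.
count_indel_events() {
    # grep -o prints every match (e.g. "2I", "1D") on its own line.
    grep -o '[0-9][0-9]*[ID]' | wc -l | tr -d ' '
}

# Hypothetical helper: prints the number of CIGAR strings that contain
# at least one I or D, which is what the original one-liner counts.
count_reads_with_indel() {
    grep -c '[ID]'
}

# Tiny made-up example: three reads, one of them with two indel events.
example_cigars='10M2I5M1D20M
50M
5M1I5M'

printf '%s\n' "$example_cigars" | count_indel_events      # events
printf '%s\n' "$example_cigars" | count_reads_with_indel  # reads with an indel
```

On a real file this would be used as `samtools view <bam> | cut -f6 | count_indel_events`. Counting events rather than reads should push the number up a little, which may account for part of the gap between 0.00017 and 0.00018, although Picard may also apply its own read filters, so an exact match is not guaranteed.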

Hope this helps!

