My goal is to evaluate the Sensitivity/Specificity of an indel detection method.
I have a "gold standard" VCF file (ref.vcf) that states exactly where the insertions and deletions are in my genome. And of course, my indel detection method produces its own VCF file (let's call it test.vcf).
To calculate the True Positives, I take the intersection of test.vcf and ref.vcf (I use exact intersection for the sake of simplicity for now). The False Positives are the features in test.vcf that are not in ref.vcf, and the False Negatives are the features in ref.vcf that are not in test.vcf.
But how would you calculate the True Negatives? I can't just use the number of positions left over (the number is far too big!).
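The exact-match comparison described above can be sketched in Python. This is only a minimal sketch, assuming both files are uncompressed, tab-separated VCF with explicit sequence alleles, that an indel is any ALT allele whose length differs from REF, and that both files use the same normalization; `indel_keys` and `confusion_counts` are hypothetical helper names.

```python
def indel_keys(vcf_lines):
    """Collect (CHROM, POS, REF, ALT) keys for indel alleles in VCF text lines."""
    keys = set()
    for line in vcf_lines:
        # Skip blank lines and header/meta lines.
        if not line.strip() or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.rstrip("\n").split("\t")[:5]
        for allele in alt.split(","):
            # Assumption: an allele is an indel when its length differs from REF.
            if len(allele) != len(ref):
                keys.add((chrom, int(pos), ref, allele))
    return keys


def confusion_counts(ref_keys, test_keys):
    """Return (TP, FP, FN) for an exact-match comparison of two key sets."""
    tp = len(test_keys & ref_keys)   # called and in the gold standard
    fp = len(test_keys - ref_keys)   # called but not in the gold standard
    fn = len(ref_keys - test_keys)   # in the gold standard but missed
    return tp, fp, fn
```

With real files, something like `indel_keys(open("ref.vcf"))` and `indel_keys(open("test.vcf"))` would supply the two sets.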
You can use the Positive Predictive Value (thanks Casey for clearing the definition up).
PPV = TP/(TP + FP)
instead of the Specificity:
Sp = TN/(TN+FP)
This has been used in eukaryotic gene prediction, where a similar situation arises: if you look for coding regions on a per-nucleotide basis, the vast majority of the genome is non-coding. PPV has the advantage of avoiding the extremely large TN values that push Sp close to 1 in most cases.
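To illustrate why Sp is uninformative here, the two formulas can be compared on made-up counts. This is a rough sketch: the counts are hypothetical, and the TN value simply assumes a genome of about 3 billion positions in which nearly every position is a true negative.

```python
def ppv(tp, fp):
    """Positive Predictive Value: TP / (TP + FP)."""
    return tp / (tp + fp)


def specificity(tn, fp):
    """Specificity: TN / (TN + FP)."""
    return tn / (tn + fp)


# Hypothetical counts: same TP, but the second caller has 10x more FP.
tn = 3_000_000_000  # assumed number of non-indel positions
good_caller = {"tp": 900, "fp": 100}
noisy_caller = {"tp": 900, "fp": 1000}

for name, c in (("good", good_caller), ("noisy", noisy_caller)):
    print(name, "PPV =", ppv(c["tp"], c["fp"]),
          "Sp =", specificity(tn, c["fp"]))
```

PPV drops sharply for the noisy caller, while Sp stays within a fraction of a millionth of 1 for both.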
I agree with the comment above: that number really is your True Negative count. And yes, it will be an absurdly large number, depending on your dataset. What you will want to do is look beyond simply calculating sensitivity and specificity. In cases where the classes are unbalanced (indel vs. no indel in this case), you want to start looking at something like the F1-score or the Matthews Correlation Coefficient as a better summary statistic for your comparisons.
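Both statistics can be computed directly from the four counts. The following is a minimal sketch using the standard definitions; the function names are hypothetical, and returning 0.0 when a denominator is zero is a convention assumed here, not part of the definitions.

```python
import math


def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); does not need TN at all."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp, fp, fn, tn):
    """Matthews Correlation Coefficient from the full confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Assumed convention: return 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts; TN assumes roughly 3 billion non-indel positions.
print(f1_score(900, 100, 100))
print(mcc(900, 100, 100, 3_000_000_000))
```

Note that F1 ignores TN entirely, and when TN is very large the MCC approaches sqrt(PPV * sensitivity), so both stay informative for this kind of unbalanced data.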
Another way to analyze the data is to construct ROC or Precision-Recall curves, so you can see how specificity and sensitivity interact with one another.
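A Precision-Recall curve needs a confidence score per call, which the question does not mention. As a sketch only, assuming each call in test.vcf carries a score (for example its QUAL value), that scores are distinct, and that each call has already been labeled as present or absent in ref.vcf, the curve points could be computed like this (`precision_recall_points` is a hypothetical name):

```python
def precision_recall_points(calls, n_ref):
    """Return (score, precision, recall) after each call, best score first.

    calls: list of (score, in_ref) pairs, in_ref being True for a call
           that matches the gold standard.
    n_ref: total number of indels in the gold standard (TP + FN).
    """
    points = []
    tp = fp = 0
    # Lower the threshold one call at a time, from most to least confident.
    for score, in_ref in sorted(calls, key=lambda c: c[0], reverse=True):
        if in_ref:
            tp += 1
        else:
            fp += 1
        points.append((score, tp / (tp + fp), tp / n_ref))
    return points
```

Plotting recall against precision from these points gives the curve; unlike ROC, it never requires the True Negative count.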