A paper has been published in Genome Medicine describing how genotyping accuracy can be estimated by comparing variant data with data from large-scale sequencing projects such as the 1000 Genomes Project: "Estimating Exome Genotyping Accuracy by Comparing to Data from Large Scale Sequencing Projects".
Quality metrics are now available at GeneTalk for all uploaded VCF files. V. Heinrich developed a metrics algorithm that compares variant data of the uploaded VCF file against a matching population group from the 1000 Genomes Project.
In the GeneTalk file manager, click the quality button to see the genotyping accuracy plotted in two graphics:
The plots show how well a sample fits into a population group of the 1000 Genomes Project data (Non-Metric MDS) and give the estimated genotyping accuracy in percent (Genotyping Accuracy).
With exome sequencing becoming a tool for mutation detection in routine diagnostics, there is an increasing need for platform-independent methods of quality control. We present a genotype-weighted metric that allows comparing all the variant calls of an exome to a high-quality reference dataset of an ethnically matched population. The exome-wide genotyping accuracy is estimated from the distance to this reference set, and does not require any further knowledge about data generation or the bioinformatics involved. The distances of our metric are visualized by non-metric multidimensional scaling and serve as a standardizable score for the quality assessment of exome data.
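To make the idea of "distance to a reference set" more concrete, here is a minimal Python sketch. It does not reproduce the genotype-weighted metric from the paper: it swaps in a plain mean absolute difference of alternate-allele counts, and the function names, the genotype encoding (0, 1, 2) and the toy data are all assumptions made up for illustration.

```python
from statistics import mean

# Assumed encoding: each sample maps a site key ("chrom:pos") to the number
# of alternate alleles called there (0, 1 or 2). This is NOT the paper's metric.
def genotype_distance(a, b):
    """Mean absolute difference in alt-allele count, scaled to the range 0..1."""
    sites = set(a) | set(b)
    if not sites:
        return 0.0
    # A site missing from a call set is treated as homozygous reference (0).
    return sum(abs(a.get(s, 0) - b.get(s, 0)) for s in sites) / (2 * len(sites))

def closest_population(sample, reference):
    """Return (population, mean distance) for the reference group nearest the sample."""
    scores = {
        pop: mean(genotype_distance(sample, ref) for ref in members)
        for pop, members in reference.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]

# Toy data, invented for illustration only.
reference = {
    "FIN": [{"1:100": 1, "1:200": 2}, {"1:100": 1, "1:200": 1}],
    "YRI": [{"1:300": 2, "1:400": 1}, {"1:300": 1, "1:400": 2}],
}
sample = {"1:100": 1, "1:200": 2}

print(closest_population(sample, reference))
```

In this toy setup the sample lands closest to the FIN group. A larger distance to the best-matching group would correspond to a lower estimated genotyping accuracy, although the actual mapping from distance to accuracy is defined in the paper and is not reproduced here.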
Comments on the paper and on the quality plots in GeneTalk are appreciated. Visit Gene-Talk.de, register for free and upload your VCF file to estimate its genotyping accuracy.
And this is what an exome with poor genotyping accuracy would look like. The low accuracy would be due either to too few common variants or to too many rare variants being detected in this exome compared to the FIN population group (which would still be the best-matching population for the sample's exome data).
GeneTalk now provides users with an extended quality report for every VCF file (with more than 10,000 lines) that is uploaded to their free account. It is a premium feature that is open to free user accounts until the end of this year.
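If you are unsure whether a file is large enough for the report, a quick local check along the following lines may help. This is only a sketch: the helper names are hypothetical, and it assumes the threshold applies to all non-blank lines of the file, header included, which the announcement does not spell out.

```python
import gzip

MIN_LINES = 10_000  # threshold quoted in the announcement

def open_vcf(path):
    """Open a plain or gzip-compressed VCF file as text."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")

def count_vcf_lines(lines):
    """Return (total non-blank lines, variant records); header lines start with '#'."""
    total = records = 0
    for line in lines:
        if not line.strip():
            continue  # ignore blank lines
        total += 1
        if not line.startswith("#"):
            records += 1
    return total, records

def qualifies_for_report(lines, min_lines=MIN_LINES):
    """Assumption: the threshold counts every non-blank line, header included."""
    total, _ = count_vcf_lines(lines)
    return total > min_lines
```

With a file on disk this could be used as `with open_vcf("sample.vcf.gz") as fh: print(qualifies_for_report(fh))`, where `sample.vcf.gz` is a placeholder file name.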
Upload a VCF file to your account, get a coffee, and take a look at the quality plot!