Find Genotype concordance
6.3 years ago
venks ▴ 730

Hi,

I need to compare variant calls from two different platforms: exome sequencing and the OMNI chip.

I found the genotype concordance per sample (sample-based genotype concordance) using GATK GenotypeConcordance. The output is shown below; it gives the overall genotype concordance per sample.

Sample     Non-Ref_Sensitivity    Non-Ref_Discrepancy    Overall_Genotype_Concordance

sample1    0.923                  0.076                  0.997
sample2    0.930                  0.082                  0.997
sample3    0.914                  0.087                  0.996


My question is: how can I find the genotype concordance per SNP/variant? My output would look something like this:

    SNP            Overall_Genotype_Concordance

SNP1                     0.986
SNP2                     0.97
SNP3                     0.994


I tried GATK's VariantEval tool, but it gives only the overall mean concordance rate across all SNPs, whereas I would like to see the concordance rate for every SNP.
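To make the target concrete, here is a minimal Python sketch of a per-SNP concordance rate. It assumes the genotypes from both platforms have already been extracted into nested dicts keyed by SNP and then by sample; the function names and the toy data are hypothetical and are not part of GATK or any other tool. For each SNP, it counts the samples called on both platforms and reports the fraction whose genotypes agree.

```python
def normalize(gt):
    """Turn a VCF-style genotype string into an order-independent tuple.

    Returns None for missing calls (None, "./." or any "." allele).
    """
    if gt is None:
        return None
    alleles = gt.replace("|", "/").split("/")
    if "." in alleles:
        return None
    return tuple(sorted(alleles))


def per_snp_concordance(calls_a, calls_b):
    """Return {snp: fraction of shared, called samples with equal genotypes}.

    calls_a and calls_b map snp -> {sample: genotype string}.
    SNPs with no sample called on both platforms are left out.
    """
    rates = {}
    for snp in calls_a.keys() & calls_b.keys():
        matched = compared = 0
        for sample in calls_a[snp].keys() & calls_b[snp].keys():
            ga = normalize(calls_a[snp][sample])
            gb = normalize(calls_b[snp][sample])
            if ga is None or gb is None:
                continue  # skip samples missing on either platform
            compared += 1
            if ga == gb:
                matched += 1
        if compared:
            rates[snp] = matched / compared
    return rates


# Toy data (made up for illustration only).
exome = {
    "SNP1": {"s1": "0/1", "s2": "1/1", "s3": "0/0"},
    "SNP2": {"s1": "0/0", "s2": "0/1", "s3": "./."},
}
omni = {
    "SNP1": {"s1": "1/0", "s2": "1/1", "s3": "0/1"},
    "SNP2": {"s1": "0/0", "s2": "0/1", "s3": "0/0"},
}

rates = per_snp_concordance(exome, omni)
for snp in sorted(rates):
    print(snp, round(rates[snp], 3))
```

Whether missing calls should be skipped (as here) or counted as discordant is a design choice; the sketch skips them so that the rate reflects only genotypes called on both platforms.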

SNP gatk genome sequencing next-gen • 4.7k views
6.2 years ago
venks ▴ 730

I found an answer. Thought this might help someone!

You can use SnpSift to do this:

java -jar SnpSift.jar concordance -v file1.vcf file2.vcf > concordance.txt

will do the job for you.

This gives the concordance and discordance counts per SNP and also per sample.

You can find the documentation here

http://snpeff.sourceforge.net/SnpSift.html#concordance
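Since the tool reports counts rather than a rate, a small post-processing step is needed to get a per-SNP concordance rate. The sketch below is one possible way to do that in Python, under stated assumptions: it reads a generic tab-delimited table with a header row, and the caller supplies which columns hold concordant and discordant counts. The column names in the demo (`snp`, `same_gt`, `diff_gt`) are hypothetical placeholders, not SnpSift's actual header; check the real output file and the documentation above for the correct names.

```python
import csv
import io


def rates_from_counts(handle, id_col, concordant_cols, discordant_cols):
    """Compute concordant / (concordant + discordant) for each row.

    handle: open text file (tab-delimited, with a header line).
    id_col: name of the column that identifies the SNP.
    concordant_cols / discordant_cols: lists of count column names.
    Rows with zero compared genotypes get None instead of a rate.
    """
    rates = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        same = sum(int(row[c]) for c in concordant_cols)
        diff = sum(int(row[c]) for c in discordant_cols)
        total = same + diff
        rates[row[id_col]] = same / total if total else None
    return rates


# Made-up table with hypothetical column names, for illustration only.
demo = io.StringIO(
    "snp\tsame_gt\tdiff_gt\n"
    "SNP1\t98\t2\n"
    "SNP2\t45\t5\n"
    "SNP3\t0\t0\n"
)
snp_rates = rates_from_counts(demo, "snp", ["same_gt"], ["diff_gt"])
for snp, rate in snp_rates.items():
    print(snp, rate)
```

For a real file, replace the `io.StringIO` demo with `open("concordance.txt")` and pass the actual count column names found in its header.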


Hi,

I ran SnpSift concordance and got this output, for example, for one subject. Can you please help me understand which of the variables here are the most important for determining a subject's genotype concordance between two VCF files?

6.3 years ago

Dr. Lindenbaum,

I tried GATK GenotypeConcordance, but it gives summary statistics for the concordance rate across samples. I am looking for a summary of the concordance rate across SNPs. Is there an alternative way to do this?

Thank you, your help is much appreciated.