Question: Evaluate the sensitivity and specificity of vcf files
jackycsie wrote (12 months ago):

Hi,

I have finished variant calling on NA12878.

My steps are as follows:

  1. bwa mem
  2. SortSamSpark
  3. MarkDuplicatesSpark
  4. BaseRecalibratorSpark
  5. ApplyBQSRSpark
  6. HaplotypeCaller

Now I have a VCF file, but I don't know how to assess its accuracy.

My reference data is:

  1. dbsnp_138.b37.vcf
  2. Mills_and_1000G_gold_standard.indels.b37.vcf
  3. 1000G_phase1.indels.b37.vcf

Thank you, jacky.


Hi,

I found this URL:

https://ftp-trace.ncbi.nlm.nih.gov/giab/ftp/release/NA12878_HG001/NISTv3.3.2/GRCh37/

Is this the right data set, or does it only provide chromosome 1?

Thanks, Jacky.

written 12 months ago by jackycsie

I think this is better suited for the GATK forums. There's this post that might be a good starting point: https://gatkforums.broadinstitute.org/gatk/discussion/6308/evaluating-the-quality-of-a-variant-callset

written 12 months ago by Mark

Although chr1 is one of the largest chromosomes, you've already done the hardest part, which is calling variants on NA12878. I'd definitely go for whole-genome numbers rather than settling for statistics from a relatively small subset.
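
One quick way to see which chromosomes a downloaded truth VCF actually covers is to count its records per contig. This is only a sketch: the function names are illustrative, the file name in the comment is hypothetical, and it reads the first tab-separated column of each record line without validating the rest of the VCF.

```python
import gzip
from collections import Counter
from typing import Iterable


def count_records_per_contig(lines: Iterable[str]) -> Counter:
    """Count VCF records per contig, skipping header and blank lines."""
    counts: Counter = Counter()
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # "##" meta lines and the "#CHROM" header carry no records
        contig = line.split("\t", 1)[0]  # CHROM is the first VCF column
        counts[contig] += 1
    return counts


def count_vcf_file(path: str) -> Counter:
    """Same count for a plain or gzip/bgzip-compressed VCF on disk."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_records_per_contig(handle)


# Small in-memory demonstration with made-up records.
example_lines = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "1\t200\t.\tC\tT\t50\tPASS\t.\n",
    "2\t50\t.\tG\tA\t50\tPASS\t.\n",
]
for contig, n in sorted(count_records_per_contig(example_lines).items()):
    print(contig, n)

# For a real file (hypothetical name): count_vcf_file("truth.vcf.gz")
```

If the counts show records on more than one contig, the file is not limited to chromosome 1.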

written 10 months ago by Jorge Amigo
Jorge Amigo (Santiago de Compostela, Spain) wrote, 10 months ago:

You've chosen a reference sample to run your variant calling pipeline on. Well done, because you're almost there. Just get a set of high-confidence NA12878 variants such as the Genome in a Bottle (GIAB) project ones, use a comparison tool such as RTG vcfeval (the one recommended by the GIAB project), and you'll be done.
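
A minimal sketch of that comparison step, under several assumptions: `rtg` is installed, the reference FASTA has already been converted to an SDF directory with `rtg format`, both VCFs are bgzip-compressed and tabix-indexed, and every file name below is hypothetical. The first helper only builds the `rtg vcfeval` argument list and does not run it. The second turns true-positive, false-positive and false-negative counts into sensitivity, precision and F1. Specificity is rarely reported for variant calls, because the true negatives are essentially every non-variant position in the genome, so the value sits very close to 1; precision is the usual substitute. Using a single `tp` count is a simplification, since a comparison tool may count true positives separately in the truth and call representations.

```python
from dataclasses import dataclass
from typing import List, Optional


def vcfeval_command(baseline_vcf: str, calls_vcf: str, reference_sdf: str,
                    output_dir: str,
                    confident_bed: Optional[str] = None) -> List[str]:
    """Build (but do not run) an `rtg vcfeval` command line."""
    cmd = ["rtg", "vcfeval",
           "-b", baseline_vcf,    # truth set, e.g. GIAB high-confidence calls
           "-c", calls_vcf,       # your own callset
           "-t", reference_sdf,   # reference in RTG SDF format
           "-o", output_dir]      # output directory, must not exist yet
    if confident_bed is not None:
        cmd += ["-e", confident_bed]  # restrict evaluation to confident regions
    return cmd


@dataclass(frozen=True)
class CallsetMetrics:
    sensitivity: float  # TP / (TP + FN), also called recall
    precision: float    # TP / (TP + FP)
    f1: float           # harmonic mean of precision and sensitivity


def callset_metrics(tp: int, fp: int, fn: int) -> CallsetMetrics:
    """Compute benchmark metrics from comparison counts."""
    if min(tp, fp, fn) < 0:
        raise ValueError("counts must be non-negative")
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return CallsetMetrics(sensitivity, precision, f1)


# Hypothetical file names, shown only to illustrate the call shape.
print(" ".join(vcfeval_command("giab_truth.vcf.gz", "my_calls.vcf.gz",
                               "b37.sdf", "vcfeval_out",
                               confident_bed="giab_confident.bed")))

# Made-up counts, not real results.
print(callset_metrics(tp=90, fp=10, fn=30))
```

The command list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.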
