GATK hg19 bundle question
6.5 years ago
genetic ▴ 40

I've downloaded the GATK hg19 bundle datasets. Can I use 1000G_phase1.indels.hg19.sites.vcf instead of 1000G_phase1.indels.hg19.vcf? What is the difference between these two files?

1000G_phase1.indels.hg19.vcf 1000G_phase1.indels.hg19.sites.vcf

Mills_and_1000G_gold_standard.indels.hg19.vcf Mills_and_1000G_gold_standard.indels.hg19.sites.vcf

Thank you in advance. MH

GATK • 2.8k views

Copied from https://gatkforums.broadinstitute.org/gatk/discussion/1826/indelrealigner-realignertargetcreator-known-site-bundle-files:

The difference between the .vcf and the .sites.vcf files: the .vcf files contain the full callset info, including genotypes, while the .sites.vcf files contain only the variant site info, without the genotypes. The point of having sites-only files is that they are smaller.

The FAQ (https://software.broadinstitute.org/gatk/documentation/article.php?id=1247) mentioned in the same post is also very helpful.
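
To make the "sites-only" idea concrete, here is a minimal Python sketch of what dropping genotypes from a VCF amounts to. It assumes the standard VCF layout, where the first eight tab-separated columns (CHROM through INFO) describe the site, and FORMAT plus the per-sample columns carry the genotypes. The function name `to_sites_only` and the toy records are made up for illustration; this is not how the bundle files were actually produced.

```python
def to_sites_only(vcf_lines):
    """Yield VCF lines with the FORMAT and per-sample genotype columns dropped."""
    for line in vcf_lines:
        line = line.rstrip("\n")
        if line.startswith("##FORMAT"):
            # FORMAT definitions describe genotype fields, which are being removed.
            continue
        if line.startswith("##"):
            # Other meta-information lines are passed through unchanged.
            yield line
        else:
            # Column header (#CHROM ...) and data lines: keep the 8 fixed site columns.
            yield "\t".join(line.split("\t")[:8])


# Toy input (invented records, two hypothetical samples).
example = [
    "##fileformat=VCFv4.1",
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tSAMPLE1\tSAMPLE2",
    "chr1\t1000\t.\tAT\tA\t.\tPASS\tAC=1\tGT\t0|1\t0|0",
]

for out_line in to_sites_only(example):
    print(out_line)
```

The site-level fields (position, alleles, INFO annotations) survive unchanged; only the per-sample columns disappear, which is why the sites-only files are smaller.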
