18 months ago
bioinformatics2020 ▴ 750
I haven't used Genome in a Bottle for a couple of years. When I did, I recall downloading samples in VCF format for:
- AshkenaziTrio (three each)
- NA12878 (only one)
- ChineseTrio (three each)
I would then download what used to be called a high-confidence region bed file and intersect each of the VCF files against it. For some reason, it seems like all of these associated files are now labeled as "benchmarked."
Can anybody help me understand what has changed, and whether I'm misremembering how I went about this a couple of years ago?
There are different releases: the older ones are still named as high-confidence, while the newer ones are labeled as benchmarked. The VCF and bed files are intended to be used together to benchmark the accuracy of small variant calls.
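To illustrate how the two files fit together, here is a minimal pure-Python sketch of restricting VCF records to the regions in a bed file. The function names (`load_bed`, `in_regions`, `filter_vcf`) are made up for this example and are not part of any GIAB tooling. It assumes uncompressed, tab-delimited input and matching chromosome names in both files (e.g. both `chr1`, not `1` vs `chr1`), and it only tests each record's POS, so a long deletion that starts outside a region but extends into it would be dropped. In practice a dedicated tool such as bedtools, bcftools or a benchmarking tool like hap.py would normally do this step.

```python
from bisect import bisect_right


def load_bed(lines):
    """Parse BED lines into {chrom: (starts, ends)} with sorted, merged intervals."""
    raw = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        raw.setdefault(chrom, []).append((int(start), int(end)))

    regions = {}
    for chrom, intervals in raw.items():
        intervals.sort()
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Overlapping or adjacent: extend the previous interval.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        regions[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return regions


def in_regions(regions, chrom, pos):
    """True if 1-based VCF position `pos` falls in a BED interval.

    BED intervals are 0-based and half-open, so VCF POS p corresponds
    to the 0-based coordinate p - 1.
    """
    if chrom not in regions:
        return False
    starts, ends = regions[chrom]
    i = bisect_right(starts, pos - 1) - 1
    return i >= 0 and pos - 1 < ends[i]


def filter_vcf(vcf_lines, regions):
    """Yield header lines plus the records whose POS lies inside the regions."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.split("\t", 2)
        if in_regions(regions, fields[0], int(fields[1])):
            yield line


if __name__ == "__main__":
    # Tiny made-up inputs, only to show the calling convention.
    bed = ["chr1\t100\t200", "chr1\t150\t300", "chr2\t0\t50"]
    vcf = [
        "##fileformat=VCFv4.2",
        "#CHROM\tPOS\tID\tREF\tALT",
        "chr1\t100\t.\tA\tG",
        "chr1\t101\t.\tA\tG",
        "chr2\t50\t.\tG\tA",
        "chr3\t10\t.\tT\tC",
    ]
    for record in filter_vcf(vcf, load_bed(bed)):
        print(record)
```

With real files you would pass open file handles instead of lists, for example `filter_vcf(open("calls.vcf"), load_bed(open("regions.bed")))`, where both paths are placeholders.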
You can browse the different releases on the FTP page, and the README file there has more information.
Hope this helps!
Thanks, that helped a lot! Another question: I think I'll just stick with the older high-confidence bed files for now. Is there a single common high-confidence bed file, or does every group I listed above have its own bed file?
Each group listed above has its own bed file.