How to generate "dummy"/"fake" VCFs for testing software?
21 months ago
kynnjo ▴ 70

I need to generate "dummy"/"fake", but "formally valid", VCF data to test the performance of processing pipelines.

The need for such data arises in many contexts, but at the moment I am most interested in measuring the performance of alternative approaches to merging large numbers (>10K) of single-sample VCFs.

Most of the VCF data that I have ready access to is protected patient data, which limits what I can do with it (e.g. which cloud servers I can upload it to for processing).

Can anyone recommend a method for generating dummy/fake single-sample VCFs?

vcf • 785 views

1000 genomes?

21 months ago

See the gnomAD v3 section of https://gnomad.broadinstitute.org/downloads

HGDP + 1KG callset. These files contain individual genotypes for all samples in the HGDP and 1KG subsets.
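If a fully synthetic file is preferred over a public callset, something along these lines may be enough for pure throughput testing. This is only a minimal sketch, not an established tool: the function name `make_fake_vcf`, its parameters, and the contig name and length are all made up for illustration, and the REF bases are random, so they will not match any real reference genome (tools that validate REF against a FASTA would likely reject the output).

```python
import random

BASES = "ACGT"

def make_fake_vcf(sample, n_variants=100, contig="chr1",
                  contig_len=1_000_000, seed=0):
    """Return the text of a minimal single-sample VCF v4.2 with random SNVs.

    Hypothetical helper: REF/ALT alleles are random and do NOT
    correspond to any real reference sequence.
    """
    rng = random.Random(seed)  # seeded so each sample is reproducible
    header = [
        "##fileformat=VCFv4.2",
        f"##contig=<ID={contig},length={contig_len}>",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + sample,
    ]
    # Distinct 1-based positions, sorted so the records are in coordinate order.
    positions = sorted(rng.sample(range(1, contig_len + 1), n_variants))
    records = []
    for pos in positions:
        ref = rng.choice(BASES)
        alt = rng.choice([b for b in BASES if b != ref])  # ALT differs from REF
        gt = rng.choice(["0/1", "1/1"])  # only non-reference genotypes
        records.append(f"{contig}\t{pos}\t.\t{ref}\t{alt}\t.\tPASS\t.\tGT\t{gt}")
    return "\n".join(header + records) + "\n"
```

Calling `make_fake_vcf(f"sample{i}", seed=i)` in a loop and writing each result to its own file would give one VCF per sample with a different variant set each time. Because the positions are drawn independently per sample, the overlap between files will be far lower than in real cohorts, which may make merge benchmarks look different from production data.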
