As the title suggests, I'm trying to find human .vcf files online to download. Are there any recommendations? I find it hard to locate them.
Some common ones include:
1000 Genomes VCF (large, multi-sample): http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/
ClinVar (clinically relevant variants): https://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/
dbSNP VCF: https://www.ncbi.nlm.nih.gov/variation/docs/human_variation_vcf/
Genome in a Bottle (deeply sequenced and validated variants for a small number of samples): https://www.nist.gov/programs-projects/genome-bottle
There are many others, but not all human data is open, so these kinds of resources are special. Which one suits you best depends on your interest.
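If it helps to see what a downloaded file actually contains before feeding it to other code, here is a minimal sketch that reads only the VCF header and reports which INFO fields are declared and how many samples there are. The function names and the file name in the usage note are made up for illustration. A file with no sample columns is a sites-only VCF, and the declared INFO fields give a rough idea of whether it carries annotations.

```python
import gzip
import re


def summarize_vcf_header(lines):
    """Return (info_ids, sample_names) from an iterable of VCF text lines."""
    info_ids, samples = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##INFO="):
            # Header lines look like ##INFO=<ID=XYZ,Number=...,Type=...,...>
            match = re.search(r"ID=([^,>]+)", line)
            if match:
                info_ids.append(match.group(1))
        elif line.startswith("#CHROM"):
            columns = line.split("\t")
            # The first 8 columns are fixed, the 9th is FORMAT,
            # and anything after that is one column per sample.
            samples = columns[9:]
            break  # the header ends here, no need to read variant records
    return info_ids, samples


def summarize_vcf_file(path):
    """Same as above, but opens a plain or gzip-compressed VCF from disk."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize_vcf_header(handle)
```

Usage would look something like `info_ids, samples = summarize_vcf_file("my_download.vcf.gz")`, where the path is a placeholder for whichever file you downloaded.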
I've downloaded them. However, when I try to clean the data using my data preprocessing code, it states that the quality does not match. How come? Any tips? I think I'm rather looking for annotated VCF files, if these are not annotated. Happy to clarify if needed; I'm not a bioinformatics expert. Thank you for your reply.
I can't really tell you why your code doesn't work if you don't provide it. Also, the more you describe what your goals are, the better we can help. Don't hide it! Here is a thread on annotating VCFs: "Is there a way to annotate existing VCF file with known disease-causing mutations?" You can annotate e.g. the Genome in a Bottle VCF with the ClinVar VCF information.
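To give a rough idea of what that annotation step means, here is a small sketch in plain Python, not the method from the linked thread. It assumes both files are on the same reference build, that variants are represented identically in both (same normalization of indels), and that the only difference in chromosome naming is a "chr" prefix. It looks up each variant by (chromosome, position, REF, ALT) and reports the ClinVar `CLNSIG` value when there is an exact match. All function names are made up for illustration. For real work an established tool such as `bcftools annotate` is probably the better route.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def norm_chrom(chrom):
    """Strip a leading 'chr' so '1' and 'chr1' compare equal (assumption)."""
    return chrom[3:] if chrom.startswith("chr") else chrom


def parse_info(info):
    """Turn 'A=1;B=2;FLAG' into {'A': '1', 'B': '2'} (flags are skipped)."""
    fields = {}
    for item in info.split(";"):
        if "=" in item:
            key, value = item.split("=", 1)
            fields[key] = value
    return fields


def load_clinvar(lines):
    """Build {(chrom, pos, ref, alt): CLNSIG} from ClinVar VCF lines."""
    lookup = {}
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts, info = cols[0], cols[1], cols[3], cols[4], cols[7]
        clnsig = parse_info(info).get("CLNSIG")
        if clnsig is None:
            continue
        for alt in alts.split(","):
            lookup[(norm_chrom(chrom), int(pos), ref, alt)] = clnsig
    return lookup


def annotate(lines, lookup):
    """Yield (chrom, pos, ref, alt, clnsig) for variants found in the lookup."""
    for line in lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = cols[0], int(cols[1]), cols[3], cols[4]
        for alt in alts.split(","):
            clnsig = lookup.get((norm_chrom(chrom), pos, ref, alt))
            if clnsig:
                yield chrom, pos, ref, alt, clnsig
```

With real files you would pass the handles from `open_vcf(...)` to `load_clinvar` and `annotate`. Note that this holds the whole ClinVar lookup in memory and only finds exact matches, so it is a way to understand the idea rather than a replacement for a proper annotation tool.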