SNP data in both VCF and FASTA
16 months ago
Dear reader!

I am an undergraduate student working on a project for which I need SNP data. The data could be a stretch of, e.g., 100 kbp from 20-30 individuals. While I am checking whether my pipeline works properly, I need both the VCF files and the FASTA files of the individuals (to confirm that the conversion from VCF to FASTA within my own pipeline works). I tried to find data using Google, but was not successful. As far as I know, the 1000 Genomes Project would provide me with VCF data, but not the corresponding FASTA sequences.

I was wondering whether anyone could direct/advise me on where to find what I am looking for.

This is my very first question on Biostars, and I hope I am not causing any inconvenience.

Best regards, M

16 months ago


Do not worry, you are not causing any inconvenience. The 1000 Genomes FASTA files were available in the past, but they may no longer be available.

Nevertheless, you can download the 1000 Genomes Phase III VCFs by following the first step, here: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2

You can then produce a FASTA file for each individual by following the advice, here: Are there any FASTA files containing 1000 Genomes variants or haplotypes?
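If it helps for sanity-checking your own converter on a small region, here is a minimal sketch in plain Python. It is not the method from the linked thread, just an independent illustration. The function names `apply_snps` and `to_fasta` are hypothetical, and it assumes that you have already extracted the reference sequence for your region, that the genotypes are phased (as in the Phase III VCFs), and that only single-base substitutions matter (indels and symbolic alleles are skipped).

```python
def apply_snps(ref_seq, region_start, vcf_lines, sample, haplotype=0):
    """Return ref_seq with one haplotype's SNP alleles of `sample` substituted in.

    ref_seq      : reference bases for the region (string)
    region_start : 1-based coordinate of ref_seq[0]
    vcf_lines    : iterable of VCF text lines (header plus records)
    sample       : sample ID as written in the #CHROM header line
    haplotype    : 0 or 1, index into a phased genotype such as "0|1"
    """
    seq = list(ref_seq)
    sample_col = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            sample_col = fields.index(sample)  # column holding this sample's genotype
            continue
        if sample_col is None:
            raise ValueError("no #CHROM header line before the first record")
        pos, ref, alts = int(fields[1]), fields[3], fields[4].split(",")
        alleles = [ref] + alts
        # Assumption: keep only single-base substitutions so coordinates do not shift.
        if any(len(a) != 1 or a.upper() not in "ACGTN" for a in alleles):
            continue
        offset = pos - region_start
        if not 0 <= offset < len(seq):
            continue  # record lies outside the extracted region
        if seq[offset].upper() != ref.upper():
            raise ValueError(f"REF mismatch at position {pos}")
        # GT is the first FORMAT key when present; treat "/" like "|" for simplicity.
        calls = fields[sample_col].split(":")[0].replace("/", "|").split("|")
        call = calls[haplotype] if haplotype < len(calls) else calls[0]
        if call == ".":
            continue  # missing genotype: leave the reference base
        seq[offset] = alleles[int(call)]
    return "".join(seq)


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    body = "\n".join(seq[i:i + width] for i in range(0, len(seq), width))
    return f">{name}\n{body}\n"


# Toy example: reference bases for positions 101-108 and two samples.
toy_vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "1\t102\t.\tC\tT\t.\tPASS\t.\tGT\t0|1\t1|1",
    "1\t105\t.\tG\tA,C\t.\tPASS\t.\tGT\t2|0\t0|0",
]
print(to_fasta("S1_hap1", apply_snps("ACGTGGTA", 101, toy_vcf, "S1", 0)), end="")
```

Running it once per sample and haplotype gives you small FASTA records that you can compare against your pipeline's output for the same region.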


Dear Kevin! Thank you very much. That's exactly what I was hoping for.

Yours sincerely, M


