SNP data in both VCF and FASTA
4.2 years ago
m.m ▴ 10

Dear reader!

I am an undergraduate student working on a project for which I need SNP data. A stretch of, e.g., 100 kb from perhaps 20-30 individuals would be enough. While I am checking whether my pipeline works properly, I need both the VCF files and the FASTA files of the individuals (to confirm that the VCF-to-FASTA conversion within my own pipeline works). I tried to find such data with a Google search, but was not successful. As far as I know, the 1000 Genomes Project would provide me with VCF data, but not the corresponding FASTA sequences.

I was wondering whether anyone could direct/advise me on where to find what I am looking for.

This is my very first question on Biostars, I hope I am not causing any inconvenience.

Best regards, M

SNP genome data DNA • 733 views
4.2 years ago

Hello,

Do not worry, you are not causing any inconvenience. FASTA files for the 1000 Genomes individuals were available in the past, but they may no longer be.

Nevertheless, you can download the 1000 Genomes Phase III VCFs by following the first step, here: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2

You can then produce a FASTA file for each individual by following the advice, here: Are there any FASTA files containing 1000 Genomes variants or haplotypes?
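For checking a VCF-to-FASTA conversion on a small region, a minimal sketch of the idea in plain Python might look like the following. It is not the method from the linked post; established tools such as `bcftools consensus` handle this in practice. The function names `apply_snps` and `to_fasta` and the toy data are hypothetical, and the sketch assumes a single reference sequence held in memory, diploid samples, phased genotypes (`0|1`), and SNPs only (indels and symbolic alleles are skipped).

```python
def apply_snps(reference, vcf_lines, chrom):
    """Return {sample: (haplotype1, haplotype2)} with phased SNPs applied.

    Assumptions (illustrative only): `reference` is the sequence of `chrom`
    starting at position 1, samples are diploid, and genotypes are phased.
    Unphased or missing genotypes leave the reference base in place.
    """
    samples = []
    haplotypes = {}
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            haplotypes = {s: [list(reference), list(reference)] for s in samples}
            continue
        if fields[0] != chrom:
            continue
        pos = int(fields[1])  # VCF positions are 1-based
        alleles = [fields[3].upper()] + fields[4].upper().split(",")
        if any(len(a) != 1 or a not in "ACGTN" for a in alleles):
            continue  # not a plain SNP (indel, '*', '.', symbolic allele)
        if reference[pos - 1].upper() != alleles[0]:
            raise ValueError(f"REF mismatch at {chrom}:{pos}")
        gt_index = fields[8].split(":").index("GT")
        for sample, column in zip(samples, fields[9:]):
            genotype = column.split(":")[gt_index]
            if "|" not in genotype:
                continue  # unphased: haplotype assignment is unknown
            for hap, code in enumerate(genotype.split("|")[:2]):
                if code != ".":
                    haplotypes[sample][hap][pos - 1] = alleles[int(code)]
    return {s: ("".join(h[0]), "".join(h[1])) for s, h in haplotypes.items()}


def to_fasta(haplotypes, width=60):
    """Format the haplotypes as FASTA text, two records per sample."""
    out = []
    for sample, haps in haplotypes.items():
        for i, seq in enumerate(haps, start=1):
            out.append(f">{sample}_hap{i}")
            out.extend(seq[j:j + width] for j in range(0, len(seq), width))
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Toy data, invented for illustration; not taken from 1000 Genomes.
    reference = "ACGTACGTAC"
    vcf = [
        "##fileformat=VCFv4.2",
        "\t".join(["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER",
                   "INFO", "FORMAT", "S1", "S2"]),
        "\t".join(["chr1", "2", ".", "C", "T", ".", "PASS", ".", "GT",
                   "0|1", "1|1"]),
        "\t".join(["chr1", "5", ".", "A", "G,C", ".", "PASS", ".", "GT",
                   "2|0", "0|1"]),
    ]
    print(to_fasta(apply_snps(reference, vcf, "chr1")), end="")
```

Because every haplotype keeps the reference length when only SNPs are applied, comparing the output of such a sketch against your own pipeline's FASTA on the same small VCF is a quick consistency check. Indels would shift coordinates and need a dedicated tool.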

Kevin


Dear Kevin! Thank you very much. That's exactly what I was hoping for.

Yours sincerely, M

