Converting 1000 Genomes data to 23andMe format
4.7 years ago
drkmuru ▴ 30

Hi all

I have got access to the 1000 Genomes data through the Globus FTP program (I have not downloaded any files yet).

I am aiming to download the individual samples for the Sri Lankans (STU) in the project and run them through the HarappaWorld calculator (on either GEDmatch or DIYDodecad) for research purposes: to create a table or chart showing the full range of results and, hopefully, any correlation with Y-DNA/mtDNA.

I have found a guide online on how to convert VCF files into the required 23andMe format.

However, the files on the FTP site are not in VCF format, and I am confused about where to start. Which files should I download, and which of them can be converted to the desired format?

Also, the VCF files on the cloud website appear to be split into huge per-chromosome files, each containing all the individuals together.

Any help will be greatly appreciated.

Thanks

snp genome SNP • 901 views

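I've been looking into analyzing open data with GEDmatch's tools for my thesis project for a few months. You should be able to get genotypes for a specific population from the VCF files published by the 1000 Genomes Project. Those VCFs are multi-sample, so you would subset them to the sample IDs that belong to the population you want.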
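As a rough illustration of pulling one population's samples out of a multi-sample VCF, here is a minimal pure-Python sketch. It assumes a tab-separated sample panel file whose first two columns are the sample ID and the population code, and an uncompressed VCF with a `GT` field first in each sample column. The function names `samples_in_population` and `genotypes_for_sample` are made up for this example and are not part of any existing tool.

```python
def samples_in_population(panel_lines, population):
    """Return sample IDs whose population column matches `population`.

    Assumes tab-separated lines: sample, pop, (further columns ignored),
    with an optional header row whose first field is 'sample'.
    """
    samples = []
    for line in panel_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or fields[0] == "sample":
            continue  # skip blank/short lines and the header row
        if fields[1] == population:
            samples.append(fields[0])
    return samples


def genotypes_for_sample(vcf_lines, sample_id):
    """Yield (chrom, pos, variant_id, alleles) for one sample of a VCF.

    `alleles` is a tuple of allele strings such as ("A", "G"),
    or an empty tuple when the call is missing.
    """
    column = None
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Header row: locate the column holding this sample.
            column = fields.index(sample_id)
            continue
        if column is None:
            raise ValueError("#CHROM header line not found before data")
        chrom, pos, variant_id, ref, alt = fields[:5]
        choices = [ref] + alt.split(",")  # allele index 0 is REF
        gt = fields[column].split(":")[0]  # assumes GT is the first subfield
        calls = gt.replace("|", "/").split("/")
        if "." in calls:
            alleles = ()  # missing call
        else:
            alleles = tuple(choices[int(c)] for c in calls)
        yield chrom, int(pos), variant_id, alleles
```

For the real per-chromosome files, which are gzip-compressed and very large, you would stream them (for example with `gzip.open(path, "rt")`) rather than load them into memory, or use a dedicated VCF tool for the subsetting.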

If you are committed to using the files from Globus, you will have to do the SNP calling yourself, starting from the raw reads (FASTQ) or the aligned SAM/BAM files.

I have been using the workflow on slide 2 of this presentation as a learning guide.

Even once you have the VCF, GEDmatch seems to require rsIDs to be included, which may be another step. This is where I'm stuck.
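For the rsID issue, one option is simply to drop any variant without an rsID when writing the output. The sketch below assumes the 23andMe raw-data layout of tab-separated `rsid`, `chromosome`, `position`, `genotype` columns with `#` comment lines, and that the coordinates are already on the genome build the target tool expects. `to_23andme_lines` and its input record shape are hypothetical names chosen for this example, and I have not checked the output against GEDmatch's uploader.

```python
VALID_BASES = {"A", "C", "G", "T"}


def to_23andme_lines(records):
    """Yield 23andMe-style text lines from variant records.

    `records` is an iterable of (chrom, pos, variant_id, alleles), where
    `alleles` is a tuple of allele strings, or an empty tuple for a no-call.
    Variants without an rsID, and non-SNP alleles (indels, symbolic
    alleles), are skipped.
    """
    yield "# rsid\tchromosome\tposition\tgenotype"
    for chrom, pos, variant_id, alleles in records:
        # The VCF ID field may hold several IDs separated by ';'.
        rsids = [i for i in variant_id.split(";") if i.startswith("rs")]
        if not rsids:
            continue  # no rsID available for this variant
        if any(a not in VALID_BASES for a in alleles):
            continue  # keep single-base alleles only
        genotype = "".join(alleles) if alleles else "--"
        if chrom.startswith("chr"):
            chrom = chrom[3:]  # write "1", not "chr1"
        yield f"{rsids[0]}\t{chrom}\t{pos}\t{genotype}"
```

Writing the result is then a matter of `"\n".join(to_23andme_lines(records))`. Variants that lack an rsID in the VCF would need to be annotated from a database such as dbSNP first if you want to keep them.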
