Obtaining population information for the 1092 samples of the 1000 Genomes Project
2.5 years ago
lxiao63 • 0

I have downloaded genomic data for the 1000 Genomes Phase I samples from https://www.ncbi.nlm.nih.gov/projects/faspftp/1000genomes/.

I checked the resulting .FAM file (1092 rows, each corresponding to one sample in the 1000 Genomes Phase I release) and noted that there is a column named member, whose first 20 entries are:

HG00096 HG00097 HG00099 HG00100 HG00101 HG00102 HG00103 HG00104 HG00106 HG00108 HG00109 HG00110 HG00111 HG00112 HG00113 HG00114 HG00116 HG00117 HG00118 HG00119

I wish to determine the population (e.g., CHB, JPT, CEU) and super population (e.g., EAS, EUR, AFR) from the member IDs. To do so, I downloaded the pedigree file from https://www.internationalgenome.org/faq/can-i-get-phenotype-gender-and-family-relationship-information-samples/.

The pedigree file has 3501 rows rather than 1092. It has a column named Individual ID whose contents are HG01879, HG01880, HG01881, etc. However, none of the members in my .FAM file can be found among the 3501 rows of the pedigree file, so the two files appear to be unrelated.

I would like to ask whether it is possible to determine the population of origin of the 1092 1000 Genomes samples from their member IDs. If so, where can I find the metadata that relates sample ID to population?

Thank you.

1000 genome project • 590 views
2.5 years ago
JC 13k

You can use the 1000 Genomes Data Portal.
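
Once a sample table has been exported from the portal, joining it to the .FAM IDs could look roughly like the sketch below. This is only an illustration under several assumptions: that the member column is the second whitespace-separated field of the .FAM file (the usual PLINK layout), and that the exported table is tab-delimited with a header containing columns called sample, pop and super_pop. Those column names, the function names, and the toy IDs and labels in the demo are hypothetical; adjust them to whatever the downloaded file actually contains.

```python
import csv
import io


def read_fam_ids(handle):
    """Collect the individual ID (2nd field) from each line of a .fam file."""
    ids = []
    for line in handle:
        fields = line.split()
        if len(fields) >= 2:
            ids.append(fields[1])
    return ids


def read_panel(handle, id_col="sample", pop_col="pop", super_col="super_pop"):
    """Map sample ID -> (population, super population).

    The column names are assumptions; pass the real ones if they differ.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[id_col]: (row[pop_col], row[super_col]) for row in reader}


def annotate(ids, panel):
    """Return (id, pop, super_pop) per ID; unmatched IDs get (id, None, None)."""
    return [(sample_id, *panel.get(sample_id, (None, None))) for sample_id in ids]


# Toy in-memory data with made-up IDs and labels, standing in for real files.
demo_fam = io.StringIO("FAM1 ID_A 0 0 1 -9\nFAM2 ID_B 0 0 2 -9\n")
demo_panel = io.StringIO("sample\tpop\tsuper_pop\nID_A\tPOP1\tSUPER1\n")

demo_rows = annotate(read_fam_ids(demo_fam), read_panel(demo_panel))
unmatched = [row[0] for row in demo_rows if row[1] is None]
print(demo_rows)
print("IDs without a population label:", unmatched)
```

With real files, the two StringIO objects would be replaced by open(...) calls on the .FAM file and the exported table. The list of unmatched IDs is also a quick way to check whether two files really share no samples, rather than comparing the first few rows by eye.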

