1000 genome download
4.0 years ago
brendaumoh6 ▴ 10

Please, I need directions on how to download the Phase 3 1000 Genomes data for the African population.

gene • 3.3k views

Did you take a look at the FAQ provided by the 1000 Genomes Project?


Yes, I did, but all I saw was values; I don't really know which is for which population.

4.0 years ago

Hey brendaumoh6,

If you follow steps 1-5 of my tutorial (Produce PCA bi-plot for 1000 Genomes Phase III - Version 2), you will have the entire phased 1000 Genomes Phase III dataset on your disk, which can be used time and time again for future analyses. Information about the African population will be in the PED file that you also download - this can be used to filter the data for just the African samples.

Unfortunately, I am not aware of anybody who has split the 1000 Genomes data into the individual population groups. It's likely something that I would do if I actually had a tenured academic position.

Kevin

4.0 years ago

As others have noted, the primary-source way to do this is to use a pedigree file provided by 1000 Genomes to filter the full dataset down to just the African samples of interest (which correspond to a superpopulation of "AFR").
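
As a rough sketch of that filtering step, the function below pulls the sample IDs for one superpopulation out of a sample panel file. It assumes a whitespace-separated file with one header line and the columns sample, pop, super_pop, gender (the layout I believe the 1000 Genomes `.panel` file uses); the function name and all file names are placeholders, not something from the posts above.

```shell
# Print the sample IDs that belong to one superpopulation.
# ASSUMPTION: the panel file has one header line and the columns
#   sample  pop  super_pop  gender
# separated by tabs or spaces. Adjust the column number if yours differs.
superpop_samples() {
    local panel="$1" superpop="${2:-AFR}"
    # Skip the header (NR > 1) and keep rows whose 3rd column matches.
    awk -v sp="$superpop" 'NR > 1 && $3 == sp { print $1 }' "$panel"
}

# Hypothetical usage (file names are placeholders):
#   superpop_samples integrated_call_samples_v3.20130502.ALL.panel AFR > afr_samples.txt
#   bcftools view -S afr_samples.txt -Oz -o chr22.afr.vcf.gz chr22.vcf.gz
```

The resulting list can then be passed to a tool that subsets by sample, such as `bcftools view -S`, one chromosome at a time.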

A quick alternative is to use the plink2-format fileset posted at https://www.cog-genomics.org/plink/2.0/resources#1kg_phase3 . This includes SuperPop and Population annotations for each sample, so the following command line extracts just the African samples (the 'vzs' modifier tells plink2 that the .pvar file is still zstd-compressed):

plink2 --pfile all_phase3 vzs \
       --keep-cat-pheno SuperPop \
       --keep-cat-names AFR \
       --make-pgen \
       --out afr_phase3

and you can convert to BCF format with

plink2 --pfile afr_phase3 \
       --export bcf

Thanks for your response. I have downloaded the phase3_corrected.psam\?dl\=1 file from the plink2 website. I ran the command line:

plink2 --pfile all_phase3 vzs \
       --keep-cat-pheno SuperPop \
       --keep-cat-names AFR \
       --make-pgen \
       --out afr_phase3
But I got an error message:
Start time: Wed Apr 29 11:14:10 2020
193440 MiB RAM detected; reserving 96720 MiB for main workspace.
Using up to 16 threads (change this with --threads).
Error: Failed to open all_phase3.pvar.zst?dl=1.pgen : No such file or
directory.

How do I resolve this issue?


After downloading, you need to rename phase3_corrected.psam to all_phase3.psam, to match the other two files; sorry about not explicitly stating this in the initial answer.


I get the same kind of error message, but this time it says there is no such file or directory for the .pgen file:

Start time: Wed Apr 29 14:59:09 2020
193440 MiB RAM detected; reserving 96720 MiB for main workspace.
Using up to 16 threads (change this with --threads).
Error: Failed to open all_phase3.pgen : No such file or directory.
End time: Wed Apr 29 14:59:09 2020

To help, please confirm your PLINK version, and always show the exact commands that you are using. Also confirm that files are in the directories where they are supposed to be in relation to the command(s) that you are running.


Did you download and decompress the .pgen file from the website? You need to follow the instructions on that page.
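
A quick way to check this before running plink2 is a small pre-flight test along these lines. It is only a sketch: the function name `check_pfile` is made up, and it assumes the layout used in this thread, i.e. a decompressed `.pgen`, a still-compressed `.pvar.zst` (hence 'vzs'), and a `.psam`, all sharing one prefix.

```shell
# Report which members of a plink2 fileset are missing for a given prefix.
# ASSUMPTION: the fileset is PREFIX.pgen, PREFIX.pvar.zst and PREFIX.psam,
# which is what '--pfile PREFIX vzs' expects to find.
check_pfile() {
    local prefix="$1" missing=0 f
    for f in "$prefix.pgen" "$prefix.pvar.zst" "$prefix.psam"; do
        if [ ! -f "$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    # Exit status 0 only if all three files are present.
    return "$missing"
}

# Hypothetical usage:
#   check_pfile all_phase3 && plink2 --pfile all_phase3 vzs --make-pgen --out copy
```

If anything is listed as missing, compare the names against `ls` output in the working directory; a stray suffix or a still-compressed `.pgen.zst` is the usual culprit.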


Thank you all, it finally worked. I renamed my .pvar file from all_phase3.pvar.zst?dl=1 to all_phase3.pvar.zst. I also decompressed the .pgen file.
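
For anyone hitting the same problem, the renaming can be scripted roughly as follows. This is a sketch, not the exact commands I typed: `strip_dl_suffix` is a made-up helper that removes a trailing "?dl=1" from every file name in a directory.

```shell
# Rename every file in a directory whose name ends in the literal "?dl=1".
# The helper name is hypothetical; it only renames, it does not decompress.
strip_dl_suffix() {
    local dir="${1:-.}" f
    # The quoted '?dl=1' is matched literally; only the * is a wildcard.
    for f in "$dir"/*'?dl=1'; do
        [ -e "$f" ] || continue          # the glob matched nothing
        mv -- "$f" "${f%'?dl=1'}"        # drop the literal suffix
    done
}

# Remaining steps, run by hand afterwards:
#   mv phase3_corrected.psam all_phase3.psam
#   plink2 --zst-decompress all_phase3.pgen.zst > all_phase3.pgen
# ASSUMPTION: your plink2 build has --zst-decompress; 'zstd -d' is an alternative.
```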


Out of curiosity, what browser are you using on what operating system, and how are you clicking on the links to download the files? When I click on the links with either Chrome, Firefox, or Safari, across multiple computers, the saved files do not have "?dl=1" at the end of the names.


It seems likely that it was wget. I just tried via wget, and it saves the file under the name the user reported, with "?dl=1" at the end:

wget https://www.dropbox.com/s/qv61mgtx6pz54fz/chr1_phase3.pgen.zst?dl=1

It works via the browser, though.
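
One way to avoid the stray suffix with wget is to name the output file explicitly with `-O`. A minimal sketch follows; the helper names `url_filename` and `fetch` are invented for illustration, and the approach simply drops everything from the first "?" in the URL onwards.

```shell
# Derive a local file name from a URL by dropping any query string
# ("?dl=1" here) and keeping only the last path component.
url_filename() {
    local url="$1"
    local no_query="${url%%\?*}"      # strip from the first '?' to the end
    printf '%s\n' "${no_query##*/}"   # keep the text after the last '/'
}

# Download a URL and save it under the cleaned-up name.
# The URL is quoted so the shell never treats '?' as a wildcard.
fetch() {
    local url="$1"
    wget -O "$(url_filename "$url")" "$url"
}

# Hypothetical usage:
#   fetch 'https://www.dropbox.com/s/qv61mgtx6pz54fz/chr1_phase3.pgen.zst?dl=1'
```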


I am using Linux with Firefox. Although my file had 'dl=1' attached to its name, I renamed it after downloading it.
