Question: Convert SNP dataset from PACKEDANCESTRYMAP to plink (PED)
3.3 years ago by
BlastedBadger90 wrote:

I have downloaded the "Affymetrix Human Origins Curated" dataset from David Reich's lab, but I am at a loss to understand which format it is in and how I could convert it to something usable by plink.

So far, I have downloaded the convertf utility from the AdmixTools package. Based on the convertf README, I assumed that the .geno, .snp and .ind files are in "PACKEDANCESTRYMAP" format.

I attempted to convert them to PED using the following "parfile" for convertf:

genotypename:    panel1.geno
snpname:         panel1.snp
indivname:       panel1.ind
outputformat:    PED
genotypeoutname: panel1-PED.ped
indivoutname:    panel1-PED.pedind

Then convertf -p parfile seems to work, but the output format is not accepted by plink!

I tried this command to test:

 plink1 --noweb --file panel1-PED.ped --make-bed --out panel1-BED

And it failed like this:

|        PLINK!       |     v1.07      |   10/Aug/2009     |
|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
|  For documentation, citation & bug-report instructions:  |
|        |

Skipping web check... [ --noweb ]
Writing this text to log file [ panel1-BED.log ]
Analysis started: Fri Oct 13 11:51:03 2017

Options in effect:
        --ped panel1-PED.ped
        --out panel1-BED

ERROR: Problem with MAP file line:
1  Affx-4964829     0.013491      1349123 A G

So the map file is incorrectly formatted: it has two extra, unwanted columns at the end (A and G in the line above).

My question is: why doesn't convertf output a correct map format? And is it safe to remove these two last columns using awk or sed (I did it, and plink seemed to make the conversion)?
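For reference, the awk step mentioned above could look like the following minimal sketch. It assumes the file written by convertf has six whitespace-separated columns, as in the rejected line, and that the first four are the ones plink expects in a .map file (chromosome, SNP ID, genetic distance, base-pair position). The function name trim_map and the file names example.pedsnp and example.map are hypothetical, chosen only for illustration.

```shell
# trim_map IN OUT: keep only the first four columns of a six-column SNP file.
# Assumption: columns 5 and 6 are the two extra allele columns plink rejects.
trim_map() {
    awk 'BEGIN { OFS = "\t" } NF >= 4 { print $1, $2, $3, $4 }' "$1" > "$2"
}

# Demo on the line from the error message (file names are hypothetical).
printf '1  Affx-4964829     0.013491      1349123 A G\n' > example.pedsnp
trim_map example.pedsnp example.map
cat example.map
```

The trimmed file can then be given to plink explicitly with --map, next to the --ped file. Note that this discards the allele columns from the map file only; whether anything downstream still needs them is worth checking before deleting the original.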


The convertf from AdmixTools is really not working as it should. For example, the .fam file produced by using "PACKEDPED" output does not contain any population information anymore. The first column (family IDs) is just a row number... I am gonna try with Eigensoft.

written 3.3 years ago by BlastedBadger90

Alright, convertf from Eigensoft is doing the same.

written 3.3 years ago by BlastedBadger90

I've met the same problem. Did you manage to solve it?

written 5 months ago by Nuthatch0

Hi guys, I've recently found this script, which does the opposite conversion (VCF > plink > admixtools), but you can probably explore its logic to get what you need. In particular it uses several awk commands to tweak the intermediates between vcftools and convertf.

written 6 weeks ago (modified 6 weeks ago) by cicindel40