ADMIXTURE instead of STRUCTURE
6.8 years ago
Tohamy ▴ 80

Hey all,

I was planning to use STRUCTURE to infer population structure within my 200 accessions, but as you know, STRUCTURE takes a long time to run. So I started thinking about using ADMIXTURE instead to save time. My data is currently in STRUCTURE format. How can I convert it from STRUCTURE format to the format ADMIXTURE requires (.ped/.bed)?

Thanks,

Tohamy

R • 5.7k views
6.3 years ago
NRP ▴ 10

Hello -

Did you ever find a solution to this? I'm trying to do the same thing & looking for a way to convert from genepop or structure format to the Eigenstrat format (.geno).

 

 

6.3 years ago
Brice Sarver ★ 3.6k

I looked into this extensively 6-8 months ago, and I was unable to find a parser. I ended up writing my own, but it was very clunky to get things into Plink format from the (somewhat complicated) STRUCTURE data structure.
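
For anyone who wants to try the hand-rolled route, here is a minimal sketch of such a parser in Python. It is not the script I wrote back then, and it assumes the simplest STRUCTURE layout: no marker-name header row, two consecutive rows per diploid individual, column 1 the individual label, column 2 the population ID, the remaining columns one allele code per locus, and -9 for missing data. The function name, the `locusN` marker names, chromosome 0 and the positions in the MAP lines are placeholders I made up for illustration.

```python
def structure_to_ped(lines, missing="-9"):
    """Convert two-row-per-individual STRUCTURE lines to PED and MAP lines.

    Assumed layout (an assumption, check your own file): label, population,
    then one allele code per locus; each individual occupies two rows.
    """
    rows = [ln.split() for ln in lines if ln.strip()]
    if len(rows) % 2 != 0:
        raise ValueError("expected two rows per diploid individual")

    ped_lines, n_loci = [], None
    for first, second in zip(rows[0::2], rows[1::2]):
        if first[0] != second[0]:
            raise ValueError(f"rows {first[0]!r} and {second[0]!r} are not a pair")
        alleles_1, alleles_2 = first[2:], second[2:]
        if len(alleles_1) != len(alleles_2):
            raise ValueError(f"unequal locus counts for {first[0]!r}")
        if n_loci is not None and len(alleles_1) != n_loci:
            raise ValueError(f"{first[0]!r} has a different number of loci")
        n_loci = len(alleles_1)

        genotype = []
        for a, b in zip(alleles_1, alleles_2):
            # PED uses 0 for a missing allele, so real alleles must not be coded 0.
            genotype += ["0" if a == missing else a, "0" if b == missing else b]

        # PED columns: family ID, individual ID, father, mother, sex, phenotype.
        # Population is reused as family ID; the rest are set to "unknown".
        ped_lines.append(" ".join([first[1], first[0], "0", "0", "0", "-9"] + genotype))

    # MAP columns: chromosome, marker ID, genetic distance, base-pair position.
    # All values here are placeholders because STRUCTURE files carry no positions.
    map_lines = [f"0 locus{i + 1} 0 {i + 1}" for i in range(n_loci or 0)]
    return ped_lines, map_lines
```

Write the two returned lists to `prefix.ped` and `prefix.map`. If your STRUCTURE file has a marker-name row, extra columns, or one row per individual, the slicing above needs to change accordingly.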

Instead, I ended up getting a VCF and using VCFtools to convert that to Plink, then threw that into Admixture/fastStructure. An alternative is to just run fastStructure, which will take old-school STRUCTURE formatting. I seem to recall that it chokes if you give it multiallelic sites, so you probably want to filter down to biallelic sites only if your data has any.
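
As a rough illustration of that filtering step, the sketch below finds the loci that carry no more than two distinct alleles across all allele rows. It assumes the allele codes have already been split out of the STRUCTURE rows (label and population columns removed) and that -9 marks missing data; the function name is made up for this example. Note that monomorphic loci pass this check too.

```python
def loci_with_at_most_two_alleles(allele_rows, missing="-9"):
    """Return indices of loci showing at most two distinct non-missing alleles.

    allele_rows: list of rows, each a list of allele codes (strings),
    one entry per locus. Every row must have the same length.
    """
    if not allele_rows:
        return []
    n_loci = len(allele_rows[0])
    seen = [set() for _ in range(n_loci)]
    for row in allele_rows:
        if len(row) != n_loci:
            raise ValueError("all rows must have the same number of loci")
        for i, allele in enumerate(row):
            if allele != missing:  # missing calls do not count as an allele
                seen[i].add(allele)
    return [i for i, alleles in enumerate(seen) if len(alleles) <= 2]


def keep_loci(allele_rows, keep):
    """Return the rows restricted to the locus indices in keep."""
    return [[row[i] for i in keep] for row in allele_rows]
```

Run the first function over every allele row in the file (both rows of each individual), then apply `keep_loci` before writing the filtered data back out.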

EDIT: It seems that PGDSpider was updated recently. It handles more file formats than when I was messing around with it about a year ago. I would recommend converting STRUCTURE to VCF, then using VCFtools to go from VCF to Plink, assuming everything works well.
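
Once you have a VCF, the rest of the chain can be sketched as three commands. The snippet below only builds and prints them; it assumes `vcftools`, `plink` and `admixture` are on your PATH, and the file name `accessions.vcf`, the prefix `accessions` and K = 3 are placeholders. Non-human chromosome or scaffold names may need extra options that are not shown here.

```python
import shlex


def conversion_commands(vcf_path, prefix, k):
    """Build the VCF -> PED/MAP -> BED -> ADMIXTURE command lines."""
    return [
        # VCFtools writes prefix.ped and prefix.map
        ["vcftools", "--vcf", vcf_path, "--plink", "--out", prefix],
        # PLINK converts the text PED/MAP pair to binary BED/BIM/FAM
        ["plink", "--file", prefix, "--make-bed", "--out", prefix],
        # ADMIXTURE takes the .bed file and the number of clusters K
        ["admixture", f"{prefix}.bed", str(k)],
    ]


# Placeholder inputs; swap in your own file, prefix and K.
for cmd in conversion_commands("accessions.vcf", "accessions", 3):
    print(shlex.join(cmd))
```

To execute the steps instead of printing them, pass each list to `subprocess.run(cmd, check=True)`.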

4.9 years ago

hello -

I wrote an R script that converts data files from the STRUCTURE and TESS formats to the geno format. For how to download this script and use it (with R), see this short tutorial: http://membres-timc.imag.fr/Olivier.Francois/tutoRstructure.pdf

The tutorial also explains how to process sNMF/ADMIXTURE/fastSTRUCTURE outputs and produce geographic maps with admixture coefficients overlaid. Only R is required, and it works on all operating systems.

Enjoy it,

oliver


Thanks Olivier! Worked really well and easily for me.

23 months ago
yijiaobani • 0

I found a solution: http://membres-timc.imag.fr/Olivier.Francois/LEA/tutorial.htm

It converts between these formats: lfmm, env, geno, ped, ancestrymap, vcf.

For converting genotype matrices from the STRUCTURE or the TESS format to the lfmm and geno format, use the struct2geno() function.
