Formatting for fineSTRUCTURE input
5.0 years ago
natasha ▴ 100

Hi

I would like to use fineSTRUCTURE to assess the population structure of a bacterial species, so I will be inputting SNP data.

However, I don't understand how to create the 'phased' data format that fineSTRUCTURE requires. The fineSTRUCTURE manual lists multiple programs to help with this phasing process, such as PHASE, BEAGLE, SHAPEIT and IMPUTE2, but I don't know where to even start with these.

For example, PHASE requires me to input my data in the following format:

NumberOfIndividuals

NumberOfLoci

P Position(1) Position(2) ... Position(NumberOfLoci)

LocusType(1) LocusType(2) ... LocusType(NumberOfLoci)

ID(1)

Genotype(1)

ID(2)

Genotype(2)

.

.

.

ID(NumberOfIndividuals)

Genotype(NumberOfIndividuals)
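
To make the template above concrete, here is a minimal Python sketch that fills it in for toy data. The function name `format_phase_input`, the isolate IDs and the positions are all made up for illustration. It also assumes that each `Genotype(i)` entry is written as two rows of alleles with one character per locus, which is my reading of the PHASE documentation and should be checked against the manual before use.

```python
def format_phase_input(positions, locus_types, genotypes):
    """Build text laid out like the PHASE input template.

    positions   : list of int, one position per locus
    locus_types : str, one character per locus (e.g. "S" for a biallelic SNP)
    genotypes   : dict mapping individual ID -> (row1, row2), two allele
                  strings with one character per locus (ASSUMED layout)
    """
    n_loci = len(positions)
    if len(locus_types) != n_loci:
        raise ValueError("need exactly one locus type per position")

    lines = [
        str(len(genotypes)),                          # NumberOfIndividuals
        str(n_loci),                                  # NumberOfLoci
        "P " + " ".join(str(p) for p in positions),   # P Position(1) ...
        locus_types,                                  # LocusType(1) ...
    ]
    for ind_id, (row1, row2) in genotypes.items():
        if len(row1) != n_loci or len(row2) != n_loci:
            raise ValueError(f"wrong number of alleles for {ind_id}")
        lines += [ind_id, row1, row2]                 # ID(i), Genotype(i)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical toy data: two individuals, four SNP loci.
    toy = {"iso1": ("0110", "0100"), "iso2": ("1100", "1101")}
    print(format_phase_input([120, 455, 980, 1502], "SSSS", toy), end="")
```

This only shows the file layout. It does not read a VCF or an alignment.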

But how do I get this?

As it stands, I have the core genome alignment, the SNP alignment and a VCF of my data. How do I use these formats to phase my data? Can anyone point me in the right direction?

Many many thanks!!!


beagle2chromopainter.pl
chromopainter2chromopainterv2.pl
impute2chromopainter.pl

2.3 years ago
nataliagru1 ▴ 70

Hello. You need to use a program such as SHAPEIT or IMPUTE2 to phase your VCF file. SHAPEIT takes VCF files as input and outputs phased haplotypes, which you then need to convert to ChromoPainter format (the conversion scripts and tools are provided on the fineSTRUCTURE website). Everything I mention here is written out in great detail in the fineSTRUCTURE manual. You can use any other phasing software you like, but the ones I mentioned are recommended by the fineSTRUCTURE authors.
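
To show what "phased" means in practice, here is a rough Python sketch that pulls the two haplotypes per sample out of a phased VCF, i.e. one whose GT fields use `|` (as in `0|1`) instead of `/`. The function name `haplotypes_from_phased_vcf` is hypothetical, and the sketch assumes diploid, biallelic calls with no missing data. It does not write ChromoPainter format; the conversion scripts from the fineSTRUCTURE website are the tools for that step.

```python
def haplotypes_from_phased_vcf(lines):
    """Return (positions, {sample: (hap1, hap2)}) from phased VCF lines.

    Assumes diploid, biallelic genotypes such as 0|1. Any genotype
    without a '|' separator raises ValueError.
    """
    positions, samples, haps = [], [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue                      # skip blank and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]          # sample names follow the FORMAT column
            haps = [([], []) for _ in samples]
            continue
        positions.append(int(fields[1]))  # POS column
        gt_index = fields[8].split(":").index("GT")
        for i, sample_field in enumerate(fields[9:]):
            gt = sample_field.split(":")[gt_index]
            if "|" not in gt:
                raise ValueError(f"unphased genotype {gt!r} at position {fields[1]}")
            first, second = gt.split("|")
            haps[i][0].append(first)      # allele on the first haplotype
            haps[i][1].append(second)     # allele on the second haplotype
    return positions, {
        s: ("".join(h[0]), "".join(h[1])) for s, h in zip(samples, haps)
    }


if __name__ == "__main__":
    # Tiny made-up VCF: two samples, two SNPs.
    toy_vcf = [
        "##fileformat=VCFv4.2\n",
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n",
        "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n",
        "1\t250\t.\tC\tT\t.\tPASS\t.\tGT\t1|0\t0|0\n",
    ]
    positions, haplotypes = haplotypes_from_phased_vcf(toy_vcf)
    print(positions)
    print(haplotypes)
```

If this raises on your own VCF, the genotypes are not phased yet (or are written as haploid calls), and the file needs to go through one of the phasing programs first.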