Hi all,
I have Arlequin output files (*.arp) from fastSimcoal2 that I'm trying to convert to Genepop files (to read into adegenet). The Arlequin files use the "DNA" marker type (sequences 50 bp long), and each file includes ~2,500 samples.
When I try to convert these files with PGDSpider (either the GUI or the CLI, v2.1.1.5), things proceed smoothly until the 1,000th allele is reached, at which point the program reports an error:
INFO 07-03 11:18:10,630 (GenepopWriter.java:parseAlignedSampleData:596) -Allele "TGGCATCAGCTTCTTCTAAGCAGGCCCGCTTTGTCTAGCGACAGATCTCG" converted to "999"
INFO 07-03 11:18:10,630 (GenepopWriter.java:parseAlignedSampleData:596) -Allele "ATCTTGTGGTCATGGAGCACCCAGAGCCAAACTCCTAAAGAAGACCCCGC" converted to "1000"
ERROR 07-03 11:18:10,630 (GenepopWriter.java:parseAlignedSampleData:611) -data "1000" in ind "2_375" (population: Sample 2) is too long!!!
INFO 07-03 11:18:10,631 (GenepopWriter.java:parseAlignedSampleData:618) -In GENEPOP, alleles cannot be coded with more than 3 digits.
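To make the failure concrete, here is a minimal Python sketch of what I assume the writer is doing: each distinct sequence gets the next integer code, and the code has to fit in a fixed number of digits. This is only my reading of the log messages, not PGDSpider's actual code; the function name `recode_alleles` and the `digits` parameter are hypothetical.

```python
def recode_alleles(sequences, digits=3):
    """Map each distinct sequence to a 1-based integer code, zero-padded to `digits`.

    Illustrative only: mimics the behaviour suggested by the log above,
    not PGDSpider's real implementation.
    """
    max_code = 10 ** digits - 1   # 999 when digits == 3
    codes = {}                    # sequence -> integer code
    recoded = []
    for seq in sequences:
        if seq not in codes:
            next_code = len(codes) + 1
            if next_code > max_code:
                raise ValueError(
                    f'data "{next_code}" is too long: '
                    f"alleles cannot be coded with more than {digits} digits"
                )
            codes[seq] = next_code
        recoded.append(str(codes[seq]).zfill(digits))
    return recoded, codes
```

With `digits=3` this raises on the 1,000th distinct sequence, which matches the point where my conversion stops. Passing `digits=4` is just a way to show the limit in the sketch; I don't know whether PGDSpider or the Genepop format accepts wider codes.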
Is there a way around this? I've processed RADseq data with many thousands of alleles using adegenet before, so I know these types of files can be written to Genepop format; I just don't know how to tell PGDSpider to do it. There's a field in the .spid file that I might be able to use to fix this (see below), but I don't know what value it expects.
# Specify the locus/locus combination you want to write to the GENEPOP file:
GENEPOP_WRITER_LOCUS_COMBINATION_QUESTION=
Alternatively, if there are other suggestions for performing this conversion without using PGDSpider, I'm all ears.
Thanks for your help!