Question: Large sample sizes in Lositan
Hi Tiago and/or other Lositan experts,

I have a data set of 15,982 markers and 62 individuals across 4 populations. I downloaded the large sample version of Lositan and successfully loaded the genepop file into the program on my Mac. However, when I run the program, it crashes while the cyan message bar shows "simulation pass to determine initial neutral set". The Java console printed this error:

Java Web Start
Using JRE version 1.7.0_75-b13 Java HotSpot(TM) 64-Bit Server VM
User home directory = /Users/garylongo
c:   clear console window
f:   finalize objects on finalization queue
g:   garbage collect
h:   display this help message
m:   print memory usage
o:   trigger logging
p:   reload proxy configuration
q:   hide console
r:   reload policy configuration
s:   dump system and deployment properties
t:   dump thread list
v:   dump thread stack
0-5: set trace level to <n>
Missing Application-Name manifest attribute for:
Missing Permissions manifest attribute in main jar:
Mac OS X /Users/garylongo /
0.170016 34
Unhandled exception in thread started by <bound method SplitFDist.monitor of <Bio.PopGen.FDist.Async.SplitFDist object at 0x19a>>
Traceback (most recent call last):
  File "/Users/garylongo/.lositan/Bio/PopGen/FDist/", line 128, in monitor
  File "/Users/garylongo/.lositan/", line 548, in report
    selLoci = getSelLoci(pv)
  File "/Users/garylongo/.lositan/", line 418, in getSelLoci
    p = getP(pv[currPos])
IndexError: index out of range: 0



After some trial and error reducing the data set size, I got the program to run to completion with 5,002 markers. I see that the program should be able to handle 40,000 markers on Mac and 100,000 on PCs. I assume these limits depend on sample size as well as the number of loci, rather than strictly on the number of loci. Is this true? I have seen other posts that suggest reducing the size of the data set or running separate analyses, but I would like to run the complete data set in a single run, both to analyze all my data together and to avoid skewing Fst calculations across separate runs. Any suggestions would be greatly appreciated.

Thanks for your time!





I don't have experience with Lositan, but I have a simple and effective implementation of Fst (Weir and Cockerham) for two populations. It is written in C++ and works directly from a VCF file.
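
For anyone who wants to see what that estimator computes, here is a minimal Python sketch of the per-locus Weir and Cockerham (1984) theta for one biallelic marker. It is not the C++ tool mentioned above and it does not read VCF; the function names and the (AA, Aa, aa) genotype-count input are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

# Hypothetical input type: (AA, Aa, aa) individual counts for one population.
GenotypeCounts = Tuple[int, int, int]


def wc84_components(pops: Sequence[GenotypeCounts]) -> Tuple[float, float, float]:
    """Return the variance components (a, b, c) of Weir & Cockerham (1984)
    for one biallelic locus, given genotype counts per population."""
    r = len(pops)
    if r < 2:
        raise ValueError("need at least two populations")
    n: List[int] = [sum(counts) for counts in pops]
    if min(n) < 1:
        raise ValueError("every population needs at least one individual")

    # Frequency of allele A and observed heterozygote proportion per population.
    p = [(2 * hom + het) / (2 * ni) for (hom, het, _), ni in zip(pops, n)]
    h = [het / ni for (_, het, _), ni in zip(pops, n)]

    n_tot = sum(n)
    n_bar = n_tot / r
    if n_bar <= 1:
        raise ValueError("average sample size must exceed 1")
    n_c = (n_tot - sum(ni * ni for ni in n) / n_tot) / (r - 1)

    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_tot
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_tot

    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = n_bar / n_c * (s2 - (core - h_bar / 4) / (n_bar - 1))  # among populations
    b = n_bar / (n_bar - 1) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)  # among individuals
    c = h_bar / 2  # within individuals
    return a, b, c


def wc84_fst(pops: Sequence[GenotypeCounts]) -> float:
    """Single-locus theta = a / (a + b + c); NaN for a monomorphic locus."""
    a, b, c = wc84_components(pops)
    total = a + b + c
    return a / total if total != 0 else float("nan")


if __name__ == "__main__":
    # Two populations of 10 individuals with different genotype counts.
    print(wc84_fst([(6, 3, 1), (1, 3, 6)]))
```

For a multi-locus estimate, the usual approach is to sum a and (a + b + c) across loci before taking the ratio, rather than averaging the per-locus values.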

