I am using PoPoolation to calculate Fst values for my pool-seq data with the fst-sliding.pl script. Since the synchronized file is ~2 TB, I thought it would be better to parallelize the process. I therefore split the synchronized file into small chunks and run multiple jobs (at most 50) at the same time on a cluster.
Each job should take ~1 day. However, when I start multiple jobs, only about 20 of the output files start being filled (the PoPoolation .fst output files grow in size), while the others remain empty until the first ones have finished. I know that all of the jobs actually started because I can see the .params file for each of them.
I have no clue why this is happening. I thought it might be because all the jobs call the same Perl script and it somehow can't handle so many simultaneous calls. Has anyone experienced this before, or does anyone have an idea how to solve it?