I'm trying to run RepeatMasker on an Amazon EC2 machine. The files run fine with the standard settings (a single sequence at a time), but that will take 20 hours. I'm analyzing huge files, which I know isn't optimal with RepeatMasker, but I'm not aware of any other options for retroelement identification and quantification. I see there has been a similar post before, but it wasn't resolved.
I tried the following command:
RepeatMasker -pa 32 S5.fa
It starts off great, but just after the refining SINE/ALU step I get the "can't fork" error and it dies. I'm running this on a c3.8xlarge with Ubuntu. I tried
6. Now at 6 I'm able to proceed, and I've switched from a c3.8xlarge to an r3.2xlarge. I'm 50 batches into 5000, with no fork error yet.
Any thoughts on why I'm limited to 6 processors? I'm beyond excited to bump up my analysis speed 6x, but I'm also worried that a few hours in I'll get a fork error and have to start from scratch using the standard settings.
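In case it helps with diagnosis, here is a minimal sketch of the checks that could be run on the instance before launching. It assumes a Linux machine with /proc mounted and with awk and nproc available. It also assumes that the "can't fork" error comes from either the per-user process limit or a shortage of memory, which are the usual reasons a fork fails; I have not confirmed that this is what happens inside RepeatMasker. The variable names are just illustrative.

```shell
# Diagnostic sketch (assumption: Linux with /proc, awk and nproc available).
# A failed fork is usually caused by the per-user process limit or by
# insufficient memory, so print both alongside the CPU count.

# Soft limit on the number of processes for the current user.
max_procs=$(awk '/^Max processes/ {print $3}' /proc/self/limits)

# Number of CPUs visible to this shell, for comparison with the -pa value.
cpu_count=$(nproc)

# Total and free memory in kB, as reported by the kernel.
mem_total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
mem_free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)

echo "max user processes (soft limit): ${max_procs}"
echo "visible CPUs: ${cpu_count}"
echo "memory total (kB): ${mem_total_kb}"
echo "memory free (kB): ${mem_free_kb}"
```

If the process limit is low or free memory is small relative to the number of parallel jobs, that would be consistent with a larger -pa value failing while 6 succeeds, though it would not prove it.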