Question: OMA error in reading the all-against-all files
17 days ago, nc52 wrote:

Hi there,

I am having a problem with OMA not being able to read files. The error message is:

****

Reading the all-against-all files...

Error, '
unable to ReadProgram(Cache/AllAll/Dimm_proteins_A/Dimm_proteins_A/part_78-177)'

It's no surprise that it can't read it, because I checked the Cache and this part has never been created. I am not sure how to deal with this. I am running OMA standalone on a cluster using the Sun Grid Engine queue manager. The job has stalled at this point (the all-against-all is not complete). The job was submitted like so:

export NR_PROCESSES=100
qsub -t 1-$NR_PROCESSES OMA.sh

The error message is reported in the most recently modified output file. I have read in some other posts that similar errors can be fixed by deleting the part that OMA is having trouble with and relaunching the job, but since this part does not exist at all I am unsure what to do. I would be grateful for any advice. Please let me know if you need more information to understand the problem.

Many thanks, Nicki

17 days ago, adrian.altenhoff wrote:

Hi Nicki,

If that file has never existed, the process should not have reached this point. There are two possibilities to check:

(1) The file is usually compressed, so please check whether the file Cache/AllAll/Dimm_proteins_A/Dimm_proteins_A/part_78-177.gz exists. This is what the process actually tries to read. If it exists, it might be corrupted, and you could resolve the problem by removing it and restarting.

(2) On the cluster, files older than a certain threshold may get purged automatically. This often happens on scratch filesystems. In that case, you would need to touch the files before they get purged.
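Both checks above can be sketched in shell. This is only a minimal sketch: the function names `find_corrupt_parts` and `refresh_timestamps` are made up for illustration and are not part of OMA, and the only assumption about the cache is that compressed part files end in `.gz`, as in the path quoted in this thread. `gzip -t` tests the integrity of a compressed file without extracting it.

```shell
# Hypothetical helpers, not part of OMA standalone.

# Print every *.gz file under the directory given as $1 that fails
# gzip's built-in integrity test (gzip -t exits non-zero on a bad file).
find_corrupt_parts() {
    find "$1" -type f -name '*.gz' -exec sh -c '
        for f in "$@"; do
            gzip -t "$f" 2>/dev/null || printf "%s\n" "$f"
        done' sh {} +
}

# Set the modification time of every file under the directory given as $1
# to "now", so an age-based purge policy does not treat the files as stale.
refresh_timestamps() {
    find "$1" -type f -exec touch {} +
}
```

For example, `find_corrupt_parts Cache/AllAll` would list suspicious part files, and `refresh_timestamps Cache` would update the timestamps of the whole cache. Any file reported as corrupt should only be removed once no OMA process is still writing to the cache.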

Hope this will solve your problem.

Best wishes, Adrian


Hi Adrian,

Thanks for your quick reply. I have checked the cache and Cache/AllAll/Dimm_proteins_A/Dimm_proteins_A/part_78-177.gz does not exist.

The files that OMA is generating are stored in my own directory, so they should not be purged by any automated process. I am not sure whether jobs on the cluster may be taking too long, but I don't think this happens.

Any other thoughts appreciated before I have to start from the beginning!!

Many thanks, Nicki

Reply written 16 days ago by nc52

Hi Nicki, that is indeed quite weird. Did you try starting OMA again? It should generate the missing part files anyway, without redoing anything that has already been computed.

Could you also check whether you have any checkpoint files in the AllAll directory:

find Cache/AllAll -type f -name "*.ckpt"

Adrian

Reply written 16 days ago by adrian.altenhoff

Hi Adrian,

Sorry, I had to go to the lab.

I issued the find command exactly as above and nothing was displayed in the terminal. I don't know what that means.

Also, I relaunched OMA from the same directory without changing anything, and it doesn't seem to be working very well.

1) The first output file produced has the same error I originally mentioned above.

2) One of the last output files produced shows the following, so it looks as though OMA knows the file is not there:

* At least 1 process appears to be still computing the all-vs-all.
* The following file(s) is (are) not yet completed: Cache/AllAll/Dimm_proteins_A/Dimm_proteins_A/part_78-177

** If no other process is running, delete these files and restart.

3) I am also seeing this error in a lot of the output files:

Reading GO file...
Error, (in ConvertRawFile) unable to OpenReading(rawfile)

4) A single process out of the 100 is still running and seems to be trying to compute some other part of the all-vs-all comparisons.

I am a bit stumped. Is there a way I can delete some of the comparisons in a sensible order and sort of strip it back to a certain point in the process?

Many thanks for your help, Nicki

Reply written 16 days ago by nc52

Hi Nicki,

1) It's good that the find command did not return any results (that means there was no pending checkpoint file).

2) Which version of OMA standalone are you using? Could it be the same issue as in "OMA error - download from gene ontology not working 404"? That should be fixed in OMA standalone 2.3.1.

3) That is actually good. The single process is computing the missing part file. Once it is done, it will continue with the second stage of OMA standalone (which cannot be run in parallel).
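To see how many of the per-task output files report each of the errors mentioned in this thread, something like the following could be used. It is a minimal sketch: `files_with_message` is a made-up helper name, and it assumes all output files of the array job sit together in one directory.

```shell
# Hypothetical helper, not part of OMA standalone.
# $1 = directory holding the job output files
# $2 = fixed string to look for (no regular expression syntax)
# Prints the name of every file in that directory containing the string.
files_with_message() {
    grep -l -F -e "$2" "$1"/* 2>/dev/null
}
```

For example, `files_with_message . 'unable to OpenReading(rawfile)' | wc -l` would count the output files in the current directory that hit the GO file error, and `files_with_message . 'unable to ReadProgram'` would list those that failed to read a part file.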

Best wishes, Adrian

Reply written 16 days ago by adrian.altenhoff