Question: Error when running oma-standalone v1.0.6: Error, (in RangeOfChunk_iterator) sequence 2 is too long
javier.herrero wrote, 2.6 years ago:

Dear all

I am trying to run OMA standalone on a set of primate genomes, using DNA sequences instead of AA. I am running OMA v1.0.6.

Some/many of my jobs are failing with the following error message (I am using -d 3 to get more information):

only_run_allall := true
only_run_dbconv := false
# SetRandSeed: SetRand(261301471):
Starting database conversion and checks...
Process 23089 on node-u04a-022: job nr 7 of 500
job 7 [pid 23089]: conversion done; waited for 0 sec
[pid  23089]: Computing gorilla vs human (Part 1281 of 2384). Mem: 0.288GB
   [pid  23089]: 5.00% complete, time left for this part=0.08h, 11.2% of AllAll done. Mem: 0.288GB
Error, (in RangeOfChunk_iterator) sequence 2 is too long
        executing statement: iterate(FullInd2Tuple(index,GS[name2,TotEntries]))
        locals defined as: name1 = gorilla, name2 = human, chunk = 1281, 
totChunks = 2384, all = 2383042332, first = 1279485816, last = 1280485414, index
 = 1279547847
        RangeOfChunk_iterator called with arguments: RangeOfChunk(gorilla,human,
1281)

Is this because some of the sequences are too long? If that is the case, what is the limit? Otherwise, what could I do to fix this issue?

Thanks

Tags: oma, orthologs

Further to my previous question, I have been playing with a toy dataset. Indeed, artificially lengthening one of the sequences beyond 100,000 bp seems to trigger this error.

Is there any workaround or any other recommended solution apart from either truncating or completely skipping this sequence?

In case you haven't guessed it, this error is triggered by the titin gene.
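For anyone wanting to check a dataset before launching the all-against-all, something like the following could list the offending sequences. It is only a minimal sketch in Python and not part of OMA: the function names are made up for illustration, the input is assumed to be plain FASTA, and the threshold is up to you, since only an approximate figure for the limit is known here.

```python
from typing import Dict, Iterable, List, Tuple


def sequence_lengths(lines: Iterable[str]) -> Dict[str, int]:
    """Return {record id: sequence length} for FASTA-formatted lines."""
    lengths: Dict[str, int] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the id.
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            lengths[current] = 0
        elif current is not None:
            lengths[current] += len(line)
    return lengths


def over_limit(lengths: Dict[str, int], limit: int) -> List[Tuple[str, int]]:
    """List (id, length) pairs strictly longer than `limit`, longest first."""
    hits = [(name, n) for name, n in lengths.items() if n > limit]
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

With an open file handle `fh`, `over_limit(sequence_lengths(fh), 100_000)` would return the records above the 100,000 bp mark that triggered the error in my toy dataset.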

Thanks again

written 2.6 years ago by javier.herrero
adrian.altenhoff (Switzerland) wrote, 2.6 years ago:

Hi Javier,

Yes, there is currently a hard limit of slightly over 100k AA. The exact value of the constant doesn't matter much. For the next release, I will increase it to 200k. Do you think that would be enough? In the meantime, you can skip these very long sequences, truncate them, or I can send you an alternative binary directly.
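The skip and truncate options could be applied to the input files with a small preprocessing step. The following is a rough sketch in Python under stated assumptions, not code from OMA itself: the helper names (`read_fasta`, `limit_length`, `write_fasta`) are hypothetical, the input is assumed to be plain FASTA, and `max_len` should be chosen safely below the limit, whose exact value is not given here.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (header line without ">", full sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) records from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line and header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def limit_length(records: Iterable[Record], max_len: int,
                 truncate: bool = False) -> Iterator[Record]:
    """Skip records longer than max_len, or cut them to max_len if truncate."""
    for header, seq in records:
        if len(seq) <= max_len:
            yield header, seq
        elif truncate:
            yield header, seq[:max_len]
        # otherwise the over-long record is dropped


def write_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Format records as FASTA text with sequence lines of `width` characters."""
    out: List[str] = []
    for header, seq in records:
        out.append(">" + header)
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)
```

Skipping is the default because a truncated sequence may give misleading alignments; pass `truncate=True` only if a partial sequence is acceptable for your analysis.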

Best wishes, Adrian


Hi Adrian

Thank you for confirming. 200k should do the trick in my case.

I was thinking that, in addition to increasing the limit, you could filter out long sequences at the DB building stage, just as you filter short ones. That way the error would not be triggered.

Regards, Javier

written 2.6 years ago by javier.herrero

Yes, that is a good point. Thanks, will include!

written 2.6 years ago by adrian.altenhoff