KEGG Metadata Download - Out Of Memory Because Of Data Size
12.1 years ago
Siva Kumar ▴ 30

I am trying to retrieve KEGG metadata (compound IDs, gene IDs, etc.) using the bfind method of the KEGG API. I want to retrieve all gene IDs from the KEGG database, but the gene count is so large that my Java code crashes with an out-of-memory exception. Is there an efficient way to retrieve all the IDs, without duplicates and with low memory usage?

Thank you.

kegg java memory • 2.5k views
12.1 years ago

If you work on a Unix platform like most bioinformaticians do, the simple solution is to retrieve the non-unique ID list directly to a file using curl or wget and use sort -u to produce a unique list.
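For readers who would rather stay in Java, the same stream-then-deduplicate idea can be sketched roughly as follows. This is only a minimal sketch, not KEGG API code: the class name `UniqueIdWriter` and the method `writeUniqueIds` are hypothetical, and it assumes the raw listing is plain text with one record per line and the ID in the first tab-separated field. Only the set of distinct IDs is held in memory; the rest of each line is discarded as it is read.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper, not part of the KEGG API or Apache Axis.
final class UniqueIdWriter {

    /**
     * Reads the source line by line, treats the first tab-separated field
     * as the ID (an assumption about the input format), and writes each
     * distinct ID once, in order of first appearance.
     *
     * @return the number of distinct IDs written
     */
    static int writeUniqueIds(Reader source, Writer sink) throws IOException {
        Set<String> seen = new HashSet<>();
        BufferedReader reader = new BufferedReader(source);
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                continue; // skip blank lines
            }
            int tab = line.indexOf('\t');
            String id = (tab >= 0) ? line.substring(0, tab) : line;
            if (seen.add(id)) { // add() returns false for an ID already seen
                sink.write(id);
                sink.write('\n');
            }
        }
        sink.flush();
        return seen.size();
    }
}
```

In practice the `Reader` could wrap a file previously saved with curl or wget, and the `Writer` a `BufferedWriter` on the output file. Unlike `sort -u`, this keeps first-appearance order rather than sorted order, and it still needs enough heap for the set of distinct IDs.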

But since you say you are doing this in Java, I guess that is not the answer you are looking for. Assuming you are not actually running out of physical memory on your machine, you may simply need to raise the JVM heap limits with the -Xms (initial heap size) and -Xmx (maximum heap size) options.
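One quick way to confirm that the heap options actually reached the JVM is to ask the runtime what it sees. The sketch below uses only the standard `java.lang.Runtime` methods; the class name `HeapReport` is made up for illustration.

```java
// Hypothetical helper for checking the effective heap settings.
final class HeapReport {

    private static final long MIB = 1024L * 1024L;

    /** Upper bound the JVM will try to use for the heap (reflects -Xmx). */
    static long maxHeapMiB() {
        return Runtime.getRuntime().maxMemory() / MIB;
    }

    /** Heap currently occupied: committed memory minus the free part of it. */
    static long usedHeapMiB() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MIB;
    }

    /** One-line summary suitable for logging before and after a large download. */
    static String describe() {
        return "max heap = " + maxHeapMiB() + " MiB, used = " + usedHeapMiB() + " MiB";
    }
}
```

Printing `HeapReport.describe()` at startup after launching with, for example, `java -Xms256m -Xmx2g ...` should show a maximum close to the requested value. If it does not, the options are probably being passed to the wrong process (an IDE launcher or application server, for instance).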


Thank you. I have already used -Xms and -Xmx. My current concern is Apache Axis 1.4, the only Axis version the KEGG API is compatible with. The memory is not released after the data has been retrieved, even though the KeggLocator and KeggPortType references were set to null. Is there a way to dispose of them?


Unfortunately, I am neither a Java nor an Apache Axis guru, but it sounds like you need to wait for garbage collection to take place. Maybe this will help: http://stackoverflow.com/questions/1481178/forcing-garbage-collection-in-java
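To illustrate the point about garbage collection: setting one variable to null only helps if nothing else still references the object, and `System.gc()` is a request, not a guarantee. The sketch below, with the hypothetical class name `GcProbe`, uses a standard `java.lang.ref.WeakReference` to observe whether an object has been reclaimed. Nothing here is specific to Axis or the KEGG classes.

```java
import java.lang.ref.WeakReference;

// Hypothetical diagnostic helper; not tied to Axis or the KEGG API.
final class GcProbe {

    /** Wraps the target in a weak reference, which does not keep it alive. */
    static <T> WeakReference<T> track(T target) {
        return new WeakReference<>(target);
    }

    /** Asks the JVM to collect garbage; the JVM is free to ignore the request. */
    static void requestGc() {
        System.gc();
    }

    /** True once the tracked object has been reclaimed by the collector. */
    static boolean isCollected(WeakReference<?> ref) {
        return ref.get() == null;
    }
}
```

A possible use is to call `track(...)` on the large result object before dropping it, then check `isCollected(...)` after `requestGc()`. If it stays false across several attempts, some other strong reference (a field, a collection, a cached response) is probably still holding the data, and nulling the locator and port alone will not free it.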
