Issues with Centrifuge indexing
4 months ago

Hi! I am very new to bioinformatics and am trying to follow the Centrifuge tutorial (https://ccb.jhu.edu/software/centrifuge/manual.shtml#centrifuge-example). I am having trouble building an index with centrifuge-build.

I am trying to run this code from the tutorial on Centrifuge 1.0.4 Beta:

centrifuge-build -p 4 --conversion-table seqid2taxid.map \
--taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
input-sequences.fna abv


My output looks like this:

Settings:
  Output files: "abv.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
input-sequences.fna
Calculating joined length
Reserving space for joined string
Joining reference sequences
Killed


Any help would be greatly appreciated. Thank you!

Are you running out of memory? How much memory do you have available?

Hi! I don't think I am running out of memory. I've requested 45000 from our server, and the file is only 86 GB. I can request more memory if you think that would help! Thank you!

Is that 45G? Then you should try asking for more.
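
A quick sanity check is to compare the size of the input file with the memory you requested. The sketch below assumes the number you gave the scheduler is in megabytes (many schedulers treat a bare number that way, but check yours); `fits_in_memory` is a hypothetical helper name, not part of Centrifuge. If 45000 does mean megabytes, that is roughly 44 GB, about half the size of an 86 GB input.

```shell
# Hypothetical helper: does a file fit within a memory budget given in megabytes?
fits_in_memory() {
    # $1 = path to the file, $2 = memory budget in megabytes (assumed unit)
    size_bytes=$(( $(wc -c < "$1") ))      # file size in bytes
    budget_bytes=$(( $2 * 1024 * 1024 ))   # budget converted to bytes
    if [ "$size_bytes" -le "$budget_bytes" ]; then
        echo "file fits within budget"
    else
        echo "file exceeds budget"
    fi
}
```

For example, `fits_in_memory input-sequences.fna 45000`. The log stops at "Joining reference sequences", right after space is reserved for the joined string, so the build probably needs at least as much memory as the input itself; I don't know the exact overhead for Centrifuge.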

Hi, I think it's actually 4.5 GB. I've just been given a green light for 320 GB, so I'm going to give that a whack. Thank you so much!

4 months ago

Killed usually means that the process was terminated by the operating system (as opposed to the program crashing on its own).

So it does sound like it is running out of resources.
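
One way to confirm this is to look at the exit status right after the command stops. In POSIX shells, a command terminated by a signal reports 128 plus the signal number, so SIGKILL (signal 9, which the Linux out-of-memory killer sends) shows up as 137. A minimal sketch, where `explain_exit` is a hypothetical helper name:

```shell
# Hypothetical helper: interpret a shell exit status.
explain_exit() {
    # $1 = exit status of the previous command (e.g. "$?")
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$(( status - 128 ))            # signal number that ended the process
        echo "terminated by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}
```

Run it immediately after the build, for example `centrifuge-build ... ; explain_exit $?`. A status of 137 is consistent with the kernel or the job scheduler killing the process, though it does not by itself prove that memory was the reason.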

Try making a smaller file (a subset of what you have), then see if you can run the process all the way through.
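
A minimal sketch of one way to make such a subset, keeping only the first N records of the FASTA file. `fasta_head` and `subset.fna` are hypothetical names chosen for illustration:

```shell
# Hypothetical helper: print the first N records of a FASTA file.
fasta_head() {
    # $1 = input FASTA file, $2 = number of records to keep
    # Count header lines (starting with ">") and stop once the count exceeds N.
    awk -v n="$2" '/^>/ { count++ } count > n { exit } { print }' "$1"
}
```

For example, `fasta_head input-sequences.fna 1000 > subset.fna`, then rerun the same centrifuge-build command with `subset.fna` in place of `input-sequences.fna`. I have not checked whether the conversion table also needs trimming to match the subset, so treat that as an open question.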