Issues with Centrifuge indexing
4 months ago

Hi! I am very new to bioinformatics and am trying to follow the Centrifuge tutorial (https://ccb.jhu.edu/software/centrifuge/manual.shtml#centrifuge-example). I am having trouble building an index with centrifuge-build.

I am trying to run this code from the tutorial on Centrifuge 1.0.4 Beta:

centrifuge-build -p 4 --conversion-table seqid2taxid.map \
--taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
input-sequences.fna abv


My output looks like this:

Settings:
  Output files: "abv.*.cf"
Line rate: 7 (line is 128 bytes)
Lines per side: 1 (side is 128 bytes)
Offset rate: 4 (one in 16)
FTable chars: 10
Strings: unpacked
Local offset rate: 3 (one in 8)
Local fTable chars: 6
Max bucket size: default
Max bucket size, sqrt multiplier: default
Max bucket size, len divisor: 4
Difference-cover sample period: 1024
Endianness: little
Actual local endianness: little
Sanity checking: disabled
Assertions: disabled
Random seed: 0
Sizeofs: void*:8, int:4, long:8, size_t:8
Input files DNA, FASTA:
input-sequences.fna
Calculating joined length
Reserving space for joined string
Joining reference sequences
Killed


Any help would be greatly appreciated. Thank you!


Are you running out of memory? How much memory do you have available?


Hi! I don't think I am running out of memory. I've requested 45000 from our server, and the file is only 86 GB. I can request more memory if you think that would help! Thank you!


Is that 45G? Then you should try asking for more.


Hi, I think it's actually 4.5 GB. I've just been given the green light for 320 GB, so I'm going to give that a whack. Thank you so much!

4 months ago

"Killed" usually means that the process was terminated by the operating system (as opposed to the program crashing on its own).

So it does sound like it is running out of resources.
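
A quick way to sanity-check this is to compare the size of the input with the memory you actually have. The sketch below is only a rough guide: the helper names `file_size_gib` and `mem_available_gib` are made up for illustration, it assumes a Linux machine with /proc/meminfo, and it rests on my assumption (not a documented requirement) that the build needs at least roughly as much memory as the input is large, since the log stops right after "Reserving space for joined string". On a cluster, the limit that matters is whatever your job was granted by the scheduler, which this does not show.

```shell
# Hypothetical helpers, not part of Centrifuge.

# Print the size of a file in whole GiB (rounded down).
file_size_gib() {
    echo $(( $(wc -c < "$1") / 1024 / 1024 / 1024 ))
}

# Print the memory the kernel reports as available, in whole GiB.
# Linux only: reads the MemAvailable field (in kB) from /proc/meminfo.
mem_available_gib() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 / 1024 }' /proc/meminfo
}

# Example use (commented out so the block can be sourced safely):
# file_size_gib input-sequences.fna
# mem_available_gib
```

If the first number is larger than the second (or larger than the memory your job requested), being killed at the joining step is not surprising.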

Try making a smaller file (a subset of what you have), then see if you can run the process all the way through.
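
One way to make such a subset is to keep only the first N records of the FASTA file. This is a minimal sketch, assuming a standard FASTA layout where every record starts with a ">" header line; the function name `fasta_head`, the record count 1000, and the names subset.fna and abv_subset are placeholders I picked, not anything from the tutorial.

```shell
# Hypothetical helper: print the first N records of a FASTA file to stdout.
# Usage: fasta_head FILE N
fasta_head() {
    awk -v n="$2" '
        /^>/ { count++ }      # each header line starts a new record
        count > n { exit }    # stop reading once N records have been printed
        { print }
    ' "$1"
}

# Example use (commented out so the block can be sourced safely).
# The build command is the one from the question, pointed at the subset:
# fasta_head input-sequences.fna 1000 > subset.fna
# centrifuge-build -p 4 --conversion-table seqid2taxid.map \
#     --taxonomy-tree taxonomy/nodes.dmp --name-table taxonomy/names.dmp \
#     subset.fna abv_subset
```

Because awk exits as soon as it has enough records, this does not need to read the whole 86 GB file. If the subset builds cleanly, that points to memory rather than a problem with the command itself.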