Task killed while SOAPdenovo was running
5.8 years ago
Yingzi Zhang ▴ 90

Hi all, I was running SOAPdenovo-127mer with 12 libraries. While importing reads from the files of the 7th library, it reported that the task was killed, with no further information. I re-ran the same command and the task was killed again while importing the 7th library. What could be the cause? Nobody killed the task manually. Could it be a memory issue? Thank you.

Yingzi

assembly software error SOAPdenovo • 2.0k views

Can you please share the config file contents?


Yes. Here it is:

#config_file#
max_rd_len=100
[LIB]
avg_ins=469
reverse_seq=0
asm_flags=3
rank=1
map_len=32
q1=1_1.fastq
q2=1_2.fastq
[LIB]
avg_ins=467
reverse_seq=0
asm_flags=3
rank=1
q1=2_1.fastq
q2=2_2.fastq
[LIB]
avg_ins=474
reverse_seq=0
asm_flags=3
rank=1
map_len=32
q1=3_1.fastq
q2=3_2.fastq

... 

[LIB]
avg_ins=469
reverse_seq=0
asm_flags=3
rank=1
map_len=32
q1=12_1.fastq
q2=12_2.fastq

I wrote the config file by imitating examples on the Internet. The insert sizes were estimated with a BAM-analysis tool called Qualimap. The fastq files are about 50Gb each.
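As a rough cross-check of how much input a config like the one above points at, the read files can be listed and their on-disk sizes summed. This is only a minimal sketch: the function names are made up for illustration, it looks only at the `q1=` and `q2=` keys that appear in the config above, and file size on disk is at best a loose proxy for the memory an assembler needs.

```python
import os

READ_KEYS = {"q1", "q2"}  # only the keys used in the config shown above


def list_read_files(config_text):
    """Return the paths given on q1=/q2= lines of a SOAPdenovo-style config."""
    paths = []
    for line in config_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and section headers such as [LIB].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep and key.strip() in READ_KEYS:
            paths.append(value.strip())
    return paths


def total_size_gb(paths):
    """Sum file sizes in GB (10**9 bytes), ignoring paths that do not exist."""
    return sum(os.path.getsize(p) for p in paths if os.path.exists(p)) / 1e9


example = """max_rd_len=100
[LIB]
avg_ins=469
q1=1_1.fastq
q2=1_2.fastq
"""
print(list_read_files(example))
```

To use it on a real run, read the config file into a string and pass the result of `list_read_files` to `total_size_gb`.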

ADD REPLY
0
Entering edit mode

Likely a memory problem. How much memory are you using?
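One way to check the memory hypothesis on Linux is to look in the kernel log for out-of-memory killer messages. The sketch below assumes a Linux machine where `dmesg` is readable by the current user (on some systems it needs root), and it assumes the common "Out of memory: Kill(ed) process PID (name)" wording, which can differ between kernel versions. The sample log line is made up for illustration.

```python
import re
import subprocess

# Covers both the older "Kill process" and the newer "Killed process" wording.
OOM_PATTERN = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, process_name) tuples for OOM-killer messages in log text."""
    return [(int(m.group(1)), m.group(2)) for m in OOM_PATTERN.finditer(log_text)]


def read_kernel_log():
    """Return the output of `dmesg` as text (may require elevated privileges)."""
    result = subprocess.run(["dmesg"], capture_output=True, text=True, check=False)
    return result.stdout


# Illustrative line only, not taken from the machine discussed in this thread.
sample = "[12345.678901] Out of memory: Killed process 4242 (SOAPdenovo-127m) total-vm:0kB"
print(find_oom_kills(sample))
```

If `find_oom_kills(read_kernel_log())` returns an entry whose name starts with `SOAPdenovo` at around the time the job died, the kill was most likely caused by memory exhaustion.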


RAM: 768 GB; internal storage: 50 TB


Are you the only user on this machine? 768G may not be enough for a large data set (it looks like you have ~600G). You may want to normalize the dataset (bbnorm.sh from the BBMap suite can do this). Just so you are aware, that operation may also take a large amount of memory.
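A normalization run with bbnorm.sh could be scripted along these lines. This is a sketch under assumptions: it uses the `in=`, `in2=`, `out=`, `out2=`, `target=` and `min=` parameters as described in the BBNorm documentation, the output file names are hypothetical, and the depth values are placeholders that should be chosen for the data at hand.

```python
import subprocess


def bbnorm_command(in1, in2, out1, out2, target=100, min_depth=5):
    """Build a bbnorm.sh command line for one pair of paired-end fastq files."""
    return [
        "bbnorm.sh",
        f"in={in1}",
        f"in2={in2}",
        f"out={out1}",
        f"out2={out2}",
        f"target={target}",  # depth to normalize down to (placeholder value)
        f"min={min_depth}",  # reads below this depth are dropped (placeholder value)
    ]


def run_bbnorm(cmd):
    """Run the command; requires bbnorm.sh from the BBMap suite on PATH."""
    return subprocess.run(cmd, check=True)


# Input names follow the config above; output names are hypothetical.
cmd = bbnorm_command("1_1.fastq", "1_2.fastq", "1_1.norm.fastq", "1_2.norm.fastq")
print(" ".join(cmd))
```

Building the command separately from running it makes it easy to print and inspect the exact call before committing memory and time to it.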


Yes, I am the only user. I have about 400G of reads in total. Do you have any experience with how much memory I would need?


It would likely depend on the number of unique k-mers in your data. You can try @Brian's suggestion in this thread to estimate it (How to estimate peak memory usage of SOAPdenovo).

Are these 12 separate libraries or multiple runs of the same library? You should be able to reduce the redundancy of the data in either case. Having too much coverage is actually also bad for de novo assemblies (it sounds counterintuitive, but it is true: de novo sequence assembly with extremely high coverage).
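To get a feel for how much redundancy there is, sequencing depth can be estimated as total sequenced bases divided by genome size, and from that the fraction of reads to keep for a chosen target depth. The numbers below are assumptions for illustration only: the thread does not state the genome size, "400G" is read here as 400 gigabases, and the target depth is a placeholder rather than a recommendation.

```python
def coverage(total_bases, genome_size):
    """Average sequencing depth: total sequenced bases over genome size."""
    return total_bases / genome_size


def keep_fraction(current_cov, target_cov):
    """Fraction of reads to keep to reach target_cov (never more than all)."""
    return min(1.0, target_cov / current_cov)


total_bases = 400 * 10**9  # assumption: "400G" means 400 gigabases
genome_size = 10**9        # hypothetical 1 Gb genome, not given in the thread
target_cov = 100.0         # placeholder target depth

cov = coverage(total_bases, genome_size)
print(cov, keep_fraction(cov, target_cov))
```

With these made-up inputs the data would be about 400x deep, so roughly a quarter of the reads would reach the placeholder target. Real values depend entirely on the actual genome size.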


They are multiple runs of the same library. I am not clear on how to arrange runs into one library or several, nor on how to set the rank option. I will take your advice and estimate peak memory usage first. :) Thank you. On a positive note, I ran the 7th library on its own and it ran smoothly to completion.

Yingzi
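On arranging runs of the same library: as far as I recall, the example config distributed with SOAPdenovo lists more than one `q1=`/`q2=` pair inside a single [LIB] section, and describes rank as the order in which libraries are used for scaffolding, with libraries of equal rank used together. Treat both points as assumptions to verify against the SOAPdenovo documentation. Under those assumptions, a single-library config could be generated like this. The function name is made up, and the `avg_ins` value is a placeholder taken from one of the runs above.

```python
def single_library_config(pairs, avg_ins, max_rd_len=100, rank=1, map_len=32):
    """Build a config with one [LIB] section holding several fastq pairs.

    Assumes SOAPdenovo accepts repeated q1=/q2= pairs within one [LIB]
    section; check this against the official documentation before use.
    """
    lines = [
        f"max_rd_len={max_rd_len}",
        "[LIB]",
        f"avg_ins={avg_ins}",
        "reverse_seq=0",
        "asm_flags=3",
        f"rank={rank}",
        f"map_len={map_len}",
    ]
    for q1, q2 in pairs:
        # Each read 1 file is immediately followed by its read 2 file.
        lines.append(f"q1={q1}")
        lines.append(f"q2={q2}")
    return "\n".join(lines) + "\n"


runs = [("1_1.fastq", "1_2.fastq"), ("2_1.fastq", "2_2.fastq")]
config_text = single_library_config(runs, avg_ins=469)
print(config_text)
```

Since all runs come from one library, a single `avg_ins` and a single rank apply to all of them, which removes the need to pick a rank per run.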
