Can't convert SRA to FASTQ with fasterq-dump. fasterq-dump was killed (signal 9 SIGKILL)
16 months ago
hazirliver ▴ 10

Hi! I'm trying to convert .sra files to .fastq with the fasterq-dump tool, but every file fails with the same error: fasterq-dump was killed (signal 9 SIGKILL). For example, the processing logs for SRX2481673/SRR5164647.sra and SRX2481676/SRR5164650.sra are shown below; I get exactly the same error on the other files.

INFO - SRX2481673, SRX2481676 will be processed
INFO - Processing SRX2481673/SRR5164647.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size     : 1,073,741,824 bytes
INFO - mem-limit    : 13,958,643,712 bytes
INFO - threads      : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.74/'
INFO - total ram    : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode   : on
INFO - output-file  : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - output-dir   : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/'
INFO - output       : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - append-mode  : 'NO'
INFO - stdout-mode  : 'NO'
INFO - seq-defline  : '@$ac.$si $sn length=$rl'
INFO - qual-defline  : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned   : 'NO'
INFO - accession     : 'SRR5164647'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481673/SRR5164647.sra'
INFO - est. output          : 21,238,500,400 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS)      : 1,462,599,024,640 bytes
INFO - disk-limit-tmp (OS)  : 1,462,599,024,640 bytes
INFO - out/tmp on same fs   : 'NO'
INFO - 
INFO - SRR5164647 is local
INFO - ... has a size of 3,611,444,133 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 39,822,941
INFO - SEQ.spot_count = 39,822,941
INFO - SEQ.total_base_count = 7,831,644,330
INFO - SEQ.bio_base_count = 7,831,644,330
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 77,821,594
INFO - ALIGN.spot_count = 77,821,594
INFO - ALIGN.total_base_count = 7,651,115,820
INFO - ALIGN.bio_base_count = 7,651,115,820
INFO - 
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)
INFO - Processing SRX2481676/SRR5164650.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size     : 1,073,741,824 bytes
INFO - mem-limit    : 13,958,643,712 bytes
INFO - threads      : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.130/'
INFO - total ram    : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode   : on
INFO - output-file  : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - output-dir   : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/'
INFO - output       : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - append-mode  : 'NO'
INFO - stdout-mode  : 'NO'
INFO - seq-defline  : '@$ac.$si $sn length=$rl'
INFO - qual-defline  : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned   : 'NO'
INFO - accession     : 'SRR5164650'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481676/SRR5164650.sra'
INFO - est. output          : 16,726,708,244 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS)      : 1,462,598,959,104 bytes
INFO - disk-limit-tmp (OS)  : 1,462,598,959,104 bytes
INFO - out/tmp on same fs   : 'NO'
INFO - 
INFO - SRR5164650 is local
INFO - ... has a size of 2,857,778,130 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 31,376,520
INFO - SEQ.spot_count = 31,376,520
INFO - SEQ.total_base_count = 6,166,997,722
INFO - SEQ.bio_base_count = 6,166,997,722
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 61,510,763
INFO - ALIGN.spot_count = 61,510,763
INFO - ALIGN.total_base_count = 6,043,540,736
INFO - ALIGN.bio_base_count = 6,043,540,736
INFO - 
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)

I run fasterq-dump with the following arguments:

bufsize='1G',
curcache='1G',
mem='13G',
threads=3,
disk-limit-tmp = '120G'

What could be the problem and how can I fix it?
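
For reference, the arguments above would map onto a fasterq-dump command line roughly as in the sketch below. This is only an illustration: the helper name `build_fasterq_dump_cmd` is hypothetical, the `--temp` directory is assumed from the scratch-path shown in the log, and the code only builds the argument list without running anything.

```python
def build_fasterq_dump_cmd(sra_path, outdir, tmpdir,
                           bufsize="1G", curcache="1G", mem="13G",
                           threads=3, disk_limit_tmp="120G"):
    """Return the argv list for one fasterq-dump call (nothing is executed)."""
    return [
        "fasterq-dump", str(sra_path),
        "--outdir", str(outdir),
        "--temp", str(tmpdir),            # scratch directory
        "--bufsize", bufsize,
        "--curcache", curcache,
        "--mem", mem,
        "--threads", str(threads),
        "--disk-limit-tmp", disk_limit_tmp,
    ]


# Paths taken from the log above; the tmp directory is an assumption.
cmd = build_fasterq_dump_cmd(
    "/home/asokolov/osrp/tmp/SRAs/SRX2481673/SRR5164647.sra",
    "/home/asokolov/osrp/tmp/FASTQs/SRX2481673/",
    "/home/asokolov/osrp/tmp/tmp_files",
)
print(" ".join(cmd))
```

The resulting list could be handed to `subprocess.run(cmd)` in the calling script.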

fastq fasterq-dump • 2.1k views

Save yourself the trouble and download the FASTQ files directly using:

#!/usr/bin/env bash
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR516/007/SRR5164647/SRR5164647_1.fastq.gz -o SRR5164647_GSM2452298_Donor1_CD33_CAR_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR516/007/SRR5164647/SRR5164647_2.fastq.gz -o SRR5164647_GSM2452298_Donor1_CD33_CAR_Homo_sapiens_RNA-Seq_2.fastq.gz

This script was generated by https://sra-explorer.info. Information on how to use: sra-explorer : find SRA and FastQ download URLs in a couple of clicks
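
If the run accessions are only known at runtime, URLs like the ones above can probably be built in code. The sketch below assumes ENA's usual directory layout (first six characters of the accession, then a zero-padded subdirectory taken from the digits beyond the sixth) and `_1`/`_2` naming for paired-end runs; the function name `ena_fastq_urls` is hypothetical, and the files are not guaranteed to exist for every run.

```python
def ena_fastq_urls(run_accession, paired=True):
    """Build ENA FTP FASTQ URLs for a run accession (assumed layout)."""
    prefix = run_accession[:6]
    # Accessions longer than 9 characters get an extra zero-padded
    # subdirectory built from the trailing digits (e.g. "7" -> "007").
    extra = len(run_accession) - 9
    subdir = run_accession[-extra:].zfill(3) + "/" if extra > 0 else ""
    base = (
        "ftp://ftp.sra.ebi.ac.uk/vol1/fastq/"
        f"{prefix}/{subdir}{run_accession}/"
    )
    mates = ("_1", "_2") if paired else ("",)
    return [f"{base}{run_accession}{mate}.fastq.gz" for mate in mates]


for url in ena_fastq_urls("SRR5164647"):
    print(url)
```

For SRR5164647 this reproduces the two URLs in the script above.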


I don't think this is a good solution for me, because my script receives the list of SRXs to process dynamically.


You can use ffq - https://github.com/pachterlab/ffq - to dynamically obtain the download URLs from SRXs.


Thank you! I think this is the best solution for my task.


Wanted to mention it in case you were able to use it.

SIGKILL (signal 9) indicates immediate process termination. Are you exceeding any other resource allocation for your account (e.g. RAM)? The log above says that the disk limits were not exceeded.
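
If the conversion is launched from a Python script, the wrapper can tell a signal kill apart from an ordinary failure, since `subprocess` reports a negative return code when the child was terminated by a signal. A minimal sketch (the helper name `describe_returncode` is hypothetical, and the child process here just kills itself to stand in for fasterq-dump on a POSIX system):

```python
import signal
import subprocess
import sys


def describe_returncode(returncode):
    """Translate a subprocess return code into a short description."""
    if returncode < 0:
        # Negative values mean "terminated by signal -returncode".
        return f"killed by {signal.Signals(-returncode).name}"
    if returncode == 0:
        return "ok"
    return f"exited with status {returncode}"


# Stand-in child process that sends SIGKILL to itself (POSIX only).
child = [sys.executable, "-c",
         "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"]
result = subprocess.run(child)
print(describe_returncode(result.returncode))
```

A SIGKILL that the user did not send often comes from the kernel's out-of-memory killer or from a scheduler or cgroup limit, so the system log (for example `dmesg`) may be worth checking.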


I don't think the RAM limits were exceeded. The task runs on a separate machine with lots of RAM, and the memory limit for fasterq-dump is set to 13 GB. There are no other tasks running on the machine, and there is a lot of free memory left.


The docs for fasterq-dump say it can use up to three times as much RAM as the limit you set.

In general, both fastq-dump and fasterq-dump are badly written programs that commonly cause all manner of odd errors and problems.


Hi. If I have understood the log correctly, the scratch space (tmp) is in your home directory. There is a parameter (I think -t) that can be used to specify a different scratch location explicitly. Have you tried that already?
