Question

Can't convert SRA to FASTQ with fasterq-dump. fasterq-dump was killed (signal 9 SIGKILL)

16 months ago

hazirliver ▴ 10

Hi! I'm trying to convert .SRA files to .fastq with the fasterq-dump tool, but every file fails with the same error: fasterq-dump was killed (signal 9 SIGKILL). As an example, the processing logs for SRX2481673/SRR5164647.sra and SRX2481676/SRR5164650.sra are shown below; I get exactly the same error on the other files.

INFO - SRX2481673, SRX2481676 will be processed
INFO - Processing SRX2481673/SRR5164647.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size     : 1,073,741,824 bytes
INFO - mem-limit    : 13,958,643,712 bytes
INFO - threads      : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.74/'
INFO - total ram    : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode   : on
INFO - output-file  : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - output-dir   : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/'
INFO - output       : '/home/asokolov/osrp/tmp/FASTQs/SRX2481673/SRR5164647.fastq'
INFO - append-mode  : 'NO'
INFO - stdout-mode  : 'NO'
INFO - seq-defline  : '@$ac.$si $sn length=$rl'
INFO - qual-defline  : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned   : 'NO'
INFO - accession     : 'SRR5164647'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481673/SRR5164647.sra'
INFO - est. output          : 21,238,500,400 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS)      : 1,462,599,024,640 bytes
INFO - disk-limit-tmp (OS)  : 1,462,599,024,640 bytes
INFO - out/tmp on same fs   : 'NO'
INFO - 
INFO - SRR5164647 is local
INFO - ... has a size of 3,611,444,133 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 39,822,941
INFO - SEQ.spot_count = 39,822,941
INFO - SEQ.total_base_count = 7,831,644,330
INFO - SEQ.bio_base_count = 7,831,644,330
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 77,821,594
INFO - ALIGN.spot_count = 77,821,594
INFO - ALIGN.total_base_count = 7,651,115,820
INFO - ALIGN.bio_base_count = 7,651,115,820
INFO - 
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)
INFO - Processing SRX2481676/SRR5164650.sra
INFO - cursor-cache : 1,073,741,824 bytes
INFO - buf-size     : 1,073,741,824 bytes
INFO - mem-limit    : 13,958,643,712 bytes
INFO - threads      : 3
INFO - scratch-path : '/home/asokolov/osrp/tmp/tmp_files/fasterq.tmp.fasterq-dump-instance-1-1544c93d1a6045778cd95824bb59b763.130/'
INFO - total ram    : 810,184,228,864 bytes
INFO - output-format: FASTQ split file
INFO - check-mode   : on
INFO - output-file  : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - output-dir   : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/'
INFO - output       : '/home/asokolov/osrp/tmp/FASTQs/SRX2481676/SRR5164650.fastq'
INFO - append-mode  : 'NO'
INFO - stdout-mode  : 'NO'
INFO - seq-defline  : '@$ac.$si $sn length=$rl'
INFO - qual-defline  : '+$ac.$si $sn length=$rl'
INFO - only-unaligned : 'NO'
INFO - only-aligned   : 'NO'
INFO - accession     : 'SRR5164650'
INFO - accession-path: '/home/asokolov/osrp/tmp/SRAs/SRX2481676/SRR5164650.sra'
INFO - est. output          : 16,726,708,244 bytes
INFO - disk-limit-tmp input : 128,849,018,880 bytes
INFO - disk-limit (OS)      : 1,462,598,959,104 bytes
INFO - disk-limit-tmp (OS)  : 1,462,598,959,104 bytes
INFO - out/tmp on same fs   : 'NO'
INFO - 
INFO - SRR5164650 is local
INFO - ... has a size of 2,857,778,130 bytes
INFO - ... is cSRA with alignments
INFO - ... SEQ has NAME column = YES
INFO - ... SEQ has SPOT_GROUP column = YES
INFO - ... uses 'SEQUENCE' as sequence-table
INFO - SEQ.first_row = 1
INFO - SEQ.row_count = 31,376,520
INFO - SEQ.spot_count = 31,376,520
INFO - SEQ.total_base_count = 6,166,997,722
INFO - SEQ.bio_base_count = 6,166,997,722
INFO - SEQ.avg_name_len = 1
INFO - SEQ.avg_spot_group_len = 7
INFO - SEQ.avg_bio_reads_per_spot = 2
INFO - SEQ.avg_tech_reads_per_spot = 0
INFO - ALIGN.first_row = 1
INFO - ALIGN.row_count = 61,510,763
INFO - ALIGN.spot_count = 61,510,763
INFO - ALIGN.total_base_count = 6,043,540,736
INFO - ALIGN.bio_base_count = 6,043,540,736
INFO - 
INFO - disk-limit(s) not exeeded!
INFO - fasterq-dump was killed (signal 9 SIGKILL)

I run fasterq-dump with the following arguments:

bufsize='1G',
curcache='1G',
mem='13G',
threads=3,
disk-limit-tmp = '120G'

What could be the problem and how can I fix it?

fastq fasterq-dump • 2.1k views

ADD COMMENT • link updated 16 months ago by GenoMax 141k • written 16 months ago by hazirliver ▴ 10

Save yourself the trouble and download the FASTQ files directly using:

#!/usr/bin/env bash
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR516/007/SRR5164647/SRR5164647_1.fastq.gz -o SRR5164647_GSM2452298_Donor1_CD33_CAR_Homo_sapiens_RNA-Seq_1.fastq.gz
curl -L ftp://ftp.sra.ebi.ac.uk/vol1/fastq/SRR516/007/SRR5164647/SRR5164647_2.fastq.gz -o SRR5164647_GSM2452298_Donor1_CD33_CAR_Homo_sapiens_RNA-Seq_2.fastq.gz

This script was generated by https://sra-explorer.info. For usage instructions, see the post "sra-explorer : find SRA and FastQ download URLs in a couple of clicks".

ADD REPLY • link 16 months ago by GenoMax 141k

I don't think this is a good solution for me, because my script receives the list of SRXs to process dynamically.

ADD REPLY • link 16 months ago by hazirliver ▴ 10

You can use ffq - https://github.com/pachterlab/ffq - to dynamically obtain the download URLs from SRXs.

ADD REPLY • link 16 months ago by dsull ★ 5.8k
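
The ffq suggestion above could be wired into a script roughly as follows. This is a minimal sketch, not ffq's documented workflow: it assumes that `ffq --ftp` prints JSON in which each file record carries a `"url": "..."` field, and `extract_urls` is a hypothetical helper name introduced here. The example URLs in any test data are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: pull download URLs out of ffq's JSON output.
# Assumption: each file record contains a field of the form  "url": "ftp://..."

# Read JSON on stdin and print one URL per line.
extract_urls() {
    grep -o '"url": *"[^"]*"' | sed 's/^"url": *"//; s/"$//'
}

# Hypothetical usage (needs ffq, curl, and network access):
#   for srx in SRX2481673 SRX2481676; do
#       ffq --ftp "$srx" | extract_urls | while read -r url; do
#           curl -L "$url" -O
#       done
#   done
```

The text-based extraction keeps the sketch free of extra dependencies; a JSON-aware tool would be more robust if the output layout differs from what is assumed here.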

Thank you! I think this is the best solution for my task.

ADD REPLY • link 16 months ago by hazirliver ▴ 10

Wanted to mention it in case you were able to use it.

Signal 9 (SIGKILL) indicates immediate process termination. Since the message above says that the disk limits were not exceeded, are you exceeding any other resource allocation for your account (e.g. RAM)?

ADD REPLY • link 16 months ago by GenoMax 141k
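
One way to follow up on the resource-limit question is to confirm how the process ended and then look at the limits that apply to the account. The sketch below relies on the shell convention that a process killed by signal N is reported with exit status 128+N; `signal_from_status` is a hypothetical helper, and the commented commands are common places to look rather than a complete diagnosis.

```shell
#!/usr/bin/env bash
# Sketch: decode an exit status and list the usual SIGKILL suspects.

# Shells report "killed by signal N" as exit status 128+N.
signal_from_status() {
    if [ "$1" -gt 128 ]; then
        echo $(( $1 - 128 ))
    else
        echo 0
    fi
}

# Demonstration: a child shell that sends SIGKILL to itself.
status=0
sh -c 'kill -9 $$' 2>/dev/null || status=$?
echo "exit status $status -> signal $(signal_from_status "$status")"

# Per-process virtual memory limit for this shell (often "unlimited").
ulimit -v

# Other places to check on the machine that ran fasterq-dump:
#   dmesg | grep -i 'killed process'   # kernel OOM-killer messages (may need root)
#   cat /sys/fs/cgroup/memory.max      # cgroup v2 memory cap, if that path exists
```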

I don't think the RAM limits were exceeded. This task runs on a separate machine with lots of RAM. The memory limit for fasterq-dump is set to 13 GB, no other tasks are running on the machine, and there is a lot of free memory left.

ADD REPLY • link 16 months ago by hazirliver ▴ 10

The docs for fasterq-dump say it can use up to three times as much RAM as the configured limit.

In general, both fastq-dump and fasterq-dump are badly written programs that commonly cause all manner of weird errors and problems.

ADD REPLY • link 16 months ago by Istvan Albert 100k
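
To put the "three times" remark in context, the numbers from the log can be compared directly. This is only back-of-the-envelope arithmetic: the factor of 3 is taken from the reply above as an assumption and is not verified here, while `mem_limit` and `total_ram` are copied from the `mem-limit` and `total ram` log lines.

```shell
#!/usr/bin/env bash
# Sketch: compare a worst-case memory estimate with the RAM reported in the log.

mem_limit=13958643712      # "mem-limit" from the log (13 GiB)
total_ram=810184228864     # "total ram" from the log
factor=3                   # assumed worst-case multiplier, not verified

worst_case=$(( mem_limit * factor ))
percent=$(( worst_case * 100 / total_ram ))

echo "worst case: $worst_case bytes"
echo "share of total RAM: about $percent%"
```

Under that assumption the worst case stays far below the machine's physical RAM, so if memory is involved at all, a per-user or per-job cap seems a more likely cause than the machine running out of memory.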

Hi, if I've understood the log correctly, the scratch space (tmp) is in your home directory. There is a parameter (I think -t) that can be used to explicitly specify a different scratch location. Have you tried that already?

ADD REPLY • link 16 months ago by Amitm ★ 2.2k
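
A call that sets the scratch location explicitly might look like the sketch below. It uses the long option names of fasterq-dump (`--temp` is the long form of `-t`) together with the values from the question; all three paths are hypothetical placeholders, and `build_fasterq_cmd` is a helper introduced only for this illustration. It prints the command instead of running it, so it can be inspected first.

```shell
#!/usr/bin/env bash
# Sketch: assemble a fasterq-dump call with an explicit scratch directory.
# The paths used below are hypothetical placeholders.

build_fasterq_cmd() {
    # $1 = .sra file, $2 = output directory, $3 = scratch directory
    printf 'fasterq-dump %s --outdir %s --temp %s --threads 3 --mem 13G --bufsize 1G --curcache 1G --disk-limit-tmp 120G\n' "$1" "$2" "$3"
}

# Print the command for inspection before running it by hand.
build_fasterq_cmd /data/SRAs/SRR5164647.sra /data/FASTQs /scratch/fasterq_tmp
```

Paths containing spaces would need quoting before the printed command is executed.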