Question

double free or corruption (!prev): 0x00002aacc7b6b780 *** while running featurecounts

21 months ago

qstefano ▴ 20

Hello everyone, I'm trying to run featureCounts on several BAM files, but after analysing some samples I get the same error:

*** Error in `/home/anaconda3/envs/samtools40/lib/R/bin/exec/R': double free or corruption (!prev): 0x00002aacc7b6b780 ***
======= Backtrace: =========
/lib64/libc.so.6(+0x81489)[0x2aaaab230489]
/lib64/libc.so.6(fclose+0x177)[0x2aaaab21d037]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(SAM_pairer_probe_maxfp+0x4ae)[0x2aaab09d35de]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(SAM_pairer_run_once+0x182)[0x2aaab09d55f2]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(SAM_pairer_run+0x28)[0x2aaab09d89f8]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(fc_thread_wait_threads+0x11)[0x2aaab09f79e1]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(readSummary_single_file+0x204)[0x2aaab09fbe44]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(readSummary+0xde2)[0x2aaab09fcda2]
/home/anaconda3/envs/samtools40/lib/R/library/Rsubread/libs/Rsubread.so(R_child_thread_child+0xe)[0x2aaab097c9de]
/lib64/libpthread.so.0(+0x7dd5)[0x2aaaae51edd5]
/lib64/libc.so.6(clone+0x6d)[0x2aaaab2acead]

I'm running my script on an HPC cluster, using rclone to read and write files on Google Drive. The drive was mounted as follows:

rclone mount remote: /home/gdrive/ --allow-other --buffer-size 512m  --drive-chunk-size 128M --umask 002  --vfs-read-chunk-size-limit off --daemon --use-mmap

I tried changing the rclone options without success. Do you know how to solve this? Thanks.

Counting Bash HPC RNA-Seq R • 941 views

ADD COMMENT • link updated 21 months ago by ATpoint 82k • written 21 months ago by qstefano ▴ 20

Reading from and writing to Google Drive is likely causing this. Have you tried downloading a couple of files locally? featureCounts is a stable program and works fine. If you are working on an HPC cluster, why are you using a hacky solution like this?
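One way to try this is to copy a BAM file off the remote with `rclone copy`, check it with `samtools quickcheck`, and count it from local disk. The sketch below is only an illustration: it assumes the standalone `featureCounts` command-line tool rather than Rsubread, and the function name, paths and thread count are hypothetical.

```shell
# Sketch: copy one BAM from the rclone remote to local scratch and count it there.
# Assumptions (not from the original post): the function name, the argument
# order, and the standalone featureCounts CLI instead of Rsubread.
count_local_copy() {
    remote_bam=$1   # e.g. remote:bams/sample1.bam (hypothetical path)
    scratch=$2      # local directory on the HPC node
    gtf=$3          # annotation file, already on local disk
    name=${remote_bam##*[:/]}   # file name without remote and directory parts

    mkdir -p "$scratch" || return 1
    # rclone copy places the source file inside the destination directory
    rclone copy "$remote_bam" "$scratch" || return 1
    # quickcheck fails on a truncated BAM (missing EOF block or bad header)
    samtools quickcheck "$scratch/$name" || return 1
    featureCounts -T 4 -a "$gtf" -o "$scratch/$name.counts.txt" "$scratch/$name"
}
```

Called as `count_local_copy remote:bams/sample1.bam /scratch/test annotation.gtf`, a clean run on the local copy would point at the mount rather than at featureCounts.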

ADD REPLY • link 21 months ago by GenoMax 141k

Yes, featureCounts works perfectly on my local machine, but I have a very large amount of data (19 TB), and it is stored on Google Drive.

ADD REPLY • link 21 months ago by qstefano ▴ 20

Perhaps consider doing this in Google Cloud using a VM running featureCounts. Even then, reading 19 TB of data is going to be a problem with a remote mount. At least copying data may not be so bad in the cloud, if you can work on a few files at a time. These must be hundreds of samples, so you will end up with a gigantic matrix once you collate everything.
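Working on a few files at a time could look roughly like the loop below, which fetches one BAM, counts it, and deletes the local copy before fetching the next. This is a sketch under assumptions: the function name and directory layout are hypothetical, and it uses the standalone `featureCounts` command-line tool rather than Rsubread.

```shell
# Sketch: process the remote one BAM at a time so local disk use stays bounded.
# Assumptions: function name, remote layout, standalone featureCounts CLI.
count_in_turn() {
    remote_dir=$1   # e.g. remote:bams (hypothetical)
    scratch=$2      # local staging directory
    gtf=$3          # annotation file on local disk
    outdir=$4       # where the per-sample count tables are kept
    mkdir -p "$scratch" "$outdir" || return 1

    # rclone lsf prints one entry per line; keep only the BAM files
    rclone lsf --files-only --include "*.bam" "$remote_dir" |
    while IFS= read -r name; do
        # </dev/null stops the commands from consuming the file list on stdin
        rclone copy "$remote_dir/$name" "$scratch" < /dev/null || continue
        featureCounts -T 4 -a "$gtf" -o "$outdir/$name.counts.txt" \
            "$scratch/$name" < /dev/null
        rm -f "$scratch/$name"   # free the space before fetching the next file
    done
}
```

Only the small count tables accumulate in `outdir`, so the staging directory never needs to hold more than one BAM.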

ADD REPLY • link 21 months ago by GenoMax 141k

Having said that, plan your analysis well and build a good pipeline. It might even make sense to run featureCounts either on chunks or on each file separately. featureCounts only returns the result if everything finished properly, so if it crashes 99% of the way through, all is lost. When running on each file separately, you can relatively easily build a count matrix by pasting the individual components together, and resume counting if some files fail. It is big data, so some parallelization and pipeline control make sense. I personally like Nextflow for these kinds of things.
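As a rough illustration of the per-file approach, the sketch below counts each BAM on its own, skips samples whose output already exists, and then pastes the count columns into one matrix. The function names and file layout are hypothetical, the standalone `featureCounts` command-line tool is assumed rather than Rsubread, and the merge assumes every sample was counted against the same annotation so the gene order matches.

```shell
# Sketch: resumable per-file counting followed by a column-wise merge.
# Assumptions: function names, file layout, standalone featureCounts CLI.
count_each() {
    gtf=$1; outdir=$2; shift 2
    mkdir -p "$outdir" || return 1
    for bam in "$@"; do
        out="$outdir/$(basename "$bam" .bam).counts.txt"
        [ -s "$out" ] && continue    # already counted in an earlier run
        # drop the output of a failed run so the sample is retried next time
        featureCounts -T 4 -a "$gtf" -o "$out" "$bam" || rm -f "$out"
    done
}

merge_counts() {
    # featureCounts output: line 1 is a '#' comment, line 2 the header,
    # column 1 the gene ID, column 7 the counts for the single input BAM.
    merged=$(mktemp) || return 1
    next=$(mktemp) || return 1
    first=yes
    for f in "$@"; do
        if [ "$first" = yes ]; then
            grep -v '^#' "$f" | cut -f 1,7 > "$merged"
            first=no
        else
            grep -v '^#' "$f" | cut -f 7 | paste "$merged" - > "$next"
            mv "$next" "$merged"
        fi
    done
    cat "$merged"
    rm -f "$merged" "$next"
}
```

For example, `count_each annotation.gtf counts /scratch/*.bam` followed by `merge_counts counts/*.counts.txt > matrix.tsv`; rerunning `count_each` after a crash only processes the samples that are still missing.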

ADD REPLY • link 21 months ago by ATpoint 82k

I agree with @genomax. Generally, these types of I/O-heavy tasks should be done with the data at the location of the processing. I am even surprised that the HPC staff allow you to mount external remotes. Isn't that a security vulnerability?

ADD REPLY • link 21 months ago by ATpoint 82k