Question

RNA-SeQC memory error

7.5 years ago

nicholas.owen1 • 0

Dear All,

I have installed RNA-SeQC (version 1.1.8) on our cluster and have been trying to run it on 5-6 GB BAM files of human RNA-seq data.

About the BAM files: human paired-end reads, aligned to the hg38 genome build with STAR, with read groups added, then sorted and indexed.

The command I run for RNA-SeQC is essentially:

/share/apps/jdk1.7.0_71/bin/java -Xmx60g -jar /user/tools/RNA-SeQC/RNA-SeQC_v1.1.8.jar \
  -bwa /user/bwa-0.7.10/bwa \
  -BWArRNA /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/rRNA.gtf \
  -t /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Annotation/Archives/archive-2015-08-14-08-18-15/Genes/genes.gtf \
  -r /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.fa \
  -s /user/bam/bam_samples.txt \
  -o /user/bam/RNA-SeQC

bam_samples.txt currently lists only one BAM file: I was getting the same error with many files, so I am trying to get it working with a single one first.

From the output, the job appears to run fine up to a certain stage and then fails with a memory error. I have given the cluster job anywhere from 16 GB to 128 GB of memory, with no luck.

The error I am getting is:

RNA-SeQC v1.1.8.1 07/11/14
Retriving contig names from reference
     contig names in reference: 195
Loading GTF for Read Counting
Converting to refGene
Transcript objects to RefGen format:    1 s
Running IntronicExpressionReadBlock Walker ....
Arguments: [-T, IntronicExpressionReadBlock, --outfile_metrics, /user/bam/RNA-SeQC/NC101/NC101.metrics.tmp.txt, -R, /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.fa, -I, /user/bam/NC101/NC101_unique.RG.bam, -refseq, /user/bam/RNA-SeQC/refGene.txt, -l, ERROR]
Finished writing /user/bam/RNA-SeQC/NC101/NC101.metrics.tmp.txt.intronReport.txt
Finished writing /user/bam/RNA-SeQC/NC101/NC101.metrics.tmp.txt.intronReport.txt_intronOnly.txt, now creating RPKM values for introns ..
GATK command result code: 0
     ... GATK CoutReadMetrics Analysis DONE
CountReadMetricsWalker Runtime: 12 min
Counting rRNA reads with BWA and /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/rRNA.gtf
Downsampling before aligning at rate: 0.009106594123052198
INFO    2016-09-15 21:23:02 DownsampleSam   Read 10000000 reads, kept 91285
INFO    2016-09-15 21:23:34 DownsampleSam   Read 20000000 reads, kept 182434
INFO    2016-09-15 21:23:56 DownsampleSam   Read 30000000 reads, kept 272798
INFO    2016-09-15 21:24:18 DownsampleSam   Read 40000000 reads, kept 364148
INFO    2016-09-15 21:24:39 DownsampleSam   Read 50000000 reads, kept 455860
INFO    2016-09-15 21:25:01 DownsampleSam   Read 60000000 reads, kept 547155
INFO    2016-09-15 21:25:24 DownsampleSam   Read 70000000 reads, kept 639031
INFO    2016-09-15 21:25:50 DownsampleSam   Read 80000000 reads, kept 730499
INFO    2016-09-15 21:26:19 DownsampleSam   Read 90000000 reads, kept 822225
INFO    2016-09-15 21:26:41 DownsampleSam   Read 100000000 reads, kept 912608
INFO    2016-09-15 21:27:10 DownsampleSam   Finished! Kept 1001492 out of 109810538 reads.
Downsampling exited with code: 0
BWA on end 1
Running BWA on /user/bam/RNA-SeQC/NC101/dSample.bam
Command: [/user/bwa-0.7.10/bwa, aln, /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/rRNA.gtf, -b1, /user/bam/RNA-SeQC/NC101/dSample.bam]
#
# There is insufficient memory for the Java Runtime Environment to continue.
# pthread_getattr_np
/opt/gridengine/default/spool/lum-7-13/job_scripts/246455: line 28: 57179 Aborted                 /share/apps/jdk1.7.0_71/bin/java -Xmx61440M -jar /user/tools/RNA-SeQC/RNA-SeQC_v1.1.8.jar -bwa /user/bwa-0.7.10/bwa -BWArRNA /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/rRNA.gtf -t /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Annotation/Archives/archive-2015-08-14-08-18-15/Genes/genes.gtf -r /user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/genome.fa -s /user/bam/bam_samples.txt -o /user/bam/RNA-SeQC

Apologies for the length of the post but I wanted to get as much useful information in place.

If anyone has any suggestions or advice it would be greatly appreciated!

Thanks :)

RNA-Seq • 2.5k views

We love a lot of useful information in the first post, so don't worry about that. Note that the failure occurs at the bwa step, so I'm not sure that giving the Java process more heap space will make a difference. Perhaps you could run bwa separately to isolate the problem?
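
One way to isolate the bwa step is to rerun, outside RNA-SeQC, the same alignment that appears in the log. The sketch below reuses the paths from the `Command:` line of the log. The output file name is a made-up placeholder, and wrapping the call in GNU `/usr/bin/time -v` (to report peak memory) is an assumption about what is installed on the node, not something RNA-SeQC does itself.

```shell
#!/bin/sh
# Paths copied from the "Command:" line of the RNA-SeQC log; adjust as needed.
BWA=/user/bwa-0.7.10/bwa
REF=/user/ref_genome/hg38_ucsc/Homo_sapiens/UCSC/hg38/Sequence/WholeGenomeFasta/rRNA.gtf
BAM=/user/bam/RNA-SeQC/NC101/dSample.bam
# Hypothetical output file for the alignment (.sai) of read end 1.
OUT=${TMPDIR:-/tmp}/dSample.end1.sai

# Print the bwa command so it can be inspected before anything runs.
bwa_cmd() {
    echo "$BWA aln -b1 -f $OUT $REF $BAM"
}

echo "Command: $(bwa_cmd)"

if [ -x "$BWA" ] && [ -r "$BAM" ]; then
    # -b: input is BAM, -1: use the first read of each pair, -f: output file.
    # GNU time -v reports peak memory as "Maximum resident set size".
    /usr/bin/time -v "$BWA" aln -b1 -f "$OUT" "$REF" "$BAM"
    echo "bwa exit code: $?"
else
    echo "bwa or the downsampled BAM was not found; nothing was run."
fi
```

If bwa finishes cleanly when run this way, that would suggest the failure is tied to the environment the Java process runs in rather than to bwa itself.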

7.5 years ago by WouterDeCoster

Thanks for the advice. I have tried running BWA on its own and everything is fine there, so I will look further into the memory allocated to Java. Thanks again.

7.5 years ago by nicholas.owen1

This may be an obvious question, but is the Java you are using 64-bit? Can you also check whether ulimit -a shows any limits on your account?
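
Both checks can be done from a shell on the cluster. A minimal sketch, assuming the JDK path from the command in the question and a HotSpot JVM, whose `-version` banner normally includes the string `64-Bit` on 64-bit builds:

```shell
#!/bin/sh
# JDK path taken from the RNA-SeQC command in the question.
JAVA=/share/apps/jdk1.7.0_71/bin/java

# Report whether the JVM banner mentions a 64-bit VM.
# "java -version" writes its banner to stderr, hence the 2>&1.
java_bitness() {
    if [ ! -x "$1" ]; then
        echo "not found"
    elif "$1" -version 2>&1 | grep -q '64-Bit'; then
        echo "64-bit"
    else
        echo "no 64-Bit marker in banner (possibly 32-bit)"
    fi
}

echo "Java ($JAVA): $(java_bitness "$JAVA")"

# List every resource limit that applies to the current shell.
ulimit -a
```

Limits on a login node can differ from those inside a batch job, so it may be worth running the same lines from a job script as well.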

7.5 years ago by GenoMax

It looks like you're running this on a cluster. Can you send this error to the cluster admin and ask whether there is an oddly small stack size limit on some or all of the nodes (you don't need to know what that means)? My guess from the error message is that a limit of that kind is being hit.
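
To give the admin something concrete, a small job can print the limits that actually apply on a compute node. This is a sketch only: the `#$` lines assume a Grid Engine scheduler (the error message shows a gridengine spool path), and the script reports values without judging what counts as too small.

```shell
#!/bin/sh
#$ -cwd
#$ -j y

# Print one limit as "label: soft=<value> hard=<value>".
# $1 is a label, $2 is the ulimit flag (-s = stack size, -v = virtual memory).
report_limit() {
    echo "$1: soft=$(ulimit -S "$2") hard=$(ulimit -H "$2")"
}

echo "node: $(uname -n)"
report_limit "stack size (KB)" -s
report_limit "virtual memory (KB)" -v
```

Submitted with `qsub`, the resulting output file can be forwarded to the admin alongside the RNA-SeQC error.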

7.5 years ago by Devon Ryan

Thanks, I did, but there is no resolution from them yet. I will follow up next week and post an update :)

7.5 years ago by nicholas.owen1