Question

TopHat runs out of memory at "Reporting output tracks": how to solve?

6.4 years ago

zoukai3412085 • 0

The run log is as follows:

[2017-11-27 14:14:14] Beginning TopHat run (v2.1.1)
-----------------------------------------------
[2017-11-27 14:14:14] Checking for Bowtie
          Bowtie version:    2.3.3.1
[2017-11-27 14:14:14] Checking for Bowtie index files (transcriptome)..
[2017-11-27 14:14:14] Checking for Bowtie index files (genome)..
[2017-11-27 14:14:14] Checking for reference FASTA file
[2017-11-27 14:14:14] Generating SAM header for Ginkgo_biloba
[2017-11-27 14:15:43] Reading known junctions from GTF file
[2017-11-27 14:15:44] Preparing reads
     left reads: min. length=150, max. length=150, 25625428 kept reads (0 discarded)
    right reads: min. length=150, max. length=150, 25625428 kept reads (0 discarded)
[2017-11-27 14:45:59] Using pre-built transcriptome data..
[2017-11-27 14:46:00] Mapping left_kept_reads to transcriptome known with Bowtie2 
[2017-11-27 15:08:09] Mapping right_kept_reads to transcriptome known with Bowtie2 
[2017-11-27 15:33:07] Resuming TopHat pipeline with unmapped reads
[2017-11-27 15:33:07] Mapping left_kept_reads.m2g_um to genome Ginkgo_biloba with Bowtie2 
[2017-11-27 15:56:42] Mapping left_kept_reads.m2g_um_seg1 to genome Ginkgo_biloba with Bowtie2 (1/6)
[2017-11-27 16:03:20] Mapping left_kept_reads.m2g_um_seg2 to genome Ginkgo_biloba with Bowtie2 (2/6)
[2017-11-27 16:10:41] Mapping left_kept_reads.m2g_um_seg3 to genome Ginkgo_biloba with Bowtie2 (3/6)
[2017-11-27 16:18:04] Mapping left_kept_reads.m2g_um_seg4 to genome Ginkgo_biloba with Bowtie2 (4/6)
[2017-11-27 16:25:23] Mapping left_kept_reads.m2g_um_seg5 to genome Ginkgo_biloba with Bowtie2 (5/6)
[2017-11-27 16:32:36] Mapping left_kept_reads.m2g_um_seg6 to genome Ginkgo_biloba with Bowtie2 (6/6)
[2017-11-27 16:39:58] Mapping right_kept_reads.m2g_um to genome Ginkgo_biloba with Bowtie2 
[2017-11-27 17:07:30] Mapping right_kept_reads.m2g_um_seg1 to genome Ginkgo_biloba with Bowtie2 (1/6)
[2017-11-27 17:15:21] Mapping right_kept_reads.m2g_um_seg2 to genome Ginkgo_biloba with Bowtie2 (2/6)
[2017-11-27 17:24:19] Mapping right_kept_reads.m2g_um_seg3 to genome Ginkgo_biloba with Bowtie2 (3/6)
[2017-11-27 17:32:52] Mapping right_kept_reads.m2g_um_seg4 to genome Ginkgo_biloba with Bowtie2 (4/6)
[2017-11-27 17:41:37] Mapping right_kept_reads.m2g_um_seg5 to genome Ginkgo_biloba with Bowtie2 (5/6)
[2017-11-27 17:50:28] Mapping right_kept_reads.m2g_um_seg6 to genome Ginkgo_biloba with Bowtie2 (6/6)
[2017-11-27 17:59:12] Retrieving sequences for splices
[2017-11-27 18:08:05] Indexing splices
Building a SMALL index
[2017-11-27 18:08:14] Mapping left_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/6)
[2017-11-27 18:08:43] Mapping left_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/6)
[2017-11-27 18:09:17] Mapping left_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/6)
[2017-11-27 18:09:50] Mapping left_kept_reads.m2g_um_seg4 to genome segment_juncs with Bowtie2 (4/6)
[2017-11-27 18:10:25] Mapping left_kept_reads.m2g_um_seg5 to genome segment_juncs with Bowtie2 (5/6)
[2017-11-27 18:11:00] Mapping left_kept_reads.m2g_um_seg6 to genome segment_juncs with Bowtie2 (6/6)
[2017-11-27 18:11:33] Joining segment hits
[2017-11-27 18:52:54] Mapping right_kept_reads.m2g_um_seg1 to genome segment_juncs with Bowtie2 (1/6)
[2017-11-27 18:53:28] Mapping right_kept_reads.m2g_um_seg2 to genome segment_juncs with Bowtie2 (2/6)
[2017-11-27 18:54:07] Mapping right_kept_reads.m2g_um_seg3 to genome segment_juncs with Bowtie2 (3/6)
[2017-11-27 18:54:45] Mapping right_kept_reads.m2g_um_seg4 to genome segment_juncs with Bowtie2 (4/6)
[2017-11-27 18:55:24] Mapping right_kept_reads.m2g_um_seg5 to genome segment_juncs with Bowtie2 (5/6)
[2017-11-27 18:56:05] Mapping right_kept_reads.m2g_um_seg6 to genome segment_juncs with Bowtie2 (6/6)
[2017-11-27 18:56:44] Joining segment hits
[2017-11-27 19:23:59] Reporting output tracks
    [FAILED]
Error running /home/bio-501/zoukai/software/tophat-2.1.1.Linux_x86_64/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir GLPC1/ --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p40 --inner-dist-mean 50 --inner-dist-std-dev 20 --gtf-annotations transcriptome_data/known.gff --gtf-juncs GLPC1/tmp/known.juncs --no-closure-search --no-coverage-search --no-microexon-search --sam-header GLPC1/tmp/Ginkgo_biloba_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/home/bio-501/zoukai/software/tophat-2.1.1.Linux_x86_64/samtools_0.1.18 --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 Ginkgo_biloba.fa GLPC1/junctions.bed GLPC1/insertions.bed GLPC1/deletions.bed GLPC1/fusions.out GLPC1/tmp/accepted_hits GLPC1/tmp/left_kept_reads.m2g.bam,GLPC1/tmp/left_kept_reads.m2g_um.mapped.bam,GLPC1/tmp/left_kept_reads.m2g_um.candidates GLPC1/tmp/left_kept_reads.bam GLPC1/tmp/right_kept_reads.m2g.bam,GLPC1/tmp/right_kept_reads.m2g_um.mapped.bam,GLPC1/tmp/right_kept_reads.m2g_um.candidates GLPC1/tmp/right_kept_reads.bam
Loaded 134533 junctions

I searched online and found that this failure happens because the run is out of memory. The "Reporting output tracks" step takes too much memory, more than 512 GB. My reference genome is 9.98 GB, and the left and right FASTQ files are 8.65 GB each. My Linux server has 512 GB of memory, and I set 40 threads. How can I solve this problem? Would reducing the number of threads help, or are there other methods?
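Before changing any settings, it may help to confirm that memory really is the limit by rerunning the failing step under a small wrapper that records peak resident memory. The sketch below is only an illustration and not part of TopHat: `run_and_report_peak` is a hypothetical helper, it uses Python's Unix-only `resource` module, and it assumes Linux, where `ru_maxrss` is reported in kilobytes.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd, wait for it, and return (exit_code, peak_rss_gib).

    Hypothetical helper. For RUSAGE_CHILDREN, ru_maxrss is the resident
    set size of the largest waited-for child process. The unit is
    kilobytes on Linux (assumption: this runs on Linux).
    """
    completed = subprocess.run(cmd)
    peak_kb = resource.getrusage(resource.RUSAGE_CHILDREN).ru_maxrss
    return completed.returncode, peak_kb / (1024 ** 2)


if __name__ == "__main__":
    # Stand-in command; replace it with the failing tophat_reports call
    # copied from the log above.
    code, peak_gib = run_and_report_peak([sys.executable, "-c", "pass"])
    print(f"exit code {code}, peak RSS {peak_gib:.3f} GiB")
```

A negative exit code such as -9 means the child was killed by a signal (here SIGKILL), which would be consistent with the kernel's out-of-memory killer, though the system log is the place to confirm that.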

RNA-Seq tophat • 1.7k views

updated 6.4 years ago by WouterDeCoster 47k • written 6.4 years ago by zoukai3412085 • 0

Not an answer to your question, but you should know that the old 'Tuxedo' pipeline of TopHat and Cufflinks is no longer the recommended toolset for RNA-seq analysis. The software is deprecated and in low maintenance, and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using Salmon.
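For orientation, a paired-end HISAT2 run could be launched roughly as sketched below. This is a minimal sketch under assumptions: `build_hisat2_cmd` is a hypothetical helper, and the index prefix, file names, and thread count are placeholders rather than values from this thread. The options used (`-p`, `--dta`, `-x`, `-1`, `-2`, `-S`) are HISAT2 command-line options, and the index itself would first have to be built with `hisat2-build`.

```python
import shlex


def build_hisat2_cmd(index_prefix, left_fq, right_fq, out_sam, threads=8):
    """Return the argument list for a paired-end HISAT2 run.

    Hypothetical helper; all paths are placeholders supplied by the caller.
    """
    return [
        "hisat2",
        "-p", str(threads),   # number of alignment threads
        "--dta",              # report alignments tailored for StringTie
        "-x", index_prefix,   # prefix of the index built with hisat2-build
        "-1", left_fq,        # left (mate 1) reads
        "-2", right_fq,       # right (mate 2) reads
        "-S", out_sam,        # SAM output file
    ]


if __name__ == "__main__":
    cmd = build_hisat2_cmd(
        "Ginkgo_biloba_index", "left.fastq", "right.fastq", "GLPC1.sam"
    )
    # Print only; pass cmd to subprocess.run(cmd, check=True) to execute.
    print(shlex.join(cmd))
```

Building the command as a list and printing it first makes it easy to inspect the call before committing to a run that takes hours.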

I also added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button. When you compose or edit a post, that button is in your toolbar; see the image below:

[Image: 101010 button]

6.4 years ago by WouterDeCoster 47k