Question: TopHat Failed In 'Reporting output tracks' Stage
ivan.molodtsov (Russia, Moscow) wrote 3.0 years ago:

Hello everyone!

I've run into the following problem while working with ENCODE Project data; maybe someone here can give me some advice.

I downloaded the raw sequencing data for library ENCLB555APY from https://www.encodeproject.org/experiments/ENCSR000CPY/ and tried to map it to the human genome downloaded from ftp://igenome:G3nom3s4u@ussd-ftp.illumina.com/Homo_sapiens/UCSC/hg38/Homo_sapiens_UCSC_hg38.tar.gz.

I then ran TopHat v2.1.0 as follows:

tophat2 -p 8 --b2-very-sensitive -o tophat_res/ENCLB555APY/ Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome  ENCFF000HGG.fastq.gz ENCFF000HHF.fastq.gz

It failed with the following tophat.log, ending in an error that I could not find anything about by searching online:

[2016-12-03 17:27:10] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2016-12-03 17:27:10] Checking for Bowtie
                  Bowtie version:        2.2.6.0
[2016-12-03 17:27:10] Checking for Bowtie index files (genome)..
[2016-12-03 17:27:10] Checking for reference FASTA file
[2016-12-03 17:27:10] Generating SAM header for Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome
[2016-12-03 17:27:12] Preparing reads
         left reads: min. length=76, max. length=76, 133266131 kept reads (283399 discarded)
        right reads: min. length=76, max. length=76, 133088200 kept reads (461330 discarded)
[2016-12-03 18:28:22] Mapping left_kept_reads to genome genome with Bowtie2
[2016-12-04 03:31:38] Mapping left_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2016-12-04 03:42:39] Mapping left_kept_reads_seg2 to genome genome with Bowtie2 (2/3)
[2016-12-04 03:53:31] Mapping left_kept_reads_seg3 to genome genome with Bowtie2 (3/3)
[2016-12-04 04:07:24] Mapping right_kept_reads to genome genome with Bowtie2
[2016-12-04 12:42:21] Mapping right_kept_reads_seg1 to genome genome with Bowtie2 (1/3)
[2016-12-04 13:04:21] Mapping right_kept_reads_seg2 to genome genome with Bowtie2 (2/3)
[2016-12-04 13:24:26] Mapping right_kept_reads_seg3 to genome genome with Bowtie2 (3/3)
[2016-12-04 13:48:17] Searching for junctions via segment mapping
[2016-12-04 14:04:52] Retrieving sequences for splices
[2016-12-04 14:06:20] Indexing splices
[2016-12-04 14:06:53] Mapping left_kept_reads_seg1 to genome segment_juncs with Bowtie2 (1/3)
[2016-12-04 14:10:07] Mapping left_kept_reads_seg2 to genome segment_juncs with Bowtie2 (2/3)
[2016-12-04 14:13:25] Mapping left_kept_reads_seg3 to genome segment_juncs with Bowtie2 (3/3)
[2016-12-04 14:16:18] Joining segment hits
[2016-12-04 14:20:55] Mapping right_kept_reads_seg1 to genome segment_juncs with Bowtie2 (1/3)
[2016-12-04 14:29:53] Mapping right_kept_reads_seg2 to genome segment_juncs with Bowtie2 (2/3)
[2016-12-04 14:36:11] Mapping right_kept_reads_seg3 to genome segment_juncs with Bowtie2 (3/3)
[2016-12-04 14:40:37] Joining segment hits
[2016-12-04 14:47:37] Reporting output tracks
        [FAILED]
Error running /usr/bin/tophat_reports --min-anchor 8 --splice-mismatches 0 --min-report-intron 50 --max-report-intron 500000 --min-isoform-fraction 0.15 --output-dir tophat_res/ENCLB555APY// --max-multihits 20 --max-seg-multihits 40 --segment-length 25 --segment-mismatches 2 --min-closure-exon 100 --min-closure-intron 50 --max-closure-intron 5000 --min-coverage-intron 50 --max-coverage-intron 20000 --min-segment-intron 50 --max-segment-intron 500000 --read-mismatches 2 --read-gap-length 2 --read-edit-dist 2 --read-realign-edit-dist 3 --max-insertion-length 3 --max-deletion-length 3 -z gzip -p8 --inner-dist-mean 50 --inner-dist-std-dev 20 --no-closure-search --no-coverage-search --no-microexon-search --sam-header tophat_res/ENCLB555APY//tmp/genome_genome.bwt.samheader.sam --report-discordant-pair-alignments --report-mixed-alignments --samtools=/usr/bin/samtools_0.1.18 --bowtie2-max-penalty 6 --bowtie2-min-penalty 2 --bowtie2-penalty-for-N 1 --bowtie2-read-gap-open 5 --bowtie2-read-gap-cont 3 --bowtie2-ref-gap-open 5 --bowtie2-ref-gap-cont 3 Homo_sapiens/UCSC/hg38/Sequence/Bowtie2Index/genome.fa tophat_res/ENCLB555APY//junctions.bed tophat_res/ENCLB555APY//insertions.bed tophat_res/ENCLB555APY//deletions.bed tophat_res/ENCLB555APY//fusions.out tophat_res/ENCLB555APY//tmp/accepted_hits tophat_res/ENCLB555APY//tmp/left_kept_reads.mapped.bam,tophat_res/ENCLB555APY//tmp/left_kept_reads.candidates tophat_res/ENCLB555APY//tmp/left_kept_reads.bam tophat_res/ENCLB555APY//tmp/right_kept_reads.mapped.bam,tophat_res/ENCLB555APY//tmp/right_kept_reads.candidates tophat_res/ENCLB555APY//tmp/right_kept_reads.bam
Error: failed to retrieve right read for pair # 2037377 !

It looks like an error in the input files, but I think that is highly improbable. What have I done wrong, and is there any way to get past this error without re-running TopHat?

Thanks in advance,

Ivan

modified 23 months ago by zuolin.bai • written 3.0 years ago by ivan.molodtsov
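
One way to rule out a problem in the input files is to check that the two FASTQ files list the same reads in the same order. The log above shows equal totals for both mates (133266131 + 283399 = 133088200 + 461330 = 133549530 reads), so a record-by-record comparison of read names is likely the more informative check. The following is a minimal Python sketch and not something taken from the thread: the helper names `read_ids` and `first_mismatch` are made up for illustration, it assumes strict 4-line FASTQ records, and it is not known whether the pair number in the TopHat error corresponds to a position in the raw files.

```python
import gzip
import sys
from itertools import zip_longest


def read_ids(path):
    """Yield the read name of each 4-line FASTQ record (plain or gzipped)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_no, line in enumerate(handle):
            if line_no % 4 != 0:
                continue  # only the header line of each record is needed
            fields = line.strip()[1:].split()  # drop '@' and any trailing comment
            name = fields[0] if fields else ""
            if name.endswith(("/1", "/2")):  # drop an old-style mate suffix
                name = name[:-2]
            yield name


def first_mismatch(left_path, right_path):
    """Return (pair_number, left_name, right_name) for the first pair whose
    names differ (a missing mate shows up as None), or None if all agree."""
    pairs = zip_longest(read_ids(left_path), read_ids(right_path))
    for pair_no, (left, right) in enumerate(pairs, start=1):
        if left != right:
            return pair_no, left, right
    return None


if __name__ == "__main__":
    # Hypothetical usage: python check_pairs.py ENCFF000HGG.fastq.gz ENCFF000HHF.fastq.gz
    result = first_mismatch(sys.argv[1], sys.argv[2])
    print("all read names agree" if result is None else "first mismatch: %r" % (result,))
```

If the read names agree throughout, the input pairing is probably not the cause, and memory or a TopHat-internal problem becomes the more plausible suspect.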

Hello Ivan,

I got the same error. Could you please let me know how to fix it, or what causes it?

Any help is appreciated.

Zuolin Bai

written 23 months ago by zuolin.bai

You should know that the old 'Tuxedo' pipeline of TopHat(2) and Cufflinks is no longer the "advisable" toolset for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. (If you can't get access to that publication, let me know and I'll -cough- help you.) There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment with Salmon.

written 23 months ago by WouterDeCoster
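
For readers switching to HISAT2 as suggested, a basic paired-end run on the same two files might look like the sketch below. The index prefix and output file name are hypothetical: a HISAT2 index has to be built or downloaded separately, because the Bowtie2 index from the question cannot be reused. Only the basic `-p`, `-x`, `-1`, `-2` and `-S` options are shown, and the function name is an assumption for illustration.

```python
import shlex


def hisat2_command(index_prefix, mate1, mate2, out_sam, threads=8):
    """Build the argument list for a basic paired-end HISAT2 alignment."""
    return [
        "hisat2",
        "-p", str(threads),   # number of alignment threads
        "-x", index_prefix,   # prefix of the HISAT2 index files
        "-1", mate1,          # first-mate FASTQ
        "-2", mate2,          # second-mate FASTQ
        "-S", out_sam,        # SAM output file
    ]


if __name__ == "__main__":
    cmd = hisat2_command(
        "hisat2_index/genome",       # hypothetical index location
        "ENCFF000HGG.fastq.gz",
        "ENCFF000HHF.fastq.gz",
        "ENCLB555APY.sam",           # hypothetical output name
    )
    # Print the command for inspection; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```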
mastal511 wrote 3.0 years ago:

I've had tophat fail at the 'Reporting output tracks' stage because it ran out of memory, but I didn't get the error message you're getting.

written 3.0 years ago by mastal511

Thanks for the response! It was a machine with 32 GB of RAM and approx. 900 GB of free storage space, so I wouldn't expect any kind of memory problem. The 'failed to retrieve right read for pair' error puzzles me as well.

written 3.0 years ago by ivan.molodtsov

Actually, 32 GB is not that much, seeing that you have 133 million read pairs. Check how much memory it is using while it runs.

written 3.0 years ago by mastal511
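
To follow the suggestion of checking memory use, one option is to launch the run from a small wrapper and read the peak resident set size once it finishes. This is a minimal sketch, assuming a Unix system and Python's standard `resource` module; the function name is hypothetical. `ru_maxrss` is reported in kilobytes on Linux (bytes on macOS), and for child processes it reflects the largest single process rather than the sum over the whole process tree.

```python
import resource
import subprocess
import sys


def run_and_report_peak_memory(command):
    """Run `command`, wait for it, and return (exit_code, peak_rss).

    peak_rss is ru_maxrss for terminated children of this process:
    the resident set size of the largest one, in kilobytes on Linux.
    """
    completed = subprocess.run(command)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return completed.returncode, usage.ru_maxrss


if __name__ == "__main__":
    # Hypothetical usage: python peak_mem.py tophat2 -p 8 ... reads_1.fastq.gz reads_2.fastq.gz
    code, peak = run_and_report_peak_memory(sys.argv[1:])
    print("exit code %d, peak RSS of largest child: %d (kB on Linux)" % (code, peak))
```

A peak close to the machine's 32 GB would support the out-of-memory explanation; a much lower figure would point back to the input files or to TopHat itself.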