Question

tophat2 Error: segment-based junction search failed with err =-9

9.1 years ago

marongiu.luigi ▴ 750

Dear all, I am aligning HiSeq RNA-Seq data to the human reference genome. I downloaded the Homo_sapiens.GRCh38.dna.toplevel.fa.gz and Homo_sapiens.GRCh38.84.gtf.gz reference files and built the reference indices with the following commands:

bowtie2-build -f GRCh38.84.fa GRCh38.84
tophat2 -p 16 -G GRCh38.84.gtf --transcriptome-index=GRCh38.84.tr GRCh38.84

which gave me the following files:

GRCh38.84.1.bt2l
GRCh38.84.2.bt2l
GRCh38.84.rev.1.bt2l
GRCh38.84.3.bt2l
GRCh38.84.rev.2.bt2l
GRCh38.84.4.bt2l

and the folder GRCh38.84.tr, which contains:

GRCh38.84.1.bt2
GRCh38.84.4.bt2
GRCh38.84.gff
GRCh38.84.ver
GRCh38.84.2.bt2
GRCh38.84.fa
GRCh38.84.rev.1.bt2
GRCh38.84.3.bt2
GRCh38.84.fa.tlst
GRCh38.84.rev.2.bt2

I have two paired-end files, seq_1.fastq and seq_2.fastq. I removed the TruSeq adapters with Trimmomatic:

java -jar /usr/bin/trimmomatic.jar PE -threads 16 -phred33 seq_1.fastq seq_2.fastq seq_1Paired seq_1Unpaired seq_2Paired seq_2Unpaired ILLUMINACLIP:./TruSeq.fa:2:30:10:1:true

Then I ran the alignment with TopHat:

$ tophat2 -o outputFolder -G GRCh38.84.gtf --transcriptome-index=GRCh38.84.tr --no-coverage-search -p 16 GRCh38.84 seq_1Paired seq_2Paired

and I got error -9, with the following log:

[2016-05-15 08:08:59] Beginning TopHat run (v2.1.1)
-----------------------------------------------
[2016-05-15 08:08:59] Checking for Bowtie
          Bowtie version:    2.2.6.0
[2016-05-15 08:09:00] Checking for Bowtie index files (transcriptome)..
[2016-05-15 08:09:00] Checking for Bowtie index files (genome)..
[2016-05-15 08:09:00] Checking for reference FASTA file
[2016-05-15 08:09:00] Generating SAM header for GRCh38.84
[2016-05-15 08:11:02] Reading known junctions from GTF file
[2016-05-15 08:12:32] Preparing reads
     left reads: min. length=12, max. length=60, 26138268 kept reads (2188 discarded)
    right reads: min. length=12, max. length=60, 25981749 kept reads (158707 discarded)
Warning: short reads (<20bp) will make TopHat quite slow and take large amount of memory because they are likely to be mapped in too many places
[2016-05-15 08:42:52] Using pre-built transcriptome data..
[2016-05-15 08:43:11] Mapping left_kept_reads to transcriptome GRCh38.84 with Bowtie2 
[2016-05-15 12:56:28] Mapping right_kept_reads to transcriptome GRCh38.84 with Bowtie2 
[2016-05-15 17:28:33] Resuming TopHat pipeline with unmapped reads
[2016-05-15 17:28:33] Mapping left_kept_reads.m2g_um to genome GRCh38.84 with Bowtie2 
[2016-05-15 18:33:01] Mapping left_kept_reads.m2g_um_seg1 to genome GRCh38.84 with Bowtie2 (1/2)
[2016-05-15 18:40:20] Mapping left_kept_reads.m2g_um_seg2 to genome GRCh38.84 with Bowtie2 (2/2)
[2016-05-15 18:51:34] Mapping right_kept_reads.m2g_um to genome GRCh38.84 with Bowtie2 
[2016-05-15 19:53:43] Mapping right_kept_reads.m2g_um_seg1 to genome GRCh38.84 with Bowtie2 (1/2)
[2016-05-15 20:01:00] Mapping right_kept_reads.m2g_um_seg2 to genome GRCh38.84 with Bowtie2 (2/2)
[2016-05-15 20:13:22] Searching for junctions via segment mapping
    [FAILED]
Error: segment-based junction search failed with err =-9
Loading left segment hits...

The files I obtained are in the folders logs and tmp:

./logs$ ls
bowtie.left_kept_reads.log
bowtie.left_kept_reads.m2g_um.log
bowtie.left_kept_reads.m2g_um_seg1.log
bowtie.left_kept_reads.m2g_um_seg2.log
bowtie.right_kept_reads.log
bowtie.right_kept_reads.m2g_um.log
bowtie.right_kept_reads.m2g_um_seg1.log
bowtie.right_kept_reads.m2g_um_seg2.log
gtf_juncs.log
m2g_left_kept_reads.err
m2g_left_kept_reads.out
m2g_right_kept_reads.err
m2g_right_kept_reads.out
prep_reads.log
run.log
segment_juncs.log
tophat.log

./tmp$ ls
GRCh38.84.bwt.samheader.sam
GRCh38.84_genome.bwt.samheader.sam
GRCh38.juncs
left_kept_reads.bam
left_kept_reads.bam.index
left_kept_reads.m2g.bam
left_kept_reads.m2g.bam.index
left_kept_reads.m2g_um.bam
left_kept_reads.m2g_um.bam.index
left_kept_reads.m2g_um.mapped.bam
left_kept_reads.m2g_um.mapped.bam.index
left_kept_reads.m2g_um_seg1.bam
left_kept_reads.m2g_um_seg1.bam.index
left_kept_reads.m2g_um_seg1.fq.z
left_kept_reads.m2g_um_seg1_unmapped.bam
left_kept_reads.m2g_um_seg1_unmapped.bam.index
left_kept_reads.m2g_um_seg2.bam
left_kept_reads.m2g_um_seg2.bam.index
left_kept_reads.m2g_um_seg2.fq.z
left_kept_reads.m2g_um_seg2_unmapped.bam
left_kept_reads.m2g_um_seg2_unmapped.bam.index
left_kept_reads.m2g_um_unmapped.bam
left_kept_reads.m2g_um_unmapped.bam.index
right_kept_reads.bam
right_kept_reads.bam.index
right_kept_reads.m2g.bam
right_kept_reads.m2g.bam.index
right_kept_reads.m2g_um.bam
right_kept_reads.m2g_um.bam.index
right_kept_reads.m2g_um.mapped.bam
right_kept_reads.m2g_um.mapped.bam.index
right_kept_reads.m2g_um_seg1.bam
right_kept_reads.m2g_um_seg1.bam.index
right_kept_reads.m2g_um_seg1.fq.z
right_kept_reads.m2g_um_seg1_unmapped.bam
right_kept_reads.m2g_um_seg1_unmapped.bam.index
right_kept_reads.m2g_um_seg2.bam
right_kept_reads.m2g_um_seg2.bam.index
right_kept_reads.m2g_um_seg2.fq.z
right_kept_reads.m2g_um_seg2_unmapped.bam
right_kept_reads.m2g_um_seg2_unmapped.bam.index
right_kept_reads.m2g_um_unmapped.bam
right_kept_reads.m2g_um_unmapped.bam.index
segment.deletions
segment.fusions
segment.insertions
segment.juncs
temp.samheader.sam

Could you please tell me what I got wrong and how I can fix it?

Many thanks,

Luigi

alignment RNA-Seq • 4.6k views


Dear Luigi, I get the same error. Have you made any progress with understanding what leads to it? Thanks, N

9.1 years ago by NK • 0

I also got this error on one of my files. Were you able to figure out why?

8.9 years ago by rborgesm ▴ 10

Nope, I did not solve it.

8.7 years ago by marongiu.luigi ▴ 750

score 0 · Answer 1 · 2016-12-18

8.6 years ago

jesselee516 ▴ 100

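It may be caused by running out of memory. A return code of -9 usually means the process was killed with signal 9 (SIGKILL), which is what the Linux out-of-memory killer sends. Try running one cell line at a time with multiple processors, instead of running multiple cell lines at the same time.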
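To check whether running out of memory is really the cause, something like the sketch below might help. It assumes that the -9 is the raw return code TopHat's Python wrapper received from the junction-search step (Python reports a child killed by signal N as -N), and that you are on Linux. The function names (signal_for_status, check_oom_killer, count_short_reads) are hypothetical helpers made up for this example, not part of TopHat, and dmesg may need root on some systems. The last helper is there because the log in the question warns that reads shorter than 20 bp make TopHat use a large amount of memory.

```shell
#!/bin/sh
# Sketch only: these helpers are hypothetical and not part of TopHat.

# Translate a negative return code such as -9 into a signal name.
# Assumption: the code comes from a Python wrapper, where -N means
# "child was killed by signal N".
signal_for_status() {
    status=$1
    case $status in
        -*) kill -l "${status#-}" ;;                 # strip the minus sign, look up the signal
        *)  echo "exit code $status (not a signal)" ;;
    esac
}

# Look for traces of the kernel out-of-memory killer in the kernel log.
# Not called by default: dmesg may be restricted to root.
check_oom_killer() {
    dmesg -T 2>/dev/null | grep -i -E 'out of memory|killed process' \
        || echo "no OOM-killer messages found (or dmesg not readable)"
}

# Count reads shorter than a minimum length (default 20) in a plain
# 4-lines-per-record FASTQ file. The sequence is line 2 of each record.
count_short_reads() {
    fastq=$1
    min_len=${2:-20}
    awk -v min="$min_len" \
        'NR % 4 == 2 && length($0) < min { n++ } END { print n + 0 }' "$fastq"
}

# Example calls (file name taken from the question):
signal_for_status -9
# check_oom_killer
# count_short_reads seq_1Paired 20
```

If check_oom_killer shows a killed segment_juncs process, memory is the likely culprit, and running fewer jobs at once or using a machine with more RAM would be the first thing to try. If count_short_reads reports many reads, adding a MINLEN:20 step to the end of the Trimmomatic command is one way to drop them before alignment.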
