Question

Weird Tophat2 Error Message

11.9 years ago

Nick ▴ 290

I am trying to run tophat2 on a set of SE reads:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008.fastq --min-intron-length 6 --max-insertion-length 3 --max-deletion-length 3 --b2-seed 73 --solexa1.3-quals --microexon-search --num-threads 32 --library-type fr-unstranded --no-coverage-search --GTF ../../../references/gtf/xen-trop/ens71/Xenopus_tropicalis.JGI_4.2.71.gtf

But I am getting a weird error message:

[2013-08-10 14:34:22] Beginning TopHat run (v2.0.9)
[2013-08-10 14:34:22] Checking for Bowtie
              Bowtie version:        2.1.0.0
[2013-08-10 14:34:22] Checking for Samtools
            Samtools version:        0.1.19.0
[2013-08-10 14:34:22] Checking for Bowtie index files (genome)..
[2013-08-10 14:34:22] Checking for reference FASTA file 
[2013-08-10 14:34:22] Generating SAM header for ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71
Traceback (most recent call last):
  File "/galaxy/software/tophat2/2.0.9/tophat", line 4072, in ?
    sys.exit(main())
  File "/galaxy/software/tophat2/2.0.9/tophat", line 3926, in main
    params.read_params = check_reads_format(params, reads_list)
  File "/galaxy/software/tophat2/2.0.9/tophat", line 1829, in check_reads_format
    zf = ZReader(f_name, params)
  File "/galaxy/software/tophat2/2.0.9/tophat", line 1782, in __init__
    self.file=open(filename)
IOError: [Errno 2] No such file or directory: '--min-intron-length'

What is going on here? It seems tophat2 is looking for a file/directory when it encounters --min-intron-length yet I don't understand why. Can you help?

tophat2 • 5.4k views

updated 4.3 years ago by Biostar 20 • written 11.9 years ago by Nick ▴ 290

Try specifying the command as suggested in the manual, i.e., "tophat2 [options] index reads". This sort of error is likely to occur depending on exactly how the tophat python script parses its input. It's likely that the parser takes the first option lacking a "--something" as the index and the next one as the left_reads fastq list. If there are then more, it probably just takes that as the right_reads fastq list and assumes you have paired input. This isn't technically a bug, then, since you're specifying the command incorrectly...though I think a change in the python code would be in order since this sort of error isn't going to be uncommon.
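To illustrate the kind of parsing described above, here is a minimal sketch using Python's standard `getopt` module, whose `getopt.getopt` function stops scanning for options at the first non-option argument. Whether the tophat script behaves exactly like this is an assumption; the shortened option list and the placeholder names `index` and `reads.fastq` are for illustration only.

```python
import getopt

# Hypothetical, shortened option list (the real script accepts many more).
LONG_OPTS = ["min-intron-length=", "num-threads=", "no-coverage-search"]


def split_args(argv):
    """Return (options, positionals) as getopt.getopt sees them."""
    opts, positionals = getopt.getopt(argv, "", LONG_OPTS)
    return dict(opts), positionals


# Options AFTER the index and reads: scanning stops at "index", so every
# later token, including "--min-intron-length", is returned as a positional.
after = ["index", "reads.fastq", "--min-intron-length", "6", "--no-coverage-search"]
opts_after, pos_after = split_args(after)

# Options BEFORE the index and reads, as in "tophat2 [options] index reads".
before = ["--min-intron-length", "6", "--no-coverage-search", "index", "reads.fastq"]
opts_before, pos_before = split_args(before)

print(opts_after, pos_after)
print(opts_before, pos_before)
```

In the first call no options are recognised at all, and the third positional is `--min-intron-length`. If a script then treats the third positional as a reads file, it would try to open a file with that name, which is consistent with the IOError in the question.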

11.9 years ago by Devon Ryan 105k

score 0 · Answer 1 · 2013-08-11

Interestingly, if I discard the optional arguments and submit just the two mandatory parameters (index and fastq file), then there is no error:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008.fastq

Also, if I add a second file (as in the case of a PE sample), then there is no error either:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008_1.fastq CO1_CGATGT_L008_2.fastq --min-intron-length 6 --max-insertion-length 3 --max-deletion-length 3 --b2-seed 73 --solexa1.3-quals --microexon-search --num-threads 32 --library-type fr-unstranded --no-coverage-search --GTF ../../../references/gtf/xen-trop/ens71/Xenopus_tropicalis.JGI_4.2.71.gtf

So it is only the case of an SE sample with optional parameters that causes the error. This is really bizarre. Any help is appreciated.

WouterDeCoster · Answer 2 · 2013-08-12

11.9 years ago

Nick ▴ 290

Thank you, dpryan79, this was indeed the problem. It is actually a very sneaky idiosyncrasy. I now realise that I have to re-analyse some samples I analysed in the past, because tophat essentially stops trying to make sense of the parameters that come after the second fastq file. In the past I used to run tophat for PE samples in the following way:

tophat2 <index> <file1> <file2> <options>

I wasn't getting the error I reported here (this is the first time I am using tophat with an SE sample, although I have been using it for nearly a year on PE samples), but it seems none of the options were taken into account. A very lame implementation, if you ask me, which is a shame as tophat otherwise seems a decent tool. I wonder how many other people have been tricked by this idiosyncrasy but haven't realised it yet.
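One way to guard against this in a wrapper script is to assemble the command with every option ahead of the index and reads, and to flag any argument list whose tail still contains a `--option`. The sketch below assumes Python 3.8+ (for `shlex.join`); the helper names `build_tophat_cmd` and `trailing_options` are hypothetical, the check is only a simple heuristic, and only flags that appear in the commands above are used.

```python
import shlex


def build_tophat_cmd(index, reads, options):
    """Assemble 'tophat2 [options] index reads...' with the options first."""
    return ["tophat2", *options, index, *reads]


def trailing_options(argv, n_positionals):
    """Return '--' tokens found among the last n_positionals arguments.

    When options come first, the final n_positionals tokens should be the
    index and the read files only, so anything option-like there is misplaced.
    """
    return [tok for tok in argv[-n_positionals:] if tok.startswith("--")]


options = ["--min-intron-length", "6", "--num-threads", "32", "--no-coverage-search"]
cmd = build_tophat_cmd("xen-trop-4-2-71", ["CO1_CGATGT_L008.fastq"], options)
print(shlex.join(cmd))

# The ordering from the question: options after the index and the reads.
misplaced = ["tophat2", "xen-trop-4-2-71", "CO1_CGATGT_L008.fastq", "--num-threads", "32"]
print(trailing_options(cmd, 2))        # nothing misplaced
print(trailing_options(misplaced, 2))  # reports the stray option
```

For a PE sample the same check would be called with `n_positionals=3` (index plus two read files).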

11.9 years ago by Nick ▴ 290

Yeah, I expect a number of people have been bitten by this. I might look more into the tophat python script and see if I can just submit a patch to either throw a warning in this case or simply deal with it properly (the tophat script itself processes the command line input in a few different functions, none of which simply use argparse, which means more coding and less robustness).
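As a rough illustration of what "dealing with it properly" could look like, here is a minimal `argparse` sketch that accepts options either before or after the positional arguments. It is not the tophat code or an actual patch; the parser name `tophat-sketch`, the three-option subset, and the `None` defaults are placeholders chosen for the example.

```python
import argparse


def make_parser():
    """Build a toy parser with an index, one or more read files, and a few options."""
    parser = argparse.ArgumentParser(prog="tophat-sketch")
    # Placeholder subset of options; defaults are deliberately left unset.
    parser.add_argument("--min-intron-length", type=int, default=None)
    parser.add_argument("--num-threads", type=int, default=None)
    parser.add_argument("--no-coverage-search", action="store_true")
    parser.add_argument("index")
    parser.add_argument("reads", nargs="+")
    return parser


parser = make_parser()

# Options after the positionals (the ordering that failed in the question).
late = parser.parse_args(
    ["index", "reads.fastq", "--min-intron-length", "6", "--no-coverage-search"]
)
# Options before the positionals (the ordering given in the manual).
early = parser.parse_args(
    ["--min-intron-length", "6", "--no-coverage-search", "index", "reads.fastq"]
)
# Paired-end style input with a trailing option.
paired = parser.parse_args(["index", "r1.fastq", "r2.fastq", "--num-threads", "32"])

print(vars(late) == vars(early))
print(paired.reads, paired.num_threads)
```

Both orderings yield the same parsed values here, so a user's options would no longer be dropped or mistaken for file names depending on where they appear.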

11.9 years ago by Devon Ryan 105k

I am also getting a similar error.

[2018-07-09 10:58:19] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2018-07-09 10:58:19] Checking for Bowtie
                  Bowtie version:        2.2.6.0
[2018-07-09 10:58:19] Checking for Bowtie index files (genome)..
[2018-07-09 10:58:19] Checking for reference FASTA file
[2018-07-09 10:58:19] Generating SAM header for /home/archana87/bowtie2_index/hg19
[2018-07-09 10:58:56] Preparing reads
         left reads: min. length=48, max. length=48, 4366511 kept reads (9024 discarded)
[2018-07-09 11:00:04] Mapping left_kept_reads to genome hg19 with Bowtie2
[2018-07-09 11:08:08] Mapping left_kept_reads_seg1 to genome hg19 with Bowtie2 (1/2)
[2018-07-09 11:10:33] Mapping left_kept_reads_seg2 to genome hg19 with Bowtie2 (2/2)
[2018-07-09 11:12:57] Searching for junctions via segment mapping
[2018-07-09 11:23:15] Retrieving sequences for splices
[2018-07-09 14:45:43] Indexing splices
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 885, in emit
    self.flush()
  File "/usr/lib/python2.7/logging/__init__.py", line 845, in flush
    self.stream.flush()
IOError: [Errno 22] Invalid argument
Logged from file tophat, line 1224
Traceback (most recent call last):
  File "/usr/bin/tophat", line 4095, in <module>
    sys.exit(main())
  File "/usr/bin/tophat", line 4061, in main
    user_supplied_deletions)
  File "/usr/bin/tophat", line 3683, in spliced_alignment
    params.read_params.color)
  File "/usr/bin/tophat", line 2585, in build_juncs_index
    external_splices_out_prefix = build_juncs_bwt_index(is_bowtie2, external_splices_out_prefix, color)
  File "/usr/bin/tophat", line 2519, in build_juncs_bwt_index
    print >> run_log, " ".join(bowtie_build_cmd)
IOError: [Errno 22] Invalid argument

Any help is much appreciated. Thanks

updated 7.0 years ago by WouterDeCoster 48k • written 7.0 years ago by archana.bioinfo87 ▴ 210

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

You should know that the old 'Tuxedo' pipeline of Tophat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or BBMap, or pseudo-alignment using salmon, followed by DESeq2 or edgeR.

Please stop using Tophat https://t.co/Es4ohxOEyx Cole and I developed the method in *2008*. It was greatly improved in TopHat2 then HISAT & HISAT2. There is no reason to use it anymore. I have been saying this for years yet it has more citations this year than last #methodsmatter
— Lior Pachter (@lpachter) December 2, 2017

7.0 years ago by WouterDeCoster 48k