Question: Weird Tophat2 Error Message
6.3 years ago by
Nick270
Spain
Nick270 wrote:

I am trying to run tophat2 on a set of SE reads:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008.fastq --min-intron-length 6 --max-insertion-length 3 --max-deletion-length 3 --b2-seed 73 --solexa1.3-quals --microexon-search --num-threads 32 --library-type fr-unstranded --no-coverage-search --GTF ../../../references/gtf/xen-trop/ens71/Xenopus_tropicalis.JGI_4.2.71.gtf

But I am getting a weird error message:

[2013-08-10 14:34:22] Beginning TopHat run (v2.0.9)
[2013-08-10 14:34:22] Checking for Bowtie
              Bowtie version:        2.1.0.0
[2013-08-10 14:34:22] Checking for Samtools
            Samtools version:        0.1.19.0
[2013-08-10 14:34:22] Checking for Bowtie index files (genome)..
[2013-08-10 14:34:22] Checking for reference FASTA file 
[2013-08-10 14:34:22] Generating SAM header for ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71
Traceback (most recent call last):
  File "/galaxy/software/tophat2/2.0.9/tophat", line 4072, in ?
    sys.exit(main())
  File "/galaxy/software/tophat2/2.0.9/tophat", line 3926, in main
    params.read_params = check_reads_format(params, reads_list)
  File "/galaxy/software/tophat2/2.0.9/tophat", line 1829, in check_reads_format
    zf = ZReader(f_name, params)
  File "/galaxy/software/tophat2/2.0.9/tophat", line 1782, in __init__
    self.file=open(filename)
IOError: [Errno 2] No such file or directory: '--min-intron-length'

What is going on here? It seems tophat2 is looking for a file/directory when it encounters --min-intron-length yet I don't understand why. Can you help?


Try specifying the command as suggested in the manual, i.e., "tophat2 [options] index reads". This sort of error is likely to occur depending on exactly how the tophat python script parses its input. It's likely that the parser takes the first option lacking a "--something" as the index and the next one as the left_reads fastq list. If there are then more, it probably just takes that as the right_reads fastq list and assumes you have paired input. This isn't technically a bug, then, since you're specifying the command incorrectly...though I think a change in the python code would be in order since this sort of error isn't going to be uncommon.
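
To illustrate the behaviour described above, here is a minimal sketch using Python's standard getopt module. It assumes the tophat script parses its command line in a getopt-like way, which this thread does not confirm; the option list is a small hypothetical subset, and "index" and "reads.fastq" are placeholder names. getopt.getopt stops looking for options at the first non-option argument, so anything after the index is treated as positional.

```python
import getopt

# Hypothetical subset of long options; a trailing "=" means the option takes a value.
LONG_OPTS = ["min-intron-length=", "num-threads=", "no-coverage-search"]


def split_command_line(argv):
    """Return (options, positionals) as getopt sees them."""
    opts, args = getopt.getopt(argv, "", LONG_OPTS)
    return opts, args


# Options placed after the positionals, as in the question.
opts, args = split_command_line(
    ["index", "reads.fastq", "--min-intron-length", "6"]
)
print(opts)  # no options are recognised
print(args)  # all four tokens are positional; args[2] is "--min-intron-length"

# Options placed first, as the manual suggests.
opts2, args2 = split_command_line(
    ["--min-intron-length", "6", "index", "reads.fastq"]
)
print(opts2)
print(args2)
```

If a script then takes the third positional as a second reads file, that would match the "No such file or directory: '--min-intron-length'" error in the question.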

Reply written 6.3 years ago by Devon Ryan (93k)
6.3 years ago by
Nick270
Spain
Nick270 wrote:

Interestingly, if I discard the optional arguments and submit just the two mandatory parameters (index and fastq file), then there is no error:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008.fastq

Also, if I add a second file (as in the case of a PE sample), then there is no error either:

tophat2 ../../../references/genomes/xen-trop/ens71/xen-trop-4-2-71 CO1_CGATGT_L008_1.fastq CO1_CGATGT_L008_2.fastq --min-intron-length 6 --max-insertion-length 3 --max-deletion-length 3 --b2-seed 73 --solexa1.3-quals --microexon-search --num-threads 32 --library-type fr-unstranded --no-coverage-search --GTF ../../../references/gtf/xen-trop/ens71/Xenopus_tropicalis.JGI_4.2.71.gtf

So it is only the case of an SE sample with optional parameters that causes the error. This is really bizarre. Any help is appreciated.

6.3 years ago by
Nick270
Spain
Nick270 wrote:

Thank you, dpryan79 - this was, indeed, the problem. It is actually a very sneaky idiosyncrasy. I now realise that I have to re-analyse some samples I analysed in the past, because tophat essentially stops trying to make sense of the parameters that come after the second fastq file. In the past I used to run tophat for PE samples in the following way:

tophat2 <index> <file1> <file2> <options>

I wasn't getting the error I reported here (this is the first time I am using tophat with an SE sample, but I have been using it for nearly a year on PE samples), but none of the options were, it seems, taken into account. A very lame implementation if you ask me, which is a shame, as tophat otherwise seems a decent tool. I wonder how many other people have been tricked by this idiosyncrasy but haven't realised it yet.
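
For anyone re-running samples, the order suggested earlier in the thread ("tophat2 [options] index reads") can be enforced by assembling the command programmatically. This is only a sketch: build_tophat_command is a hypothetical helper, "genome_index", "reads_1.fastq" and "reads_2.fastq" are placeholder names, and the options shown are a subset of those in the question.

```python
import shlex


def build_tophat_command(index, reads, options):
    """Assemble 'tophat2 [options] index reads' with the options first."""
    return ["tophat2", *options, index, *reads]


# Placeholder inputs; substitute your own index prefix and fastq files.
options = ["--min-intron-length", "6", "--num-threads", "32", "--no-coverage-search"]
cmd = build_tophat_command(
    "genome_index", ["reads_1.fastq", "reads_2.fastq"], options
)

# Print a shell-safe version of the command for inspection.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be passed to subprocess.run as-is, which avoids retyping the command by hand in the wrong order.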


Yeah, I expect a number of people have been bitten by this. I might look more into the tophat python script and see if I can just submit a patch to either throw a warning in this case or simply deal with it properly (the tophat script itself processes the command line input in a few different functions, none of which simply use argparse, which means more coding and less robustness).
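
A warning of the kind mentioned above could look roughly like the following. This is not the actual tophat code or a submitted patch, just a sketch of the idea; both function names are hypothetical, and it assumes the positional arguments have already been separated from the parsed options.

```python
def find_misplaced_options(positionals):
    """Return positional arguments that look like options (start with '-')."""
    return [arg for arg in positionals if arg.startswith("-") and arg != "-"]


def warn_about_misplaced_options(positionals):
    """Build a warning message if option-like tokens follow the index/reads."""
    misplaced = find_misplaced_options(positionals)
    if misplaced:
        return (
            "Warning: %s found after the index/reads arguments; "
            "options must come first." % ", ".join(misplaced)
        )
    return None


# Placeholder PE command line with an option after the reads files.
pe_args = ["index", "reads_1.fastq", "reads_2.fastq", "--num-threads", "32"]
print(warn_about_misplaced_options(pe_args))
```

A check like this would catch the silent PE case as well as the SE case that raises the IOError.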

Reply written 6.3 years ago by Devon Ryan (93k)

I am also getting a similar error.

[2018-07-09 10:58:19] Beginning TopHat run (v2.1.0)
-----------------------------------------------
[2018-07-09 10:58:19] Checking for Bowtie
                  Bowtie version:        2.2.6.0
[2018-07-09 10:58:19] Checking for Bowtie index files (genome)..
[2018-07-09 10:58:19] Checking for reference FASTA file
[2018-07-09 10:58:19] Generating SAM header for /home/archana87/bowtie2_index/hg19
[2018-07-09 10:58:56] Preparing reads
         left reads: min. length=48, max. length=48, 4366511 kept reads (9024 discarded)
[2018-07-09 11:00:04] Mapping left_kept_reads to genome hg19 with Bowtie2
[2018-07-09 11:08:08] Mapping left_kept_reads_seg1 to genome hg19 with Bowtie2 (1/2)
[2018-07-09 11:10:33] Mapping left_kept_reads_seg2 to genome hg19 with Bowtie2 (2/2)
[2018-07-09 11:12:57] Searching for junctions via segment mapping
[2018-07-09 11:23:15] Retrieving sequences for splices
[2018-07-09 14:45:43] Indexing splices
Traceback (most recent call last):
  File "/usr/lib/python2.7/logging/__init__.py", line 885, in emit
    self.flush()
  File "/usr/lib/python2.7/logging/__init__.py", line 845, in flush
    self.stream.flush()
IOError: [Errno 22] Invalid argument
Logged from file tophat, line 1224
Traceback (most recent call last):
  File "/usr/bin/tophat", line 4095, in <module>
    sys.exit(main())
  File "/usr/bin/tophat", line 4061, in main
    user_supplied_deletions)
  File "/usr/bin/tophat", line 3683, in spliced_alignment
    params.read_params.color)
  File "/usr/bin/tophat", line 2585, in build_juncs_index
    external_splices_out_prefix = build_juncs_bwt_index(is_bowtie2, external_splices_out_prefix, color)
  File "/usr/bin/tophat", line 2519, in build_juncs_bwt_index
    print >> run_log, " ".join(bowtie_build_cmd)
IOError: [Errno 22] Invalid argument

Any help is much appreciated. Thanks

Reply written 17 months ago by archana.bioinfo87 (180), modified 17 months ago by WouterDeCoster (42k)

I added code markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

You should know that the old 'Tuxedo' pipeline of Tophat(2) and Cufflinks is no longer the "advisable" tool for RNA-seq analysis. The software is deprecated or in low maintenance and should be replaced by HISAT2, StringTie and Ballgown. See this paper: Transcript-level expression analysis of RNA-seq experiments with HISAT, StringTie and Ballgown. There are also other alternatives, including alignment with STAR or bbmap, or pseudo-alignment using salmon, followed by DESeq2 or edgeR.

Reply written 17 months ago by WouterDeCoster (42k)