Question: Cuffmerge IOError: Errno 2 no such file or directory
Yuka Takemon (Canada/Vancouver/GenomeSciencesCentre) wrote, 3.9 years ago:

Hello,

The following is my script:

 

#!/bin/bash -l

#PBS -l nodes=1:ppn=20,walltime=24:00:00

module load cufflinks/2.2.1

##define directories and other variables

#directory to input cufflinks

dir_cufflinks=/pathto/Annotation/cufflinks_preqc

#dir to reference annotation

dir_ref_annotation=/pathto/genome_index/Mus_musculus/UCSC/mm10/Annotation/Genes

#dir to DNA seq for reference

dir_ref_genome_seq=/pathto/genome_index/Mus_musculus/UCSC/mm10/Sequence/Chromosomes

cuffmerge -o ${dir_cufflinks}/cuffmerge -g ${dir_ref_annotation}/genes.gtf -s ${dir_ref_genome_seq}/*.fa -p 20 ${dir_cufflinks}/all_transcripts.txt 

I want to note that:

all_transcripts.txt lists the .gtf files that came out of cufflinks

-g genes.gtf was previously used with cufflinks as the reference annotation

-s points to a directory of individual Chr*.fa files

However, I am getting the following error:

[Mon Dec 14 17:33:45 2015] Beginning transcriptome assembly merge

-------------------------------------------

[Mon Dec 14 17:33:45 2015] Preparing output location /pathto/Annotation/cufflinks_preqc/cuffmerge/

Traceback (most recent call last):

  File "/opt/compsci/cufflinks/2.2.1/cuffmerge", line 580, in <module>

    sys.exit(main())

  File "/opt/compsci/cufflinks/2.2.1/cuffmerge", line 538, in main

    gtf_input_files = test_input_files(transfrag_list_file)

  File "/opt/compsci/cufflinks/2.2.1/cuffmerge", line 268, in test_input_files

    g = open(line,"r")

IOError: [Errno 2] No such file or directory: '>chr11'

Is there something obvious I am missing here? Any input/help is appreciated.

 

Dan Gaston (Canada) wrote, 3.9 years ago:

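I believe that when you pass a directory as the reference (-s) to cuffmerge, you pass just the directory name. You are passing /path/to/ref_dir/*.fa, and cuffmerge doesn't take a list of FASTA files there: the shell expands the glob before cuffmerge runs, so only one file ends up as the -s value and the rest are treated as positional arguments. That is most likely why the error shows '>chr11', which looks like a FASTA header line being read as a file name. Just pass -s /path/to/ref_dir/ as the parameter.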
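
To illustrate the point about the glob, here is a small sketch that does not call cuffmerge at all. It uses a hypothetical helper, count_args, and a throwaway directory of empty .fa files to show how many arguments a program would receive in each case. The corrected cuffmerge call in the final comment reuses the variables from the question's script and is untested here.

```shell
# Hypothetical helper: prints how many arguments it was given.
count_args() { echo "$#"; }

# Throwaway directory with three empty stand-in FASTA files.
tmp=$(mktemp -d)
touch "$tmp/chr1.fa" "$tmp/chr10.fa" "$tmp/chr11.fa"

# The shell expands the glob first, so the program sees -s plus three file names.
n_glob=$(count_args -s "$tmp"/*.fa)

# Passing the directory itself yields -s plus a single value.
n_dir=$(count_args -s "$tmp")

echo "glob: $n_glob args, dir: $n_dir args"   # prints "glob: 4 args, dir: 2 args"
rm -rf "$tmp"

# Corrected call from the question, with the directory passed to -s:
# cuffmerge -o "${dir_cufflinks}/cuffmerge" -g "${dir_ref_annotation}/genes.gtf" \
#     -s "${dir_ref_genome_seq}" -p 20 "${dir_cufflinks}/all_transcripts.txt"
```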


Thanks Dan! It looks like it ran and produced a new merged.gtf file. But I'm curious whether you can help me understand the warning I got:

Warning: cannot find genomic sequence file /pathto/Mus_musculus/UCSC/mm10/Sequence/Chromosomes/chr1_GL456221_random{.fa,.fasta}

I get this warning for each chromosome, each with a /chrX_GLXXXX_random{.fa,.fasta} suffix. Is this something I can ignore?

written 3.9 years ago by Yuka Takemon

Do you have those contigs in the directory as FASTA files? My guess would be that they are in your transcriptome GTF file but that you don't have FASTA sequences for them in your directory. It's probably best to have them. In my experience there isn't much in the way of known genes/protein-coding transcripts on these contigs, so it probably won't have a huge impact on your analysis, but it is always better to be more complete.

written 3.9 years ago by Dan Gaston
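
One way to check this is to list the contig names in a GTF and test whether each has a matching FASTA file. The sketch below assumes the sequence name is in column 1 of a tab-separated GTF and that files are named <contig>.fa or <contig>.fasta, as the warning message suggests. The function name missing_contigs is made up for illustration.

```shell
# Hypothetical helper: print contigs named in a GTF that have no
# <contig>.fa or <contig>.fasta file in the given directory.
missing_contigs() {
    gtf=$1
    seq_dir=$2
    # Column 1 of a GTF is the sequence name; skip comment and empty lines.
    awk -F'\t' '!/^#/ && NF { print $1 }' "$gtf" | sort -u |
    while read -r contig; do
        if [ ! -e "$seq_dir/$contig.fa" ] && [ ! -e "$seq_dir/$contig.fasta" ]; then
            echo "$contig"
        fi
    done
}

# Example call (paths are the variables from the question's script):
# missing_contigs "${dir_cufflinks}/cuffmerge/merged.gtf" "${dir_ref_genome_seq}"
```

An empty result would mean every contig in the GTF has a sequence file in the directory.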

I dug around some more and noticed that the chrX_GLXX_random contigs appear in the .gtf that came out of cufflinks, but not in the reference .gtf. So this is most likely the issue I am having here.

Thanks for your help!

written 3.9 years ago by Yuka Takemon
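
The comparison described in this last reply can be sketched as follows. It again assumes column 1 of each tab-separated GTF holds the contig name. The function name contigs_only_in_first is hypothetical, and the file names in the example call are placeholders.

```shell
# Hypothetical helper: print contig names that occur in the first GTF
# but not in the second.
contigs_only_in_first() {
    list_a=$(mktemp)
    list_b=$(mktemp)
    # Collect the unique, sorted contig names from each file.
    awk -F'\t' '!/^#/ && NF { print $1 }' "$1" | sort -u > "$list_a"
    awk -F'\t' '!/^#/ && NF { print $1 }' "$2" | sort -u > "$list_b"
    # comm -23 keeps lines that are unique to the first sorted list.
    comm -23 "$list_a" "$list_b"
    rm -f "$list_a" "$list_b"
}

# Example call (placeholder file names):
# contigs_only_in_first cufflinks_transcripts.gtf genes.gtf
```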