Question: GeneMark-ES exits with "error, file not found: info/training.fna"
21 months ago, gauravdube007 (India) wrote:

Dear All, I am trying to use GeneMark-ES Suite 4.32 to predict genes from a fungal genome. But GeneMark exits with:

error, file not found: info/training.fna

To run the program, I am using the following command:

./gmes_petap.pl --ES --fungus --sequence CBS_contigs.fasta.masked

The log file 'gmes.log' contains the following:

gmes_petap.pl : [Mon Jul 17 14:19:04 2017] /home/gaurav/2_GeneMark_results/gmes_petap/probuild --reformat_fasta --uppercase --allow_x --letters_per_line 60 --out data/dna.fna --label _dna --trace info/dna.trace --in /home/gaurav/1_Masking_Genome/CBS/CBS_contigs.fasta.masked

gmes_petap.pl : [Mon Jul 17 14:19:04 2017] /home/gaurav/2_GeneMark_results/gmes_petap/probuild --seq data/dna.fna --allow_x --stat info/dna.general

gmes_petap.pl : [Mon Jul 17 14:19:05 2017] /home/gaurav/2_GeneMark_results/gmes_petap/probuild --seq data/dna.fna --allow_x --stat_fasta info/dna.multi_fasta

gmes_petap.pl : [Mon Jul 17 14:19:05 2017] /home/gaurav/2_GeneMark_results/gmes_petap/probuild --seq data/dna.fna --allow_x --substring_n_distr info/dna.gap_distr

gmes_petap.pl : [Mon Jul 17 14:19:06 2017] /home/gaurav/2_GeneMark_results/gmes_petap/gc_distr.pl --in data/dna.fna --out info/dna.gc.csv --w 1000,8000

gmes_petap.pl : [Mon Jul 17 14:19:06 2017] /home/gaurav/2_GeneMark_results/gmes_petap/probuild --seq /home/gaurav/2_GeneMark_results/gmes_petap/data/dna.fna --split dna.fa --max_contig 5000000 --min_contig 50000 --letters_per_line 100 --split_at_n 5000 --split_at_x 5000 --allow_x --x_to_n --trace ../../info/training.trace

I have tried troubleshooting this error, but could not resolve it. Please help me resolve it, and let me know if you need any further information. (I have also configured GeneMark-ET with the Braker pipeline, and it works very well there, so I don't know what the problem is with GeneMark-ES here.) Any help is appreciated. Thanks in advance.


Do you have info/training.fna in the directory from which you are running the program?

written 21 months ago by Santosh Anand

Hi Santosh, I don't have training.fna in the 'info' directory. The following files are present in the info/ directory: training.trace, dna.trace, dna.multi_fasta, dna.general, dna.gc.csv, dna.gap_distr.

Please find my update below.

written 21 months ago by gauravdube007

Update: After posting, I ran the same command on another assembly with better genome statistics than the previous one, and it worked! It seems the problem is with 'probuild', as it requires at least 10 Mb of good data for training (ref: [https://www.researchgate.net/post/Genemark-ES_error]), which could not be retrieved from the poor assembly. So now the question is: what tool should I use to get the best out of even poor assemblies? Thanks all.

written 21 months ago by gauravdube007
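
One way to check in advance whether an assembly is likely to hit this problem is to count how much sequence sits in contigs long enough to be used for training. The following is a minimal Python sketch, not part of GeneMark-ES. It assumes the 50000 bp cutoff seen in the `--min_contig 50000` argument of the logged probuild command, and compares the result against the roughly 10 Mb figure quoted from the ResearchGate post. The function names are hypothetical, and the count is only an approximation because it ignores the splitting at long N/X runs that probuild also performs.

```python
import sys


def fasta_lengths(lines):
    """Yield (name, length) for each record in an iterable of FASTA lines."""
    name, length = None, 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, length
            parts = line[1:].split()
            name, length = (parts[0] if parts else ""), 0
        elif name is not None:
            length += len(line)
    if name is not None:
        yield name, length


def usable_bases(lines, min_contig=50000):
    """Return (total bases, bases in contigs of at least min_contig bp).

    min_contig defaults to the value seen in the thread's log; treat it as
    an assumption and adjust it to match your own gmes.log.
    """
    total = usable = 0
    for _, length in fasta_lengths(lines):
        total += length
        if length >= min_contig:
            usable += length
    return total, usable


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1]) as handle:
        total, usable = usable_bases(handle)
    print(f"total bases: {total}")
    print(f"bases in contigs >= 50000 bp: {usable}")
```

Running it as `python usable_bases.py CBS_contigs.fasta.masked` (the script name is arbitrary) prints both totals. If the second number is far below 10 Mb, a fragmented assembly is a plausible cause of the missing info/training.fna.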

You might want to open a new question, and reference this one.

written 21 months ago by st.ph.n

I agree with st.ph.n. Open a new thread, or change the subject line to reflect what you have learned.

written 21 months ago by Santosh Anand

7 weeks ago, victorcana1991 wrote:

I also had the same error. There are two solutions: change the Perl path in all the .pl files by hand, or run the script change_path_in_perl_scripts.pl to do it for you.

I did it in the following way:

victorc:~/bin/gm_et_linux_64/gmes_petap$ ./change_path_in_perl_scripts.pl /home/victor/bin/perl
done
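
To see whether the Perl path is the issue before changing anything, it may help to list the interpreter named on the first line of each .pl file and check that it exists. The following is a minimal Python sketch under that assumption. The function names are hypothetical and nothing here is part of GeneMark-ES; it only reads the scripts and does not modify them.

```python
import os
import sys


def shebang_interpreter(path):
    """Return the interpreter named on a script's first line, or None."""
    with open(path, "rb") as handle:
        first = handle.readline().decode("utf-8", errors="replace").strip()
    if not first.startswith("#!"):
        return None
    parts = first[2:].split()
    return parts[0] if parts else None


def check_perl_scripts(directory):
    """Return (script, interpreter, exists) for every .pl file in directory."""
    report = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".pl"):
            interp = shebang_interpreter(os.path.join(directory, name))
            # True only if the shebang names a file that is actually present.
            exists = interp is not None and os.path.isfile(interp)
            report.append((name, interp, exists))
    return report


if __name__ == "__main__" and len(sys.argv) > 1:
    for script, interp, exists in check_perl_scripts(sys.argv[1]):
        status = "ok" if exists else "MISSING"
        print(f"{script}\t{interp}\t{status}")
```

Pointing it at the gmes_petap directory flags any script whose interpreter path does not exist on the machine. A shebang of the form `#!/usr/bin/env perl` will report /usr/bin/env as present, so in that case it is still worth confirming separately that perl is on the PATH.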
