Question: maker has been running for a long time
17 months ago, olechnwin wrote:

Hi all,

I am running genome annotation using MAKER. It's a human genome, so about 3 Gb, split across ~3,500 contigs.

I am running MAKER with 20 processes (MPICH). It has been running for 4 days on the same 19 contigs.

I cannot tell whether it is supposed to take this long or whether it is stuck on something.

I am running MAKER with a custom repeat library. I ran it with these options (maker_opts.ctl):

#-----Genome (these are always required)
genome=/Data/A673_pacbio/sequel_fasta/4-quiver/cns_output/fc_phase_pipeline/output/cns_p.phased.0.fasta
#genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic

#-----EST Evidence (for best results provide a file for at least one)
est=/Data/A673_rnaseq/trinity_out_dir/Trinity.fasta

#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/database/uniprot_sprot_human.fasta
#protein sequence file in fasta format (i.e. from mutiple oransisms)
protein_gff=  #aligned protein homology evidence from an external GFF3 file

#-----Repeat Masking (leave values blank to skip repeat masking)
model_org=all #select a model organism for RepBase masking in RepeatMasker
rmlib=/Data/A673_pacbio/sequel_fasta/4-quiver/cns_output/fc_phase_pipeline/output/cns_phased0_repeat/RM_23217.WedJul241739322019/consensi.fa.classified
#provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein=/opt/maker.4/maker/data/te_proteins.fasta #provide a fasta file of transposable element proteins for RepeatRunner

These are the last lines of maker stdout:

running  repeat masker.
#--------- command -------------#
Widget::RepeatMasker:
cd /gpfs0/scratch/1869571/maker_UKXIzP; /opt/RepeatMasker/RepeatMasker /maker_mpich_run/cns_p.phased.0.maker.output/cns_p.phased.0_datastore/CD/B9/000010F_0//theVoid.000010F_0/0/000010F_0.0.consensi%2Efa%2Eclassified.specific -dir /maker_mpich_run/cns_p.phased.0.maker.output/cns_p.phased.0_datastore/CD/B9/000010F_0//theVoid.000010F_0/0 -pa 1 -lib /Data/A673_pacbio/sequel_fasta/4-quiver/cns_output/fc_phase_pipeline/output/cns_phased0_repeat/RM_23217.WedJul241739322019/consensi.fa.classified
#-------------------------------#

Looking at the node where this is running, only one process is working hard (~100% CPU); the other MAKER processes are only at ~1-2% CPU. If it is supposed to take this long, can someone suggest a way to speed up this process?
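One way to tell "slow" from "stuck" is to watch whether any contigs change status over time. Below is a minimal Python sketch for that, under some assumptions: that the run directory contains MAKER's master datastore index log (for this run it would presumably be cns_p.phased.0.maker.output/cns_p.phased.0_master_datastore_index.log), and that each line holds three tab-separated fields: contig name, datastore path, and a status such as STARTED or FINISHED. The function name `summarize_index` is just illustrative, and the sample lines are made up.

```python
from collections import Counter


def summarize_index(lines):
    """Return (latest status per contig, count of contigs per status).

    Assumes tab-separated lines: contig name, datastore path, status.
    A contig can appear several times (e.g. STARTED, then FINISHED),
    so the last line seen for a contig wins.
    """
    latest = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        contig, status = fields[0], fields[2]
        latest[contig] = status
    return latest, Counter(latest.values())


# Made-up sample lines in the assumed format, for illustration only.
sample = [
    "contig_A\tdatastore/AA/BB/contig_A/\tSTARTED\n",
    "contig_A\tdatastore/AA/BB/contig_A/\tFINISHED\n",
    "contig_B\tdatastore/CC/DD/contig_B/\tSTARTED\n",
]
latest, counts = summarize_index(sample)
for status, n in sorted(counts.items()):
    print(f"{status}: {n} contig(s)")
```

To use it on a real run, open the index log and pass the file handle to `summarize_index`. Running it a few hours apart and comparing the counts should show whether any contig has moved from STARTED to FINISHED. If nothing changes for days, the run is more likely stalled than merely slow.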

maker annotation genome • 498 views
17 months ago, Charles Warden wrote:

I recently encountered an issue following a MacOS upgrade such that running MAKER on a Linux VM ended up working better than on my Mac (even though it was previously OK).

I don't know if that is the same issue for you, and unfortunately I am not certain what is causing this specific issue.

MAKER has a contact for support, but it might be worth making clear that those requests will be made public in the Google Group (so I think you may just want to join that group and submit the question more directly):

https://groups.google.com/forum/#!forum/maker-devel


Hi,

Thank you for taking the time to comment. I'm using MAKER on CentOS Linux. I'll try to post in the Google Group too. I'm waiting for them to approve my subscription.

Reply by olechnwin, 17 months ago

Hmm - the website does say approval is required after sending a message to maker-devel@yandell-lab.org.

However, in the meantime, you can see all the other responses, if that is helpful (and, hopefully, you will be approved soon).

Reply by Charles Warden, 17 months ago

Sadly, I'm still not approved. I've sent two requests.

I tried running their example_01_basic with the hsap_contig.fasta genome, and it has been running for 20+ hours. Have you tried running their example? How long did it take? I'm running this example using MPI with 4 processes.

This is the last output/log so far:

Widget::blastx:  
/opt/miniconda3/bin/blastx -db /gpfs0/scratch/1895302/maker_kXmduG/te_proteins%2Efasta.mpi.10.9     -query /gpfs0/scratch/1895302/maker_kXmduG/2/NT_010783%2E15.1 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /maker_tutorial/example_01_basic/hsap_contig.maker.output/hsap_contig_datastore/80/99/NT_010783.15//theVoid.NT_010783.15/0/NT_010783%2E15.1.te_proteins%2Efasta.repeatrunner.temp_dir/te_proteins%2Efasta.mpi.10.9.repeatrunner
#-------------------------------#
deleted:5 hits
deleted:7 hits
collecting blastx repeatmasking
processing all repeats
in cluster::shadow_cluster...
...finished clustering.

Do you happen to have your completed log? I'd like to see what it looks like if it ran to completion.

Reply by olechnwin, 17 months ago

I am surprised that you weren't approved - perhaps there is just a general delay in the response time? Maybe you can contact the corresponding author to see if there is some issue with the approval system?

It's been a little while since I ran it (although some of that content is in their archive).

At first, I thought minutes was the right run-time (since the program initially either froze or stopped within minutes). In the end, I think hours (or maybe 1 hour) was the right run-time. That was for analysis of 1 contig at a time, though (and I think they were all less than 700,000 bp).

If you have a whole genome to annotate, I think you are going to have additional complications.
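To put rough numbers on that, here is a back-of-envelope Python sketch. It rests on loud assumptions: that run-time scales linearly with sequence length, that the anecdotal figure above (about 1 hour for a contig under 700,000 bp) can be treated as a throughput of 700,000 bp per hour per process, and that work divides evenly across processes. None of these is guaranteed for MAKER, and the function names and the `bp_per_hour` parameter are illustrative only.

```python
def contig_lengths(fasta_lines):
    """Map each FASTA record name to its sequence length in bp."""
    lengths = {}
    name = None
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # record name up to first whitespace
            lengths[name] = 0
        elif name is not None:
            lengths[name] += len(line)
    return lengths


def rough_wall_hours(lengths, bp_per_hour=700_000, processes=20):
    """Very rough wall-clock estimate in hours.

    Assumes linear scaling with length and perfect load balancing;
    bp_per_hour is an anecdotal guess, not a measured MAKER rate.
    """
    total_bp = sum(lengths.values())
    cpu_hours = total_bp / bp_per_hour
    return cpu_hours / processes


# Stand-in for a ~3 Gb genome; real input would come from the FASTA file.
genome = {"whole_genome": 3_000_000_000}
print(f"{rough_wall_hours(genome):.0f} hours")
```

Under those assumptions, 3 Gb on 20 processes comes out at roughly 214 hours, or about 9 days, so 4 days without visible progress would not by itself prove that the run is stuck. Passing the lines of the real assembly FASTA to `contig_lengths` would also show how large the biggest contigs are, which matters if a single long contig is keeping one process busy.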

Reply by Charles Warden, 17 months ago