Hello,
I am hoping to use MAKER on a small cluster (~6 compute nodes) to annotate a fairly fragmented de novo assembly that has some longer contigs. MAKER is installed and every program runs, but so far RepeatMasker seems to be the only one finding matches. In particular, blastx and exonerate find no alignments even though they appear to be set up correctly in the MAKER control file.
What I am wondering is whether this is an artifact of the fragmented assembly or some sort of setup error. I find the former hard to believe, considering I got at least 2-3 BLAST hits for each longer contig in the entire assembly using Galaxy megablast. I think the problem is that blastx reports 0 hits, but I am not sure why:
Widget::blastx:
/usr/bin/blastx -db /tmp/maker_sHnU1b/chickenproteomeuniprot%2Efasta.mpi.10.9 -query /tmp/maker_sHnU1b/0/scaffold_1035.0 -num_alignments 10000 -num_descriptions 10000 -evalue 1e-06 -dbsize 300 -searchsp 500000000 -num_threads 1 -seg yes -soft_masking true -lcase_masking -show_gis -out /home/zgayk/MakerExample2/Gaviaimmerheader.maker.output/Gaviaimmerheader_datastore/38/7C/scaffold_1035//theVoid.scaffold_1035/0/scaffold_1035.0.chickenproteomeuniprot%2Efasta.blastx.temp_dir/chickenproteomeuniprot%2Efasta.mpi.10.9.blastx
#-------------------------------#
deleted:0 hits
collecting blastx reports
flattening protein clusters
prepare section files
processing the chunk divide
preparing evidence clusters for annotations
Preparing evidence for hint based annotation
clustering transcripts into genes for annotations
Processing transcripts into genes
choosing best annotation set
Choosing best annotations
processing chunk output
processing contig output
examining contents of the fasta file and run log
Essentially, the .gff file produced for each contig is empty. If anyone knows how to fix this, I would be very appreciative.
Zach Gayk
Could you tell us:
- What is your N50?
- What value did you set for the min_contig parameter in maker_opts.ctl?
- What kind of proteins (which database?) are you trying to align to your genome?
- What kind of genome are you trying to annotate? Bird? Fungus?
As noted in maker_opts.ctl, trying to annotate a sequence shorter than 10 kb is often useless.
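If it helps, here is a minimal Python sketch for getting the N50 and the number of contigs at or above a length cutoff from the assembly FASTA. It is only an illustration: the function names and the file name genome.fasta are placeholders, and the 10000 cutoff is taken from the 10 kb figure mentioned above, not from your control file.

```python
def read_lengths(fasta_lines):
    """Return the length of each sequence from an iterable of FASTA lines."""
    lengths = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def count_at_least(lengths, min_len):
    """Number of contigs whose length is at least min_len."""
    return sum(1 for n in lengths if n >= min_len)


def summarize(path, min_len=10000):
    """Read a FASTA file and return (number of contigs, N50, contigs >= min_len)."""
    with open(path) as handle:
        lengths = read_lengths(handle)
    return len(lengths), n50(lengths), count_at_least(lengths, min_len)


# Example call (placeholder file name):
# print(summarize("genome.fasta", min_len=10000))
```

If the third number is small compared to the first, most of the assembly falls below the cutoff.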