Training GeneMarkS to call proteins from virus genomes
5.5 years ago

Hi all,

I am trying to self-train GeneMarkS to improve protein calling from virus genomes and transcripts. Could someone tell me whether what I am doing is correct? First, I downloaded eukaryotic virus genomes from NCBI RefSeq and built a model ("matrix") from them using gmsn.pl:

/fs/project/PAS1117/modules/GeneMarkS/3.36/gmsn.pl -euk --name virusgroup1 --gm /virusgroup1_refseq_genomes.fasta

which generated (among many others) the following model files:

virusgroup1_gm_heuristic.mat virusgroup1_gm.mat virusgroup1_hmm_combined.mod virusgroup1_hmm_heuristic.mod virusgroup1_hmm.mod

Then I used the one named "virusgroup1_gm.mat" to run GeneMark against a single virus genome (which should belong to group 1, so GeneMark should call all of its viral genes correctly):

/fs/project/PAS1117/modules/GeneMarkS/3.36/gm -m group1_gm.mat -l o q -o p -r p -v NC_023420-2.fasta

Nevertheless, I only get a file named "NC_023420-2.fasta.lst" with a few gene coordinates, but no protein file, even though I set the options for that:

List of Open reading frames predicted as CDSs, shown with alternate starts (regions from start to stop codon w/ coding function >0.50)

Left    Right   DNA       Coding   Avg    Start
end     end     Strand    Frame    Prob   Prob

  42    4046    direct    fr 3     0.60   ....
 195    4046    direct    fr 3     0.60   0.79
 297    4046    direct    fr 3     0.60   0.17
 333    4046    direct    fr 3     0.60   0.10
 537    4046    direct    fr 3     0.61   0.06
 570    4046    direct    fr 3     0.60   0.12

List of Regions of interest (regions from stop to stop codon w/ a signal in between)

LEnd    REnd    Strand    Frame

  21    4046    direct    fr 3

Can you guess what is wrong?
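
In the meantime, the coordinates in the .lst file could in principle be turned into protein sequences by hand. Below is a minimal sketch of that workaround, not a GeneMark feature: it assumes the coordinates are 1-based and inclusive, that the standard genetic code applies, and that anything other than "direct" in the strand column means the reverse strand. The function names (read_fasta, revcomp, translate_orf) are made up for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def read_fasta(path):
    """Return the first sequence in a FASTA file as an uppercase string."""
    seq = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if seq:  # stop at the second record
                    break
                continue
            seq.append(line.strip())
    return "".join(seq).upper()


def revcomp(dna):
    """Reverse-complement an uppercase DNA string (A, C, G, T only)."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_orf(genome, left, right, strand="direct"):
    """Translate genome[left..right] (assumed 1-based, inclusive).

    Codons containing characters outside A/C/G/T become 'X'.
    Trailing stop codons are dropped from the returned protein.
    """
    orf = genome[left - 1:right]
    if strand != "direct":  # assumption: any other label means reverse strand
        orf = revcomp(orf)
    codons = (orf[i:i + 3] for i in range(0, len(orf) - 2, 3))
    protein = "".join(CODON_TABLE.get(codon, "X") for codon in codons)
    return protein.rstrip("*")


# Example use with the first alternate start listed above:
# genome = read_fasta("NC_023420-2.fasta")
# print(translate_orf(genome, 195, 4046, "direct"))
```

This only translates whatever interval it is given; it does not choose among the alternate starts, so the start coordinate still has to be picked from the listing (for example by the Start Prob column).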

Thanks in advance, Guillermo

RNA-Seq genemark annotation gene calling • 948 views