Getting Protein Sequence From Glimmer Output
10.4 years ago

Hello All,

I am trying to use Glimmer 3.2 to predict genes in a bacterial genome, and I want to include it in an automated analysis pipeline. After running Glimmer, I found that the program only outputs the gene coordinates and does not produce a FASTA file containing the gene or protein sequences. I could extract the genes from the genome by writing a script that uses the coordinate information, but I was wondering whether there is a Perl module built for this. I know BioPerl has a module for parsing Glimmer output, but I am not sure whether it can also extract gene or protein sequences. Please suggest any existing module for this, or any other easy way.

Thank you very much

Shalabh

10.4 years ago
Pavel Senin ★ 1.9k

I think Biopython is what you want, with code samples here. In addition, for a bacterial genome, you can use Prodigal as an alternative or as a double check; it outputs a set of complete and incomplete ORFs in nucleotide FASTA along with their translations.
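To make the "extract by coordinates" step concrete, here is a minimal sketch in plain Python (standard library only) of what such a script might look like. It is an illustration under stated assumptions, not Glimmer's or Biopython's own code: the function names `extract_gene` and `translate` are made up for this example, coordinates are assumed to be 1-based and inclusive, reverse-strand genes are assumed to be reported with start greater than stop, and the first codon is forced to M when it is one of ATG, GTG or TTG (a common convention for bacterial genes). Genes that wrap around the end of a sequence are not handled here.

```python
# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Assumption: start codons that are read as methionine at position 1.
START_CODONS = {"ATG", "GTG", "TTG"}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_gene(genome, start, stop, strand):
    """Cut a gene out of `genome` using 1-based inclusive coordinates.

    `strand` is positive for the forward strand and negative for the
    reverse strand. Wraparound genes are NOT handled in this sketch.
    """
    lo, hi = min(start, stop), max(start, stop)
    sub = genome[lo - 1:hi]
    return reverse_complement(sub) if strand < 0 else sub


def translate(gene):
    """Translate a gene up to (not including) the first stop codon."""
    gene = gene.upper()
    protein = []
    for i in range(0, len(gene) - 2, 3):
        aa = CODON_TABLE.get(gene[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    if protein and gene[:3] in START_CODONS:
        protein[0] = "M"
    return "".join(protein)


if __name__ == "__main__":
    toy_genome = "CCATGAAATAGGG"  # made-up sequence for illustration
    gene = extract_gene(toy_genome, 3, 11, +1)
    print(gene, translate(gene))
```

If Biopython is available, its `Seq` objects provide `reverse_complement()` and `translate()` methods that could replace the hand-written helpers above.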


Thank you very much for your suggestion, Pavel. I have included Prodigal in my pipeline to verify the results from Glimmer.

10.4 years ago

I think I have figured out how to extract the gene sequences: by running the multi-extract program on the .predict file output by glimmer3. However, I am a bit confused by the information in the .predict file. For a few genes, the start coordinate is higher than the stop coordinate even though the strand is positive. For example:

ID        Contig       Start  Stop  Strand/Frame  Score
orf00001  contig_1202  11     3     +1            16.97

When I used multi-extract to fetch the region between these coordinates, the program fetched the gene region starting from the 11th base through the end of the contig, making the total gene length 234 bases. The program also automatically added the bases for the stop codon at the end of the gene, even though they are actually missing from the end of the contig.

I am not sure whether this is an error in Glimmer's prediction, or what this kind of output means.

Can anyone explain why Glimmer outputs such predictions?

Thank you
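One possible reading of the record above, offered as an assumption rather than a confirmed answer: as far as I know, glimmer3 treats each input sequence as circular by default, so a forward-strand gene with start greater than stop would be one that runs off the end of the contig and continues at its beginning. That would match what multi-extract did here (bases from position 11 to the end, then the first 3 bases appended as the stop codon). The sketch below shows that circular interpretation in plain Python; the function name `extract_circular` and the toy sequence are made up for illustration and are not part of Glimmer.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_circular(seq, start, stop, strand):
    """Extract a gene from `seq`, treating the sequence as circular.

    Coordinates are assumed to be 1-based and inclusive. On the forward
    strand, start > stop is read as a gene that wraps past the end of
    the sequence. On the reverse strand, start < stop is read the same way.
    """
    if strand > 0:
        if start <= stop:
            return seq[start - 1:stop]
        # Wraparound: from `start` to the end, then from base 1 to `stop`.
        return seq[start - 1:] + seq[:stop]
    if start >= stop:
        sub = seq[stop - 1:start]
    else:
        # Wraparound on the reverse strand.
        sub = seq[stop - 1:] + seq[:start]
    return reverse_complement(sub)


if __name__ == "__main__":
    # Toy 16-base "contig": the gene starts at base 11 and its stop
    # codon sits in bases 1-3, mimicking Start=11, Stop=3, Strand=+1.
    toy = "TAGCCCCCCCATGAAA"
    print(extract_circular(toy, 11, 3, +1))
```

For draft contigs, which are linear, such wraparound calls are usually not meaningful. I believe glimmer3 has a `--linear` (`-l`) option that disables wraparound, but please check the output of `glimmer3 -h` for your version before relying on it.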
