Question: Functional Annotations After GeneMark-ES
9 months ago by
brittanymlebert10 wrote:

This may be a stupid question, but I am very new to bioinformatics. I am trying to annotate a novel fungal genome. I just ran it through GeneMark-ES to annotate it and got a .gtf output. When I put this file into Geneious it attaches to my FASTA file and shows me all of the introns and exons on the sequences. My question is, how do I go from this to getting functional annotations? I downloaded Blast2Go and it looks like it only needs the FASTA file to run. If this is true, then why did I need to generate the .gtf file? How can it give me functional annotations without it? Thank you so much in advance!

9 months ago by
VIB, Ghent, Belgium
lieven.sterck9.4k wrote:

You might be mixing up a few genome annotation concepts here.

FASTA files hold the actual sequences. That can be your genome (your initial input FASTA file for GeneMark, for instance), but also CDS, proteins, or any other kind of sequence.

The annotation step in GeneMark is there to give you the locations of genes on your genomic sequence. These are typically provided in formats such as GFF, EMBL, GTF, and so on. These files contain only (or in theory should contain only) the coordinates of features that are present on your genomic sequence. This is also called structural annotation.
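To make "coordinates only" concrete, here is a minimal Python sketch of what one GTF line holds. The example line is made up for illustration (the contig name, source label, and IDs are assumptions, not taken from a real GeneMark-ES run); the nine tab-separated columns follow the general GTF layout.

```python
from typing import NamedTuple


class GtfFeature(NamedTuple):
    seqname: str      # contig/chromosome the feature sits on
    feature: str      # e.g. "CDS", "exon", "start_codon"
    start: int        # 1-based, inclusive
    end: int          # 1-based, inclusive
    strand: str       # "+" or "-"
    frame: str        # "0", "1", "2" or "."
    attributes: dict  # e.g. gene_id, transcript_id


def parse_gtf_line(line: str) -> GtfFeature:
    """Split one GTF line into its coordinate fields and attributes."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    attrs = {}
    for part in cols[8].split(";"):
        part = part.strip()
        if not part:
            continue
        key, _, value = part.partition(" ")
        attrs[key] = value.strip().strip('"')
    return GtfFeature(cols[0], cols[2], int(cols[3]), int(cols[4]),
                      cols[6], cols[7], attrs)


# Hypothetical line, for illustration only.
example = "\t".join([
    "contig_1", "GeneMark.hmm", "CDS", "101", "250", ".", "+", "0",
    'gene_id "1_g"; transcript_id "1_t";',
])
feat = parse_gtf_line(example)
print(feat.seqname, feat.start, feat.end, feat.strand, feat.attributes)
```

Note that there is no sequence anywhere in the record, only a position on a named contig. That is why the GTF is useless without the genome FASTA, and the other way around.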

By combining both of them you can extract from your genomic sequence the actual sequences of genes/CDS which then can be translated into proteins.

Tools for functional annotation (such as Blast2GO) take the protein sequences you predicted, analyse them, and assign potential functions to them. This is the functional annotation.

written 9 months ago by lieven.sterck

Thank you for your response! I see what you mean about the combination of the FASTA and the .gtf, as there are now automatic translations in Geneious under my sequences. What I'm still confused about is what file I put into Blast2GO, because it is just asking for a FASTA file of actual sequences. Where would the protein sequences come into play if it just needs a FASTA file to run?

written 9 months ago by brittanymlebert

Blast2GO needs a FASTA file with protein sequences in it.

FASTA files can contain any kind of sequence, not only DNA sequences. So you need to generate a FASTA file with the protein translations of your predicted genes and put those through Blast2GO.
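As a rough illustration of that step, the following Python sketch joins the CDS segments of each transcript from a GTF, reverse-complements minus-strand genes, translates them, and writes a protein FASTA. It is a sketch under several assumptions, not a replacement for a dedicated tool: every CDS line carries a `transcript_id` attribute, gene models are complete (the CDS starts in frame 0), and the standard genetic code applies (some fungi, e.g. certain yeasts, use an alternative code). The toy genome, the IDs, and all function names are made up for the example.

```python
from collections import defaultdict

# Standard genetic code (NCBI table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def proteins_from_gtf(gtf_lines, genome):
    """Return {transcript_id: protein} from GTF CDS lines and a genome dict."""
    cds_parts = defaultdict(list)
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        # Assumes a transcript_id "..." attribute is present on every CDS line.
        tid = cols[8].split('transcript_id "')[1].split('"')[0]
        cds_parts[tid].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))
    proteins = {}
    for tid, parts in cds_parts.items():
        parts.sort(key=lambda p: p[1])  # genomic order
        # GTF coordinates are 1-based and inclusive.
        seq = "".join(genome[c][s - 1:e] for c, s, e, _ in parts)
        if parts[0][3] == "-":
            seq = reverse_complement(seq)
        proteins[tid] = translate(seq)
    return proteins


def to_fasta(proteins):
    return "".join(f">{name}\n{seq}\n" for name, seq in proteins.items())


# Toy data: one two-exon gene on the + strand, one single-exon gene on -.
genome = {
    "contig_1": "CCATGGCCGTAAGTAAATGACC",
    "contig_2": "GGTTACCACATGG",
}
gtf = [
    "\t".join(["contig_1", "toy", "CDS", "3", "8", ".", "+", "0",
               'gene_id "1_g"; transcript_id "1_t";']),
    "\t".join(["contig_1", "toy", "CDS", "15", "20", ".", "+", "0",
               'gene_id "1_g"; transcript_id "1_t";']),
    "\t".join(["contig_2", "toy", "CDS", "3", "11", ".", "-", "0",
               'gene_id "2_g"; transcript_id "2_t";']),
]
proteins = proteins_from_gtf(gtf, genome)
print(to_fasta(proteins), end="")
```

The resulting text is still just a FASTA file, only with amino acids instead of nucleotides, which is the kind of input described above. For a real genome you would read the genome FASTA and the GTF from disk and check a few translations by hand before trusting the output.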

written 9 months ago by lieven.sterck

Thank you very much! I understand now. I have been looking for a simple way to get a FASTA of the protein translations from a genomic sequence FASTA and a .gtf file. Do you have any suggestions?

written 9 months ago by brittanymlebert

What have you found so far?

bedtools getfasta must have come up in your search results, no?

modified 9 months ago • written 9 months ago by lieven.sterck

I actually found a way to just download the translated sequences from Geneious, which was really easy. Thank you for all of your help. I really appreciate it!

written 9 months ago by brittanymlebert