Functional Annotations After GeneMark-ES
4.1 years ago

This may be a stupid question, but I am very new to bioinformatics. I am trying to annotate a novel fungal genome. I just ran it through GeneMark-ES to annotate it and got a .gtf output. When I put this file into Geneious, it attaches to my FASTA file and shows me all of the introns and exons on the sequences. My question is, how do I go from this to getting functional annotations? I downloaded Blast2GO and it looks like it only needs the FASTA file to run. If this is true, then why did I need to generate the .gtf file? How can it give me functional annotations without it? Thank you so much in advance!

4.1 years ago

You might be mixing up a few (genome annotation) concepts here.

The FASTA files hold the actual sequences: on the one hand your genome (i.e. your initial input FASTA file for GeneMark, for instance), and on the other hand CDS, proteins, and any other kind of sequence.

The annotation process of GeneMark is there to provide you with the locations of genes on your genomic sequence. These are typically provided in formats such as GFF, EMBL, GTF, etc. These files contain only (or should, in theory) the coordinates of features that are present on your genomic sequence (this is also called structural annotation).
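To make the "coordinates only" point concrete, here is a minimal sketch of reading one GTF line in Python. The example line (contig name, source label, IDs) is made up for illustration and is not taken from any real GeneMark-ES output; the nine tab-separated columns and the 1-based, inclusive coordinates are standard GTF.

```python
def parse_gtf_line(line):
    """Split one GTF line into its nine tab-separated columns."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    seqname, source, feature, start, end, score, strand, frame, attrs = cols

    # Column 9 holds 'key "value";' pairs such as gene_id and transcript_id.
    attributes = {}
    for item in attrs.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition(" ")
        attributes[key] = value.strip().strip('"')

    return {
        "seqname": seqname,
        "feature": feature,
        "start": int(start),  # 1-based, inclusive
        "end": int(end),      # 1-based, inclusive
        "strand": strand,
        "frame": frame,
        "attributes": attributes,
    }


# Hypothetical example line, for illustration only.
example = (
    'contig_1\tGeneMark.hmm\tCDS\t101\t250\t.\t+\t0\t'
    'gene_id "1_g"; transcript_id "1_t";'
)
record = parse_gtf_line(example)
print(record["feature"], record["start"], record["end"], record["strand"])
```

Note that there is no sequence anywhere in the record: the line only says where on `contig_1` a CDS segment sits.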

By combining both of them you can extract from your genomic sequence the actual sequences of genes/CDS which then can be translated into proteins.
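As a rough sketch of that combination step, the plain-Python example below joins the CDS segments of each transcript, reverse-complements minus-strand genes, and translates with the standard genetic code. It rests on several assumptions: the genome is already loaded as a dict of contig name to sequence, CDS lines carry a `transcript_id` attribute, each CDS starts in frame at its start codon (the frame column is ignored), and the standard codon table applies to your organism. The toy genome and GTF lines are invented for the example; for real data a dedicated tool is the safer choice.

```python
from collections import defaultdict

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(cds):
    """Translate codon by codon, stopping at the first stop codon."""
    cds = cds.upper()
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def proteins_from_gtf(genome, gtf_lines):
    """genome: dict of contig name -> sequence; gtf_lines: iterable of GTF lines."""
    cds_parts = defaultdict(list)
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        if line.startswith("#") or len(cols) != 9 or cols[2] != "CDS":
            continue
        transcript = cols[8].split('transcript_id "')[1].split('"')[0]
        cds_parts[transcript].append((cols[0], int(cols[3]), int(cols[4]), cols[6]))

    proteins = {}
    for transcript, parts in cds_parts.items():
        parts.sort(key=lambda p: p[1])  # genomic order
        # GTF is 1-based inclusive, Python slices are 0-based half-open.
        seq = "".join(genome[c][s - 1:e] for c, s, e, _ in parts)
        if parts[0][3] == "-":
            seq = reverse_complement(seq)
        proteins[transcript] = translate(seq)
    return proteins


# Toy data, invented for illustration.
genome = {
    "c1": "GGATGGCCGTAGAAATAACC",  # two CDS segments split by an intron
    "c2": "ATCAAAACATA",           # one gene on the minus strand
}
gtf = [
    'c1\ttoy\tCDS\t3\t8\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'c1\ttoy\tCDS\t13\t18\t.\t+\t0\tgene_id "g1"; transcript_id "t1";',
    'c2\ttoy\tCDS\t2\t10\t.\t-\t0\tgene_id "g2"; transcript_id "t2";',
]
proteins = proteins_from_gtf(genome, gtf)
print(proteins)
```

The important detail is that the segments are joined before translation; translating each exon on its own would give broken peptides whenever an intron splits a codon.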

Tools for functional annotation (such as Blast2GO) use the protein sequences you predicted to analyse and assign potential functions to them (this is functional annotation).


Thank you for your response! I see what you mean about the combination of the FASTA and the .gtf, as there are now automatic translations in Geneious under my sequences. What I'm still confused about is what file I put into Blast2GO, because it is just asking for a FASTA file of actual sequences. Where would the protein sequences come into play if it just needs a FASTA file to run?


Blast2GO needs a FASTA file with protein sequences in it.

FASTA files can contain any kind of sequence, not only DNA sequences. So you need to generate a FASTA file with the protein translations of your predicted genes and put those through Blast2GO.
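For illustration, a protein FASTA file has exactly the same layout as a nucleotide one: a `>` header line followed by the sequence, here in amino-acid letters. A minimal sketch of writing one in Python, assuming the translations are already in a dict (the names and sequences below are made up):

```python
import io


def write_fasta(records, handle, width=60):
    """Write name -> sequence pairs as FASTA, wrapping lines at `width`."""
    for name, seq in records.items():
        handle.write(f">{name}\n")
        for i in range(0, len(seq), width):
            handle.write(seq[i:i + width] + "\n")


# Hypothetical protein translations, for illustration only.
proteins = {"1_t": "MAKLV", "2_t": "MF"}

buf = io.StringIO()  # stands in for open("proteins.fasta", "w")
write_fasta(proteins, buf, width=3)
fasta_text = buf.getvalue()
print(fasta_text, end="")
```

In practice you would pass a real file handle instead of the `StringIO` buffer and keep the default line width.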


Thank you very much! I understand now. I have been researching a simple way to just get a fasta of the protein translations from a genomic sequence fasta and a .gtf file. Do you have any suggestions?


What have you found so far?

bedtools getfasta must have come up in your search results, no?


I actually found a way to just download the translated sequences from Geneious, which was really easy. Thank you for all of your help. I really appreciate it!
