Question: Finding GO Terms For Viral Annotated Genome
10.6 years ago by
European Union
Haji50 wrote:

Hello all. I'm trying to find the GO terms for a list of viral gene names originating from different genomes. I cannot think of an easy, straightforward way to do so... Is it possible that the only way to do this search is to BLAST the gene sequences and then re-assemble the list? Thanks for your attention. H.

10.6 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum133k wrote:

If those names are valid accession numbers, see my previous solution using XSLT + NCBI/Gene.
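For readers who prefer Python to XSLT, here is a minimal sketch of the same idea. It is not the XSLT solution referred to above. It builds an NCBI E-utilities efetch URL for the Gene database and pulls GO entries out of the returned XML. The function names `efetch_url` and `go_terms` are made up for illustration, and the element layout (`Other-source` / `Dbtag_db` / `Object-id_id` / `Other-source_anchor`) is my assumption about the Entrezgene XML, so check it against a real record before relying on it.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(gene_ids):
    """Build an efetch URL for one or more NCBI Gene IDs (hypothetical helper)."""
    params = {"db": "gene", "id": ",".join(str(g) for g in gene_ids), "retmode": "xml"}
    return EFETCH + "?" + urlencode(params)


def go_terms(xml_text):
    """Return (GO id, term name) pairs found in an Entrezgene XML document.

    Assumption: GO cross-references appear as <Other-source> elements whose
    <Dbtag_db> is "GO", with the numeric id in <Object-id_id> and the term
    name in <Other-source_anchor>.
    """
    root = ET.fromstring(xml_text)
    terms = []
    for src in root.iter("Other-source"):
        if src.findtext("Other-source_src/Dbtag/Dbtag_db") != "GO":
            continue
        go_id = src.findtext("Other-source_src/Dbtag/Dbtag_tag/Object-id/Object-id_id")
        if go_id is None:
            continue
        # GO identifiers are zero-padded to seven digits
        terms.append((f"GO:{int(go_id):07d}", src.findtext("Other-source_anchor")))
    return terms
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response text to `go_terms` gives the annotations for each gene, if any exist for it.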

10.6 years ago by
Manhattan, NY
Khader Shameer18k wrote:

As suggested by Pierre, if you have the accession numbers, you can retrieve the GO annotations directly. If no GO annotations are available for your genes, you could use sequence-based tools to make indirect GO associations. You could try the annotation tools provided here.

A related question is here.

7.2 years ago by
United States
pld4.9k wrote:

As far as I know there are no GO term annotations for viral genomes. Some viruses, such as the orthopoxviruses, have many host gene homologs, which means you can use BLAST to assign GO terms to those genes. Genes that lack any annotated homologs will be left unannotated. A word of warning: do not confuse highly homologous domains with homologs. You'll want to ensure that there is a sufficiently high level of identity across the length of the gene. A nearly identical CDS is one thing, but inferring function from a small but conserved domain is asking for trouble. Also use tblastn, since it compares at the amino acid sequence level. Codon usage bias (CUB) is typically very different between the virus and the reference species, enough that a nucleotide-level search can easily miss things.
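To make the identity-across-the-length point concrete, here is a rough sketch of a filter over BLAST tabular output. It assumes the search was run with a custom tabular format, `-outfmt "6 qseqid sseqid pident qstart qend qlen evalue"`, so that query coverage can be computed per hit. The function name `filter_hits` and the thresholds (40% identity, 80% of the query covered) are placeholders I picked for illustration, not recommended values.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    query: str
    subject: str
    pident: float    # percent identity over the aligned region
    coverage: float  # fraction of the query spanned by the alignment


def filter_hits(lines: Iterable[str],
                min_identity: float = 40.0,   # illustrative threshold
                min_coverage: float = 0.8     # illustrative threshold
                ) -> List[Hit]:
    """Keep hits that are similar across most of the query, not just one domain.

    Assumed columns (tab-separated): qseqid sseqid pident qstart qend qlen evalue
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        qseqid, sseqid, pident, qstart, qend, qlen = line.split("\t")[:6]
        # Span of the query covered by this alignment, as a fraction of its length
        coverage = (abs(int(qend) - int(qstart)) + 1) / int(qlen)
        if float(pident) >= min_identity and coverage >= min_coverage:
            kept.append(Hit(qseqid, sseqid, float(pident), coverage))
    return kept
```

A hit with 90% identity that only spans a 50-residue domain of a 400-residue protein is dropped by the coverage check, which is the case the warning above is about.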

Even then you have to remember that even when viral genes are highly homologous to host genes, it is very hard to say that they have the same function. They might interact with the same things, but the temporal and spatial features of those interactions are totally different between the virus and host homologs. The viral gene could have totally different localization and therefore would be open to interact with things the host gene would never see.

I'm curious to know what your intended goal is here. If you are simply interested in the functions of the viral genes in terms of viral natural history, I would just grind through the literature and make your own annotations. Within the virus (i.e. not interacting with host genes), each gene will have a very limited and concise function. If you're looking to include what host genes do, I would suggest you compile a list of the host genes each viral gene interacts with. From there you could place a sign on each interaction (activates/upregulates/etc. vs. disables/deactivates/etc.). For example, CPXV prevents trafficking of MHC-1, so that interaction gets a negative sign (disables). MHC-1 presents antigens. So CPXV203 is involved in the disabling of MHC-1.
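The signed interaction list could be kept in something as simple as the sketch below. The `Interaction` record and the `describe` helper are hypothetical names, and the only data in it is the CPXV203 / MHC-1 example from the paragraph above.

```python
from typing import NamedTuple


class Interaction(NamedTuple):
    viral_gene: str
    host_target: str
    sign: int            # +1 = activates/upregulates, -1 = disables/deactivates
    host_function: str   # what the host component normally does


def describe(interaction: Interaction) -> str:
    """Turn a signed interaction into a one-line functional annotation."""
    verb = "activation" if interaction.sign > 0 else "disabling"
    return (f"{interaction.viral_gene} is involved in the {verb} of "
            f"{interaction.host_target} ({interaction.host_function})")


# Hand-curated from the literature, one row per viral gene / host target pair
interactions = [
    Interaction("CPXV203", "MHC-1", -1, "antigen presentation"),
]

for row in interactions:
    print(describe(row))
```

The annotation for a viral gene is then whatever falls out of its rows: which host components it touches, in which direction, and what those components do.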

There really isn't a simple way to do this. Aside from the few obvious roles that genes play in the virus life cycle, everything will be defined in terms of which host components a gene interacts with and how it interacts with them. I would argue that GO via BLAST isn't the best way to provide functional annotations for viral genes.
