Question: Annotating genes with a function
2.5 years ago, nash.claire wrote:

Hi all,

I have a question regarding gene ontology. I have ChIP-seq data that I have compared with RNA-seq expression data, which has given me a list of candidate genes I am interested in pursuing. Next, I'd like to annotate each gene in the list with a molecular function. For example, I'd like to end up with something like this:

Gene     Top Function
GeneA    Transcription factor
GeneB    Secreted protein

I've done a lot of searching, and there are so many gene ontology tools out there that it's hard to know which one to choose. They all seem to take a list of genes and group them by the most represented functions, but that is not really what I'm after. I want an annotation for each gene in the list, if possible. I also find that a lot of the tools return terms that are not useful, such as "binding" (I mean, what does that even mean!!!). I don't know if DAVID can do this sort of thing, but I find it not very user friendly, and from what I can gather it is also quite out of date now.

Anyone have any ideas??

rna-seq chip-seq • 866 views

Very helpful description, thank you Sukhdeep. Just to update though, I found a way to more or less get what I'm after with BioMart. Using BioMart I can select fields such as gene type and GO term name/accession for each gene in my list. I'm going to start with this and, as you suggest, try other tools to get a feel for the data. Essentially, what I want to do is pull out all the transcription factors and all the secreted proteins from my list for further analysis. If there is a simple way of doing that, I'm all ears!
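One way to go from a per-gene GO export to the two categories mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the tab-separated layout, the column names and the gene rows are made up, and the choice of GO:0003700 (DNA-binding transcription factor activity) and GO:0005576/GO:0005615 (extracellular region/space) as category markers is a rough proxy, since an extracellular annotation does not prove that a protein is secreted.

```python
import csv
import io
from collections import defaultdict

# Hypothetical BioMart-style export: one row per (gene, GO term) pair.
# Column names and gene rows are invented for illustration.
EXPORT = """gene\tgene_type\tgo_id\tgo_name
GeneA\tprotein_coding\tGO:0003700\tDNA-binding transcription factor activity
GeneA\tprotein_coding\tGO:0005634\tnucleus
GeneB\tprotein_coding\tGO:0005576\textracellular region
GeneC\tprotein_coding\tGO:0005515\tprotein binding
"""

# GO accessions used as markers for each category (an assumption;
# extend or tighten these sets to suit your own definition).
CATEGORIES = {
    "Transcription factor": {"GO:0003700"},
    "Secreted protein": {"GO:0005576", "GO:0005615"},
}


def label_genes(tsv_text):
    """Return {gene: [category, ...]} based on the GO IDs listed per gene."""
    go_by_gene = defaultdict(set)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        go_by_gene[row["gene"]].add(row["go_id"])

    labels = {}
    for gene, go_ids in go_by_gene.items():
        # A gene gets a category if it carries any of that category's GO IDs.
        hits = [name for name, ids in CATEGORIES.items() if go_ids & ids]
        labels[gene] = hits or ["Unclassified"]
    return labels


labels = label_genes(EXPORT)
for gene, cats in sorted(labels.items()):
    print(f"{gene}\t{', '.join(cats)}")
```

Matching on exact accessions keeps the sketch simple but misses genes annotated only with more specific child terms, so a real analysis would probably need to include those as well.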

Reply written 2.5 years ago by nash.claire
2.5 years ago, Sukhdeep Singh wrote:

GO analysis is a fast-evolving field that is not self-sufficient: it depends on input from many different experiments and on how each one annotates a gene and its attributes. In the past there were no firm rules for annotating the function of a gene, which is why you see genes associated with multiple terms, some of them very ambiguous. Many tools have tried over time to come up with different solutions; you can check the existing questions about GO analysis on Biostars itself to get a flavour of that.

So, to answer your question, there is no single straightforward method for performing a GO analysis; results vary between tools and can often lead to different interpretations. I would therefore recommend using multiple engines to get a flavour of what you are after, and manually removing or hiding child or parent terms that are redundant. You could also use a tool like DAVID or PANTHER and export the list to REVIGO, which is really nice for summarizing, exploring and hiding child terms under their parent categories. Another good way of sorting GO terms is by LOD score, which prioritizes specific terms over very common ones.

In R, you can also look at clusterProfiler's enrichGO function (its result is conventionally stored in a variable named ego in the package documentation), which works out the over-represented terms and may also work for you.
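To make "over-represented" concrete, enrichment tools of this kind typically ask how likely it is to see at least the observed number of genes with a given term in your list by chance, using a hypergeometric test. The sketch below illustrates that idea only; it is not clusterProfiler's code, the function name and the numbers are made up, and real tools also correct for testing many terms at once.

```python
from math import comb


def enrichment_p_value(population, annotated, sample, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    population: number of genes in the background
    annotated:  background genes carrying the GO term
    sample:     number of genes in your list
    overlap:    genes in your list carrying the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Made-up example: 20 of 100 candidate genes carry a term that
# 1500 of 20000 background genes carry.
p = enrichment_p_value(population=20000, annotated=1500, sample=100, overlap=20)
print(f"p = {p:.3g}")
```

A small p-value says the term is more common in the list than expected by chance, which is a statement about the list as a whole and not an annotation of any individual gene.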

Other pointers: 
Best Way To Do Pathway Analysis Of A Set Of Genes?

Hiding/Merging Child Annotations Terms Under The Parents [Gene Ontology]

Tools To Find Gene Ontology Term Enrichment

Pointers To Learn About Functional Enrichment And Go Analysis

Gene Ontology Categories

Simple Go Analysis For Gene Expression Microarrays

Go Analysis Web Services?
