GENE ONTOLOGY FOR TOMATO-ITAG4.1 OR 4
2.1 years ago
anusha ▴ 10

Hi all, I was doing Gene Ontology (GO) enrichment analysis and realised that the software I was using is based on a different genome build, SL3.0 instead of SL4.0. As a result, many genes are missing annotations. I am not able to find a GO annotation file for ITAG4.1 or the SL4.0 build. Could you let me know how to generate such a file, or point me to a software tool that uses the new tomato genome version for GO enrichment analysis?

BUILD GO ITAG4.1 SL4.0 • 2.0k views
22 months ago
ar14g12 ▴ 10

Hi, is this what you're looking for: https://solgenomics.net/ftp/tomato_genome/annotation/ITAG4.1_release/

14 months ago
Tobias • 0

Hey there!

So I was having exactly the same issue as you. The differences in assembly/annotation versions across tools are horrendous when working with tomato.

You can perform GO enrichment with a custom GO annotation file using the enricher() function from the clusterProfiler package. Instead of an annotation database, it takes TERM2GENE and TERM2NAME objects. Each should be a data frame with 2 columns: GO terms in column 1, and in column 2 either a gene/locus ID (for TERM2GENE) or a GO description (for TERM2NAME).
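
To make the expected layout concrete, here is a minimal sketch with toy data. The locus IDs and the gene-to-term assignments are invented for illustration (they are not taken from any real ITAG4.1 annotation), and the column names are just an assumption; as far as enricher() is concerned, what matters is the column order.

```r
# Toy mapping, for illustration only: these gene-to-GO assignments are invented.
# TERM2GENE: GO term in column 1, gene/locus ID in column 2 (one row per pair)
term2gene <- data.frame(
  GOterm  = c("GO:0006412", "GO:0006412", "GO:0015979"),
  LocusID = c("Solyc01g000001.1", "Solyc01g000002.1", "Solyc01g000003.1"),
  stringsAsFactors = FALSE
)

# TERM2NAME: GO term in column 1, description in column 2 (one row per term)
term2name <- data.frame(
  GOterm = c("GO:0006412", "GO:0015979"),
  GOdesc = c("translation", "photosynthesis"),
  stringsAsFactors = FALSE
)

# Sanity check: every term used in term2gene has a description in term2name
stopifnot(all(term2gene$GOterm %in% term2name$GOterm))
```

Data frames shaped like these can be passed to enricher() as TERM2GENE and TERM2NAME.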

I have taken the liberty of writing a simple function that performs the analysis. It takes a data frame with the required GO mappings (provided below):

https://github.com/Tobias-deWerk/GOenrichment/blob/main/gene-to-GO-ALL.csv

library(clusterProfiler)

# Run GO enrichment against a custom gene-to-GO mapping.
# gene_to_GO needs the columns GOterm, LocusID, GOdesc and Ontology.
GOEnrichment <- function(genes_of_interest, universe, gene_to_GO, ontology = 'BP'){

     # Keep a single ontology unless all of them are requested
     if (ontology != 'ALL') {
          gene_to_GO <- gene_to_GO[gene_to_GO$Ontology == ontology, ]
     }

     # enricher() expects the GO term in column 1 and the gene or description in column 2
     TERM2GENE <- gene_to_GO[, c('GOterm', 'LocusID')]
     # Keep only the first 16 characters of each locus ID; the IDs in
     # genes_of_interest and universe must match this trimmed form
     TERM2GENE$LocusID <- substr(TERM2GENE$LocusID, start = 1, stop = 16)
     TERM2NAME <- gene_to_GO[, c('GOterm', 'GOdesc')]

     results <- clusterProfiler::enricher(gene = genes_of_interest,
                                          universe = universe,
                                          TERM2GENE = TERM2GENE,
                                          TERM2NAME = TERM2NAME,
                                          pAdjustMethod = 'fdr',
                                          pvalueCutoff = 0.1,
                                          minGSSize = 7,
                                          maxGSSize = 500)

     return(results)
}

The results can be obtained by calling the function:

results <- GOEnrichment(genes_of_interest = "A vector of your genes of interest",
                    universe = "A vector containing all genes in your data",
                    gene_to_GO = "The gene to GO conversion mapping provided above",
                    ontology = "Biological process (BP) / Molecular Function (MF) / Cellular Component (CC) / All (ALL)")
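
Because the function trims LocusID to its first 16 characters, the IDs in both input vectors need the same form, otherwise they will not match the mapping. Below is a rough sketch of preparing the two vectors, assuming your own IDs carry a longer suffix. Note that trim_locus_ids() is a hypothetical helper (not part of clusterProfiler or of the function above) and the IDs are toy values.

```r
# Hypothetical helper: trim IDs the same way GOEnrichment() trims LocusID
# (first 16 characters), then drop missing, empty and duplicated entries.
trim_locus_ids <- function(ids) {
  ids <- substr(as.character(ids), start = 1, stop = 16)
  unique(ids[!is.na(ids) & nzchar(ids)])
}

# Toy IDs, for illustration only
all_genes <- c("Solyc01g000001.1.1", "Solyc01g000002.1.1",
               "Solyc01g000002.1.2", "Solyc01g000003.1.1")
de_genes  <- c("Solyc01g000001.1.1", "Solyc01g000002.1.2")

universe          <- trim_locus_ids(all_genes)
genes_of_interest <- trim_locus_ids(de_genes)

# Every gene of interest should also be part of the universe
stopifnot(all(genes_of_interest %in% universe))

# These vectors can then be passed on, e.g.:
# results <- GOEnrichment(genes_of_interest, universe, gene_to_GO, ontology = 'BP')
```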

The results object can be treated like any other clusterProfiler result, e.g. by printing it:

> results

or making a dotplot

dotplot(results)

Hello Tobias. Thank you very much for your help. Just two questions:

  • How did you generate that table?
  • Is it possible to generate a KEGG table?