1 feasible gene out of 33,602 available genes in topGOdata object
17 months ago
liyong ▴ 80

Hello all,

I am trying to use topGO in R to do GO analysis. For genelist, I collect all Arabidopsis genes from (https://www.arabidopsis.org/download_files/Genes/TAIR10_genome_release/TAIR10_gene_lists/TAIR10_representative_gene_models) to define the gene universe.

Then I run GOdata = new("topGOdata", ontology = "BP", allGenes = genelist, annot = annFUN.org, mapping = "org.At.tair.db"), which gives me an error: Nothing to do: Error in split.default(names(sort(nl)), f.index) : first argument must be a vector. After googling around, I found this thread (https://support.bioconductor.org/p/132621/). Following its suggestion, I added ID = "symbol" and re-ran GOdata = new("topGOdata", ontology = "BP", allGenes = genelist, annot = annFUN.org, mapping = "org.At.tair.db", ID = "symbol"), which runs without the error.

However, when I check GOdata, the summary (originally posted as a screenshot) shows ONLY 1 feasible gene out of 33,602 available genes, which is quite weird.
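A near-zero feasible-gene count often points to a mismatch between the ID format of the gene universe and the format the annotation package expects, so it can help to check what the universe IDs look like before building the topGOdata object. The following is a minimal sketch in base R, assuming Arabidopsis locus IDs look like AT4G37770 and transcript (gene model) IDs carry a trailing .N suffix; the helper name classify_agi_ids and the example IDs are hypothetical and not part of topGO.

```r
# Hypothetical helper: classify Arabidopsis IDs by format (base R only).
classify_agi_ids <- function(ids) {
  locus_pat      <- "^AT[1-5CM]G[0-9]{5}$"           # e.g. AT4G37770
  transcript_pat <- "^AT[1-5CM]G[0-9]{5}\\.[0-9]+$"  # e.g. AT4G37770.1
  kind <- ifelse(grepl(locus_pat, ids), "locus",
          ifelse(grepl(transcript_pat, ids), "transcript", "other"))
  # Fix the level order so every category is reported, even when its count is 0.
  table(factor(kind, levels = c("locus", "transcript", "other")))
}

# Illustrative IDs only; in practice pass names(genelist).
example_ids <- c("AT4G37770.1", "AT1G01010.1", "AT2G01021", "ATCG00010.1")
counts <- classify_agi_ids(example_ids)
print(counts)
```

If most IDs fall into the "transcript" or "other" category, the universe is probably not keyed by locus ID, which would be worth ruling out before trying other ID settings.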

After going through the tutorial (https://www.bioconductor.org/packages/devel/bioc/vignettes/topGO/inst/doc/topGO.pdf), I found that the annFUN.org function uses the mappings from the "org.XX.XX" annotation packages. Currently, the function supports the following gene identifiers: Entrez, GenBank, Alias, Ensembl, Gene Symbol, GeneName and UniGene.

So I tried these different gene identifiers, e.g. with "Ensembl": GOdata = new("topGOdata", ontology = "BP", allGenes = genelist, annot = annFUN.org, mapping = "org.At.tair.db", ID = "Ensembl"). However, I got another error: Building most specific GOs ..... Error: no such table: ensembl.

Can anyone help me troubleshoot this? Any suggestions will be appreciated.

Thanks a lot!

topGO GO • 509 views
17 months ago
liyong ▴ 80

The problem turns out to be the format of my gene IDs.

The gene IDs I used contain transcript numbers (e.g. "AT4G37770.1"), so they are really transcript IDs, I think. After I removed the transcript numbers (e.g. changed "AT4G37770.1" to "AT4G37770"), everything looks good for now.
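The suffix removal described above can be sketched in base R as follows. This is only an illustration, assuming the transcript number is always a trailing ".<digits>"; the example IDs and the interesting set are made up, and the 0/1 factor follows the usual pattern for a topGO gene universe (a named factor marking genes of interest).

```r
# Example IDs in the transcript (gene model) format from the question.
transcript_ids <- c("AT4G37770.1", "AT1G01010.1", "AT1G01010.2", "AT2G01021")

# Drop a trailing ".<digits>" suffix; IDs without one are left unchanged.
# unique() collapses several transcripts of the same locus into one entry.
locus_ids <- unique(sub("\\.[0-9]+$", "", transcript_ids))

# Hypothetical set of genes of interest, used only to build the factor.
interesting <- c("AT4G37770")

# Named 0/1 factor: names are locus IDs, 1 marks a gene of interest.
genelist <- factor(as.integer(locus_ids %in% interesting), levels = c(0, 1))
names(genelist) <- locus_ids

print(genelist)
```

A genelist built this way could then be passed as allGenes in the new("topGOdata", ...) call from the question.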
