Question: topGO: how to select specific genes from the background for the analysis
3 months ago by
Oli0 wrote:

Hello community,

I need some help getting topGO to do what I need. I have intersected the differentially expressed genes from my RNA-seq experiment with a list of genes I'm interested in for other reasons. When I create the topGOdata object, I have trouble with the selection function. For the whole list of differentially expressed genes, I select only those with a p-value < 0.05. What if I want to perform the enrichment analysis on the genes from the second list, using the differentially expressed genes from my experiment as the background?

My code is this:

background <- geneuniverse$PValue
names(background) <- geneuniverse$ensembl_gene_id
selection <- function(allScore){ return(allScore < 0.05)}
GoData <- new("topGOdata",
              ontology = "BP",
              allGenes = background,
              geneSel = selection,
              annot =,
              mapping = "", 
              ID = "ensembl",
              nodeSize = 3)

If I read my genes of interest into a vector by doing:

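# scan() has no 'header' argument; skip the header line and read the IDs as character
interesting_genes <- scan('interesting_genes.txt', what = character(), skip = 1)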

How can I get TRUE/FALSE for matches between interesting genes and background? I have already tried match, %in% and grepl, but geneSel can only be a function. Any help?

I don't know if it's of any help, but here's how the data look:

ENSG00000183242 ENSG00000248347 ENSG00000172482 ENSG00000120068 ENSG00000137558 
   1.81e-15        1.15e-08        5.59e-09        5.33e-07        4.86e-06      
[1] "ENSG00000114779" "ENSG00000143322" "ENSG00000123130" "ENSG00000103740" "ENSG00000164398"
12 weeks ago by
Germany, Heidelberg
e.rempel1000 wrote:

Hi Oli,

You could specify allGenes as a named factor whose length equals the number of significant genes in the background. The factor is 1 if the gene is also in interesting_genes and 0 otherwise. With a factor, no geneSel function is needed.

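# Gene universe: only the significant DE genes from the background
background_genes <- names(background)[background <= 0.05]
# 1 if the gene is also in interesting_genes, 0 otherwise
int.genes <- factor(as.integer(background_genes %in% interesting_genes))
names(int.genes) <- background_genes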
GoData <- new("topGOdata",
          ontology = "BP",
          allGenes = int.genes,
          annot =,
          mapping = "", 
          ID = "ensembl",
          nodeSize = 3)

This way you simply limit the gene universe to background_genes, and the genes of interest within it are the ones tested for enrichment.
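
To check the factor before handing it to topGO, here is a minimal base-R sketch on toy data. The gene IDs are taken from the post, but the pairing of IDs with p-values and the last two p-values are made up for illustration. The explicit `levels = c(0, 1)` is my own addition: as far as I know, topGO expects a two-level factor, so this guards against the case where no gene (or every gene) matches.

```r
# Toy stand-ins for the objects in the thread (last two p-values are hypothetical)
background <- c(ENSG00000183242 = 1.81e-15,
                ENSG00000248347 = 1.15e-08,
                ENSG00000172482 = 5.59e-09,
                ENSG00000114779 = 2.0e-03,  # hypothetical p-value
                ENSG00000143322 = 0.40)     # hypothetical, not significant
interesting_genes <- c("ENSG00000114779", "ENSG00000143322", "ENSG00000123130")

# Gene universe: significant DE genes only
background_genes <- names(background)[background <= 0.05]

# 1 = gene of interest, 0 = other significant gene; both levels always present
int.genes <- factor(as.integer(background_genes %in% interesting_genes),
                    levels = c(0, 1))
names(int.genes) <- background_genes

# Sanity checks before building the topGOdata object
print(table(int.genes))
stopifnot(nlevels(int.genes) == 2, any(int.genes == 1))
```

Note that a gene of interest that is not significant in the experiment (ENSG00000143322 in this toy example) drops out of the universe entirely, as does one that is absent from the background.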
