Question: Is there a way to get the name or IDs of genes with the goseq table?
0
gravatar for unawaz
9 months ago by
unawaz50
Australia
unawaz50 wrote:

Hi all,

I've performed a GO enrichment analysis using goseq. The output of goseq() lists the enriched categories and how many significant genes fall in each one, but it doesn't give the IDs of the genes in those categories.

Obviously you can use getgo() to retrieve the GO terms for genes of interest, but is there a way to get a column of the genes in each of the enriched GO terms using the goseq() function?

So my desired output would be:

category    over_represented_pvalue  under_represented_pvalue  numDEInCat  numInCat  term        ontology  Ensembl ID
GO:0000786  2.143408e-16             1                         12          43        nucleosome  CC        ENSG00000112655, ENSG00000158483, etc.

The "Ensembl ID" column is the one I'm looking to add.

Any help would be greatly appreciated!

rna-seq goseq gene ontology • 801 views
modified 9 months ago by magicpants60 • written 9 months ago by unawaz50
magicpants60 wrote, 9 months ago:

Yes, please refer to this post: https://support.bioconductor.org/p/102273/

For my own case, I modified the function:

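library(goseq)  # provides getgo()

# Get the lists of DE genes counted in "numDEInCat" in the GO.wall report
getGeneLists <- function(pwf, goterms, genome, ids){
  # Map each gene in the pwf to its GO categories
  gene2cat <- getgo(rownames(pwf), genome, ids)
  # Invert the mapping: GO category -> genes annotated to it
  cat2gene <- split(rep(names(gene2cat), lengths(gene2cat)),
                    unlist(gene2cat, use.names = FALSE))
  out <- list()
  for(term in goterms){
    # Rows of pwf for the genes in this category
    tmp <- pwf[cat2gene[[term]],]
    # Keep only the differentially expressed genes
    tmp <- rownames(tmp[tmp$DEgenes > 0, ])
    out[[term]] <- tmp
  }
  out
}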
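
The step that inverts the gene-to-category mapping can be tried on its own without goseq. This is a minimal sketch in base R; the gene names and GO IDs are invented stand-ins for what getgo() would return, not real annotations:

```r
# Toy stand-in for the list returned by getgo(): gene -> GO terms.
# All identifiers here are made up for illustration.
gene2cat <- list(
  geneA = c("GO:0000001", "GO:0000002"),
  geneB = c("GO:0000002"),
  geneC = NULL                      # a gene with no GO annotation
)

# Repeat each gene name once per GO term it carries, then group by term.
cat2gene <- split(rep(names(gene2cat), lengths(gene2cat)),
                  unlist(gene2cat, use.names = FALSE))

cat2gene[["GO:0000002"]]
# [1] "geneA" "geneB"
```

Genes without any annotation (geneC here) contribute zero repeats, so they drop out of the inverted list.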

This returns a list of the GO terms in my GO.wall report with their associated Ensembl IDs:

goterms <- GO.wall$category 
goList <- getGeneLists(pwf, goterms, "hg19", "ensGene")

> head(goList, 1)
$`GO:0140014`
 [1] "ENSG00000040275" "ENSG00000117724" "ENSG00000198901" "ENSG00000156970"
 [5] "ENSG00000186185" "ENSG00000143228" "ENSG00000090889" "ENSG00000125538"
 [9] "ENSG00000158402" "ENSG00000175063" "ENSG00000121152" "ENSG00000169679"

If you want to add the additional column to your report, you can do this:

# lapply() always returns a list; sapply() could simplify when all terms have equally many genes
GO.wall$EnsemblID <- lapply(GO.wall$category, function(x) goList[[x]])

GO.wall$EnsemblID will be a list column, with one character vector of gene IDs per GO term, but you can convert it to another type when you use it.
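
One possible conversion, matching the comma-separated layout asked for in the question, is to collapse each gene vector into a single string. This is a sketch in base R; the GO.wall and goList objects below are small invented stand-ins, not real goseq output:

```r
# Toy stand-ins for GO.wall and goList; all values are invented.
GO.wall <- data.frame(category   = c("GO:0000001", "GO:0000002"),
                      numDEInCat = c(1, 2),
                      stringsAsFactors = FALSE)
goList <- list("GO:0000001" = "geneA",
               "GO:0000002" = c("geneA", "geneB"))

# Collapse each term's genes into one string; vapply() guarantees
# exactly one character value per row.
GO.wall$EnsemblID <- vapply(goList[GO.wall$category],
                            paste, character(1), collapse = ", ",
                            USE.NAMES = FALSE)

GO.wall$EnsemblID
# [1] "geneA"        "geneA, geneB"
```

A plain character column like this can be written out with write.csv(), which a list column cannot.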

modified 9 months ago • written 9 months ago by magicpants60

Exactly what I needed! Thank you!

written 9 months ago by unawaz50