Question: Deseq2 gene name annotation
0
gravatar for BM
3.5 years ago by
BM40
United Kingdom
BM40 wrote:

I am trying to annotate the Ensemble ID in the Deseq2 results file and add a column of Gene symbols and gene names. I have tried to use Biomart and also AnnotationDb/org.Mm.eg.db.

This is the ouput

library(DESeq2)

# Loading required package: Rcpp
# Loading required package: RcppArmadillo

counts = read.delim("3mTA2.txt", header=T, row.names=1)
sample <- read.delim("~/sample.txt")
count.data.set <- DESeqDataSetFromMatrix(countData=counts, colData=sample,design= ~ genotype)
dds<-DESeq(count.data.set)
res <- results(dds)

library("AnnotationDbi")
library("org.Mm.eg.db")

columns (org.Mm.eg.db)
# [1] "ENTREZID"     "PFAM"         "IPI"          "PROSITE"      "ACCNUM"   
# [6] "ALIAS"        "CHR"          "CHRLOC"       "CHRLOCEND"    "ENZYME"   
# [11] "PATH"         "PMID"         "REFSEQ"       "SYMBOL"       "UNIGENE"  
# [16] "ENSEMBL"      "ENSEMBLPROT"  "ENSEMBLTRANS" "GENENAME"     "UNIPROT"  
# [21] "GO"           "EVIDENCE"     "ONTOLOGY"     "GOALL"        "EVIDENCEALL"
# [26] "ONTOLOGYALL"  "MGI"      

res$hgnc_symbol <- convertIDs(row.names(res), "ENSEMBL", "SYMBOL", org.Mm.eg.db)

# Error: could not find function "convertIDs"

convertIDs <- function( ids, from, to, db, ifMultiple=c("putNA", "useFirst")) {
  stopifnot( inherits( db, "AnnotationDb" ) )
  ifMultiple <- match.arg( ifMultiple )
  suppressWarnings( res <- AnnotationDbi::select(
    db, keys=ids, keytype=from, columns=c(from,to) ) )
  if ( ifMultiple == "putNA" ) {
    duplicatedIds <- res[ duplicated( selRes[,1] ), 1 ]
    res <- res[ ! res[,1] %in% duplicatedIds, ]

  }

  return(res[ match( ids, selRes[,1] ), 2 ] )}

res$hgnc_symbol <- convertIDs(row.names(res), "ENSEMBL", "SYMBOL", org.Mm.eg.db) Error in .testForValidKeys(x, keys, keytype) : None of the keys entered are valid keys for 'ENSEMBL'. Please use the keys method to see a listing of valid arguments. Called from: .testForValidKeys(x, keys, keytype)
#Browse[1]

library( "biomaRt" )

ensembl = useMart( "ensembl", dataset = "mmusculus_gene_ensembl" )
res$ensembl <- sapply( strsplit( rownames(res), split="nn+" ), "[", 1 )
genemap <- getBM( attributes = c("ensembl_gene_id", "entrezgene", "hgnc_symbol"), 
                  filters = "ensembl_gene_id",
                  values = res$ensembl
                  genemap <- getBM( attributes = c("ensembl_gene_id", "entrezgene", "hgnc_symbol"), 
                                    filters = "ensembl_gene_id",
                                    values = res$ensembl, 
                                    mart = ensembl )
ADD COMMENTlink modified 3 months ago by zx87546.2k • written 3.5 years ago by BM40
0
gravatar for Michael Love
3.5 years ago by
Michael Love1.7k
United States
Michael Love1.7k wrote:

In the latest release, the GenomicFeatures package authors added mapIds() which is straightforward to use.

See ?mapIds after loading the GenomicFeatures package.

ADD COMMENTlink modified 3.5 years ago • written 3.5 years ago by Michael Love1.7k
1

I updated the workflow to use mapIds()

http://bioconductor.org/help/workflows/rnaseqGene/#annotate

ADD REPLYlink written 3.5 years ago by Michael Love1.7k
0
gravatar for jimmy_zeng
2.7 years ago by
jimmy_zeng90
University of Macau
jimmy_zeng90 wrote:

From Ensemble ID to gene symbol and gene associated name , I don't think you need a function .

In fact,there's enough information in the R package "org.Mm.eg.db" , you will find many pre-defined dataset in this package by using ls('package:org.Mm.eg.db')

you can just use ToTable to get two tables toTable(org.Mm.egGENENAME) and toTable(org.Mm.egSYMBOL) , and then you can use merge function to connect this information as you need .

Hope this will help you .

ADD COMMENTlink written 2.7 years ago by jimmy_zeng90
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1305 users visited in the last hour