Error in getBM: accessing Ensembl annotation with biomaRt
3.2 years ago
celia.escher ▴ 20

Hello!

I am trying to get the gene names and additional features after running DESeq2 on human RNA-seq data, where I contrast two diseases with healthy controls. However, I am stuck at getBM (I am following a tutorial from some years ago and do not know whether it is still up to date). This is my code:

dds <- DESeq(dds)
res <- results(dds)
res <- results(dds, contrast = c("disease", "LC", "Hc"))

res$ensembl <- sapply(strsplit(rownames(res), split="\\+" ), "[", 1 )
ensembl <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
genemap <- getBM( attributes = c("ensembl_gene_id", "entrezgene_id", "hgnc_symbol", "chromosome_name"),
                  #filters = "ensembl_gene_id",   # with filters it does not work
                  values = res$ensembl,
                  mart = ensembl)
idx <- match(res$ensembl, genemap$ensembl_gene_id)
res$entrez <- genemap$entrezgene_id[idx]
res$gene_name <- genemap$hgnc_symbol[idx]
res$chr <- genemap$chromosome_name[idx]

write.csv( as.data.frame(res), file="results.csv" )

The tutorial recommends this part:

First, we split up the rownames of the results object, which contain ENSEMBL gene ids, separated by the plus sign, +. The following code then takes the first id for each gene by invoking the open square bracket function "[" and the argument, 1.

res$ensembl <- sapply( strsplit( rownames(res), split="\\+" ), "[", 1 )
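
To make the tutorial step concrete, here is a minimal sketch of what the split does. The two row names are illustrative inputs chosen for this example (one merged "+"-separated name, one versioned ID of the kind mentioned in this post), and fixed = TRUE is used instead of the escaped regex, which is an alternative spelling rather than what the tutorial wrote.

```r
# Illustrative row names: a merged "+"-separated name and a versioned ID
ids <- c("ENSG00000000003+ENSG00000000005", "ENSG00000281764.1")

# fixed = TRUE treats "+" as a literal character, so no regex escaping is needed
pieces <- strsplit(ids, split = "+", fixed = TRUE)

# "[" with argument 1 picks the first element of each split result
first_id <- sapply(pieces, "[", 1)
print(first_id)
```

The second element comes back unchanged, because it contains no "+"; the split alone does nothing about a trailing ".1".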

But my Ensembl IDs look like ENSG00000281764.1, ENSG00000281299.1, and so on. I have also tried replacing that line with res$ensembl <- rownames(res), but it made no difference.

Thank you so much for your comments!!!

RNA-Seq genome R gene Assembly • 712 views
3.2 years ago
loughrae ▴ 90

The .1 etc. are version numbers for the Ensembl IDs. Assuming your problem is that the versioned IDs in res can't be matched against the plain Ensembl IDs that biomaRt returns in genemap$ensembl_gene_id, you can remove the versions from res$ensembl using gsub() and then they should match. You can find a regex to do that here: https://stackoverflow.com/questions/10617702/remove-part-of-string-after
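
As a small illustration of the version stripping, the sketch below uses the two IDs quoted in the question. The pattern is one possible choice, not necessarily the one in the linked answer: it is anchored at the end of the string so that only a trailing dot followed by digits is removed.

```r
# The two versioned IDs quoted in the question
versioned <- c("ENSG00000281764.1", "ENSG00000281299.1")

# Remove a trailing ".<digits>" version suffix; "$" anchors the match at the end
plain <- sub("\\.[0-9]+$", "", versioned)
print(plain)

# IDs that carry no version suffix pass through untouched
unversioned <- sub("\\.[0-9]+$", "", "ENSG00000281764")
print(unversioned)
```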

3.2 years ago
celia.escher ▴ 20
res$ensembl <- gsub("\\..*","", res$ensembl)

That worked perfectly, many thanks! I used this line instead of the sapply one.
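
With the versions stripped, the filters argument that was commented out in the question can presumably be switched back on, so that only the genes of interest are requested. The sketch below wraps this in annotate_ids(), a hypothetical helper name introduced here and not part of biomaRt or of this thread. It assumes the same attributes as in the question and has not been run against the live Ensembl server.

```r
# Hypothetical helper (not from the thread): look up annotation for
# versioned or plain Ensembl gene IDs, returning one row per input ID.
annotate_ids <- function(ids, mart) {
  # Strip a trailing ".<digits>" version so the IDs match the filter
  plain <- sub("\\.[0-9]+$", "", ids)

  genemap <- biomaRt::getBM(
    attributes = c("ensembl_gene_id", "entrezgene_id",
                   "hgnc_symbol", "chromosome_name"),
    filters    = "ensembl_gene_id",
    values     = unique(plain),
    mart       = mart
  )

  # match() keeps the first hit when one gene maps to several rows
  genemap[match(plain, genemap$ensembl_gene_id), , drop = FALSE]
}

# Usage (needs the biomaRt package and network access):
# ensembl <- biomaRt::useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
# annot <- annotate_ids(rownames(res), ensembl)
# res$gene_name <- annot$hgnc_symbol
```

Because match() returns only the first matching row, a gene with several Entrez IDs keeps just one of them, which is the same behaviour as the idx <- match(...) step in the question.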
