I am trying to get gene names and additional annotation after running DESeq2 on human RNA-seq data, where I contrast two diseases with healthy controls. However, I am stuck at the getBM step (I am following a tutorial from some years ago and do not know whether it is out of date). This is my code:
dds <- DESeq(dds)
res <- results(dds)
res <- results(dds, contrast = c("disease", "LC", "Hc"))
res$ensembl <- sapply(strsplit(rownames(res), split="\\+" ), "[", 1 )
ensembl <- useMart(biomart = "ensembl", dataset = "hsapiens_gene_ensembl")
genemap <- getBM(
  attributes = c("ensembl_gene_id", "entrezgene_id", "hgnc_symbol", "chromosome_name"),
  #filters = "ensembl_gene_id", # with filters it does not work
  values = res$ensembl,
  mart = ensembl)
idx <- match(res$ensembl, genemap$ensembl_gene_id)
res$entrez <- genemap$entrezgene_id[idx]
res$gene_name <- genemap$hgnc_symbol[idx]
res$chr <- genemap$chromosome_name[idx]
write.csv( as.data.frame(res), file="results.csv" )
The tutorial recommends this part:
First, we split up the rownames of the results object, which contain ENSEMBL gene ids, separated by the plus sign, +. The following code then takes the first id for each gene by invoking the open square bracket function "[" and the argument, 1.
res$ensembl <- sapply( strsplit( rownames(res), split="\\+" ), "[", 1 )
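To check my understanding of that line, here is a minimal sketch on a toy vector. The name `rn` and the first two IDs are made up for illustration; the third is one of the real row names from my data. As far as I can tell, the split only affects names that contain a plus sign, so the `.1` ending is left untouched.

```r
# Toy row names: the first two are made-up placeholders in the "id+id" style
# the tutorial describes; the third is a real row name from my results.
rn <- c("ENSG00000000001+ENSG00000000002",
        "ENSG00000000003",
        "ENSG00000281764.1")

# Split each name on a literal "+" and keep the first piece of each.
first_id <- sapply(strsplit(rn, split = "\\+"), "[", 1)

print(first_id)
```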
But the ENSEMBL IDs in my data look like ENSG00000281764.1, ENSG00000281299.1, and so on, so there is no plus sign to split on.
I have also tried replacing that line with
res$ensembl <- rownames(res)
but it made no difference.
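One thing that may be relevant: the `.1` at the end of my IDs looks like an Ensembl version suffix, whereas the `ensembl_gene_id` attribute in biomaRt holds unversioned IDs, so `match()` would find nothing. A minimal base R sketch of stripping the suffix, assuming everything after the first dot can be dropped (the names `ids` and `ids_noversion` are only for illustration):

```r
# Two row names copied from my results; `ids` is a hypothetical stand-in
# for rownames(res).
ids <- c("ENSG00000281764.1", "ENSG00000281299.1")

# Remove the first dot and everything after it (assumed to be the version suffix).
ids_noversion <- sub("\\..*$", "", ids)

print(ids_noversion)
```

The stripped IDs could presumably then be passed as `values` together with `filters = "ensembl_gene_id"` in `getBM`, although I have not confirmed that this resolves the problem.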
Thank you so much for your comments!!!