Converting lincRNA Ensembl id to gene symbol
4.0 years ago
ek699 ▴ 10

Hi, I am trying to convert the Ensembl ID of a lincRNA to its symbol (gene name) using biomaRt in R, but the dataset doesn't return a proper annotation. I know that Ensembl has an annotation for this lincRNA, since I can see it in IGV. For your information, I used the GRCh38.p13 annotation from GENCODE when I mapped the reads.

I simplified my code as much as possible just to see whether biomaRt returns a proper annotation for the lincRNA gene, but it gives me NA.

library(biomaRt)
fix_name <- as.character("lincRNA_id")
mart <- useDataset("hsapiens_gene_ensembl", useMart("ensembl"))
symbol <- getBM(filters = "ensembl_gene_id",
                attributes = c("ensembl_gene_id","hgnc_symbol"),
                values = fix_name, 
                mart = mart)

 >symbol
  ensembl_gene_id hgnc_symbol
1 lincRNA_id          NA

Could you please let me know how I can get the gene name from the id? I am new to bioinformatics, so any help is much appreciated. Thank you

RNA-Seq lincRNA Ensembl Gencode GRCh38.p13 • 1.1k views
4.0 years ago
ATpoint 82k

It is not uncommon for genes, especially non-coding ones, to lack an HGNC symbol.

symbol <- getBM(attributes = c("ensembl_gene_id","hgnc_symbol"),
                mart = useMart("ensembl", dataset="hsapiens_gene_ensembl"))

## Almost 25000 Ensembl genes have no HGNC symbol
sum(symbol$ensembl_gene_id != "") ## 67159
sum(symbol$hgnc_symbol != "") ## 42848

[Figure: bar chart of the number of genes without an HGNC symbol, per gene biotype]

Code for that plot:

library(biomaRt)
library(dplyr)    # provides %>% and arrange(), used below
library(ggplot2)

symbol <- getBM(attributes = c("ensembl_gene_id","hgnc_symbol", "gene_biotype"),
                mart = useMart("ensembl", dataset="hsapiens_gene_ensembl"))

## Almost 25000 Ensembl genes have no HGNC symbol
sum(symbol$ensembl_gene_id != "") ## 67159
sum(symbol$hgnc_symbol != "") ## 42848

df <- data.frame(table(symbol[symbol$ensembl_gene_id != "" & 
                              symbol$hgnc_symbol == "",]$gene_biotype)) %>% arrange(Freq)
colnames(df) <- c("Gene_Biotype", "Number")

df$Gene_Biotype <- factor(df$Gene_Biotype, levels = df$Gene_Biotype)

ggplot(data=df, aes(x=Gene_Biotype, y=Number)) +
  geom_bar(stat="identity") + 
  coord_flip() + 
  ggtitle("Genes without HGNC symbol")
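
Since a missing HGNC symbol cannot be looked up, a common workaround is to label such genes with their Ensembl ID instead. The following is only a minimal sketch: `label_genes` is a hypothetical helper name, the input is assumed to have the two columns requested from getBM() above, and the IDs in the toy table are made-up placeholders rather than real query results. It treats both an empty string and NA as "no symbol".

```r
# Hypothetical helper: choose a display label for each gene.
# Assumes `annot` has the columns ensembl_gene_id and hgnc_symbol,
# i.e. the shape of the getBM() result used in this thread.
label_genes <- function(annot) {
  sym <- as.character(annot$hgnc_symbol)
  # A symbol counts as missing if it is NA or an empty string
  missing <- is.na(sym) | sym == ""
  # Fall back to the Ensembl ID where no symbol is available
  ifelse(missing, as.character(annot$ensembl_gene_id), sym)
}

# Toy input with placeholder IDs (not real annotation data)
toy <- data.frame(
  ensembl_gene_id = c("ENSG_PLACEHOLDER_1", "ENSG_PLACEHOLDER_2", "ENSG_PLACEHOLDER_3"),
  hgnc_symbol     = c("GENE_A", "", NA),
  stringsAsFactors = FALSE
)

labels <- label_genes(toy)
print(labels)
```

Applied to the real table, this would be `symbol$label <- label_genes(symbol)`, so every gene keeps a usable identifier in downstream plots and tables.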

Thanks for your comment. Is there any way I can get a gene name for this lincRNA gene ID?


Did you read my answer? If your gene has no HGNC symbol, then no, you can't get an HGNC name for it.
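
One general precaution when the IDs come from a GENCODE annotation: GENCODE gene IDs carry a version suffix (for example `.12`), while the `ensembl_gene_id` filter expects the unversioned ID. This may not be the issue in this thread, since the query did return a row, but it is worth ruling out before concluding that a gene has no symbol. A minimal sketch, where `strip_version` is a hypothetical helper and the IDs are made up for illustration:

```r
# Hypothetical helper: drop a trailing ".<version>" from Ensembl-style IDs.
# IDs without a version suffix are returned unchanged.
strip_version <- function(ids) {
  sub("\\.[0-9]+$", "", ids)
}

# Made-up example IDs, one versioned and one unversioned
ids <- c("ENSG00000000001.12", "ENSG00000000002")
clean <- strip_version(ids)
print(clean)
```

The cleaned vector can then be passed as `values = clean` to the getBM() call from the question.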
