Question

Difficulty Converting lncRNA gene symbol to Ensembl

0

5.8 years ago

JourneyToAbyss ▴ 240

I have many lncRNA gene symbols that I am trying to convert to Ensembl IDs. Much of the difficulty stems from not knowing where these gene symbols come from, so my attempts with bioDBnet and biomaRt in R have been unfruitful. Yet if I search on the Ensembl website, it is able to identify these gene symbols.

Example:

RP11-23J18.1
RP4-616B8.4
CTD-3184A7.4
ILF3-AS1
RP11-410L14.2

Thus, I was hoping someone could point me in the right direction to understand where/how these gene symbols originated, and what the best practice is for converting them to Ensembl IDs.

Thank you!

gene symbol ensembl agilent microarray • 2.2k views

ADD COMMENT • link updated 5.8 years ago by GouthamAtla 12k • written 5.8 years ago by JourneyToAbyss ▴ 240

2

What code have you tried so far? I have also just tried to map these using biomaRt but could not do it. The genes that you list are predicted ncRNAs, and some are pseudogenes. They may not even be included in the databases accessed by biomaRt. They do have assigned Ensembl IDs, though. If you simply search for them in UCSC, for example, you can access information on them.

Their IDs are also accessible from GENCODE's 'comprehensive' annotation:

grep -e "RP11-23J18.1" gencode.v28.annotation.gff3

chr12   HAVANA  gene    47135498    47138370    .   -   .   ID=ENSG00000258352.1;gene_id=ENSG00000258352.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;level=1;tag=pseudo_consens;havana_gene=OTTHUMG00000169618.2
chr12   HAVANA  transcript  47135498    47135901    .   -   .   ID=ENST00000553227.1;Parent=ENSG00000258352.1;gene_id=ENSG00000258352.1;transcript_id=ENST00000553227.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;transcript_type=transcribed_processed_pseudogene;transcript_name=RP11-23J18.1-001;level=1;transcript_support_level=NA;ont=PGO:0000004,PGO:0000019;tag=pseudo_consens,basic;havana_gene=OTTHUMG00000169618.2;havana_transcript=OTTHUMT00000405085.1
chr12   HAVANA  exon    47135498    47135901    .   -   .   ID=exon:ENST00000553227.1:1;Parent=ENST00000553227.1;gene_id=ENSG00000258352.1;transcript_id=ENST00000553227.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;transcript_type=transcribed_processed_pseudogene;transcript_name=RP11-23J18.1-001;exon_number=1;exon_id=ENSE00002393310.1;level=1;transcript_support_level=NA;ont=PGO:0000004,PGO:0000019;tag=pseudo_consens,basic;havana_gene=OTTHUMG00000169618.2;havana_transcript=OTTHUMT00000405085.1
chr12   HAVANA  transcript  47135762    47138370    .   -   .   ID=ENST00000548463.1;Parent=ENSG00000258352.1;gene_id=ENSG00000258352.1;transcript_id=ENST00000548463.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;transcript_type=processed_transcript;transcript_name=RP11-23J18.1-002;level=2;transcript_support_level=2;tag=basic;havana_gene=OTTHUMG00000169618.2;havana_transcript=OTTHUMT00000405330.1
chr12   HAVANA  exon    47138216    47138370    .   -   .   ID=exon:ENST00000548463.1:1;Parent=ENST00000548463.1;gene_id=ENSG00000258352.1;transcript_id=ENST00000548463.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;transcript_type=processed_transcript;transcript_name=RP11-23J18.1-002;exon_number=1;exon_id=ENSE00002328612.1;level=2;transcript_support_level=2;tag=basic;havana_gene=OTTHUMG00000169618.2;havana_transcript=OTTHUMT00000405330.1
chr12   HAVANA  exon    47135762    47135945    .   -   .   ID=exon:ENST00000548463.1:2;Parent=ENST00000548463.1;gene_id=ENSG00000258352.1;transcript_id=ENST00000548463.1;gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;transcript_type=processed_transcript;transcript_name=RP11-23J18.1-002;exon_number=2;exon_id=ENSE00002410022.1;level=2;transcript_support_level=2;tag=basic;havana_gene=OTTHUMG00000169618.2;havana_transcript=OTTHUMT00000405330.1
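If you have more than a handful of symbols, the same lookup can be scripted rather than grepped one name at a time. Below is a minimal Python sketch, not a definitive implementation: it assumes the GENCODE GFF3 is tab-delimited with the attributes in column 9 (as in the output above), and the function names `parse_attributes` and `symbol_to_ensembl` are just illustrative. It matches `gene_name` exactly, whereas a plain grep also matches substrings and treats `.` as a wildcard. Stripping the version suffix from the ID is my own choice; drop that step if you want versioned IDs.

```python
def parse_attributes(field):
    """Turn a GFF3 attribute column ('key=value;key=value') into a dict."""
    return dict(item.split("=", 1) for item in field.strip().split(";") if "=" in item)


def symbol_to_ensembl(gff3_lines, symbols):
    """Map each requested gene symbol to the Ensembl gene ID(s) found in 'gene' rows."""
    wanted = set(symbols)
    mapping = {}
    for line in gff3_lines:
        if line.startswith("#"):
            continue  # skip header and comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only gene-level rows; transcripts and exons repeat the same IDs
        attrs = parse_attributes(cols[8])
        name = attrs.get("gene_name")
        if name in wanted:
            # Assumption: unversioned IDs are wanted (ENSG00000258352.1 -> ENSG00000258352)
            mapping.setdefault(name, []).append(attrs["gene_id"].split(".")[0])
    return mapping


# Two rows taken from the grep output above, rebuilt with tab separators
sample = [
    "\t".join(["chr12", "HAVANA", "gene", "47135498", "47138370", ".", "-", ".",
               "ID=ENSG00000258352.1;gene_id=ENSG00000258352.1;"
               "gene_type=transcribed_processed_pseudogene;gene_name=RP11-23J18.1;"
               "level=1;tag=pseudo_consens;havana_gene=OTTHUMG00000169618.2"]),
    "\t".join(["chr12", "HAVANA", "transcript", "47135498", "47135901", ".", "-", ".",
               "ID=ENST00000553227.1;Parent=ENSG00000258352.1;"
               "gene_id=ENSG00000258352.1;transcript_id=ENST00000553227.1;"
               "gene_name=RP11-23J18.1"]),
]

symbols = ["RP11-23J18.1", "RP4-616B8.4", "CTD-3184A7.4", "ILF3-AS1", "RP11-410L14.2"]
mapping = symbol_to_ensembl(sample, symbols)
unmatched = [s for s in symbols if s not in mapping]

print(mapping)    # {'RP11-23J18.1': ['ENSG00000258352']}
print(unmatched)  # symbols that are not in the two sample rows
```

To run it on the real file, replace `sample` with an open file handle, e.g. `with open("gencode.v28.annotation.gff3") as fh: mapping = symbol_to_ensembl(fh, symbols)`. Any symbols left in `unmatched` may simply be named differently in that GENCODE release.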

A logical question: do you even need these genes for your downstream work? Information on these types of genes is scant / non-existent; so, even if you convert them to Ensembl IDs, you'll likely have to later exclude them at your next step. Depends on what you want to do with them.

ADD REPLY • link 5.8 years ago by Kevin Blighe 88k

1

The following website might help: https://genealacart.genecards.org/Query (you need to create an account). In addition, going directly to the GeneCards website and typing the RNA name into the search box is also a workaround if you have a very limited set of genes.

ADD REPLY • link 5.8 years ago by noorpratap.singh ▴ 330