Question: Transcript biotypes for ncRNA in GRCh37 using biomaRt?
Asked 12 months ago by Sergio Martínez Cuesta (Cambridge, UK)

Dear all,

I am attempting to retrieve transcript biotypes for ncRNAs using Bioconductor's biomaRt in GRCh37 as follows:

library(biomaRt)
ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org", dataset="hsapiens_gene_ensembl") 

# biotypes for mRNAs are obtained fine
refseqids_nm = c("NM_152486","NM_080605", "NM_031921")
getBM(attributes=c("refseq_mrna", "transcript_biotype"), filters="refseq_mrna", values=refseqids_nm, mart=ensembl)
#  refseq_mrna transcript_biotype
#1   NM_031921     protein_coding
#2   NM_080605     protein_coding
#3   NM_152486     protein_coding

# However not for ncRNAs
refseqids_nr = c("NR_015434", "NR_036637")
getBM(attributes=c("refseq_ncrna", "transcript_biotype"), filters="refseq_ncrna", values=refseqids_nr, mart=ensembl)
#[1] refseq_ncrna       transcript_biotype
#<0 rows> (or 0-length row.names)

When I try the same as above but with the current release of Ensembl:

ensembl <- useMart(biomart="ENSEMBL_MART_ENSEMBL", dataset="hsapiens_gene_ensembl")
getBM(attributes=c("refseq_ncrna", "transcript_biotype"), filters="refseq_ncrna", values=refseqids_nr, mart=ensembl)
#  refseq_ncrna   transcript_biotype
#1    NR_015434            antisense
#2    NR_036637 processed_transcript

Then I get biotypes for ncRNAs just fine.

Perhaps there is something I am missing here. Does GRCh37 have annotations for ncRNAs? If so, any input on how I can obtain transcript biotypes using biomaRt as above?

Thanks, Sergio

Answer by Emily_Ensembl (EMBL-EBI), 12 months ago:

The gene annotation on GRCh37 is older, so some recent data that is in the main GRCh38 database may be missing. A quick search for those identifiers on the GRCh37 website shows that we do not have them mapped to any Ensembl transcripts on GRCh37. BioMart retrieves information that is linked to Ensembl transcripts, so it only returns data for RefSeq transcripts that are mapped to Ensembl transcripts.
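
Because getBM() simply omits identifiers that have no mapped Ensembl transcript, it can help to check explicitly which queried IDs are absent from the result. The following is a minimal sketch in base R, not part of biomaRt: the helper name find_unmapped is hypothetical, and mock_result is a stand-in for the empty data frame returned by the GRCh37 query in the question.

```r
# Hypothetical helper (not a biomaRt function): return the queried IDs
# that do not appear in the ID column of a getBM() result.
find_unmapped <- function(queried, result, id_col) {
  setdiff(queried, unique(result[[id_col]]))
}

# Stand-in for the empty GRCh37 result shown in the question;
# in practice, pass the data frame returned by getBM().
mock_result <- data.frame(
  refseq_ncrna = character(0),
  transcript_biotype = character(0),
  stringsAsFactors = FALSE
)

refseqids_nr <- c("NR_015434", "NR_036637")
unmapped <- find_unmapped(refseqids_nr, mock_result, "refseq_ncrna")
print(unmapped)
```

With a real query, the second argument would be the getBM() output and the third the name of the RefSeq attribute column, for example "refseq_ncrna". Any IDs reported here would need to be looked up another way on GRCh37, since BioMart has no Ensembl transcript to attach them to.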
