Question: miRNA corresponds to multiple ensembl gene IDs
alcs417 wrote (24 months ago):

Hi there,

I'd like to know why one microRNA can correspond to multiple Ensembl gene IDs. For example, "hsa-mir-1302-2" corresponds to 4 Ensembl gene IDs:

ensembl_gene_id  gene_biotype  ensembl_transcript_id  mirbase_id
ENSG00000284332  miRNA         ENST00000607096        hsa-mir-1302-2
ENSG00000284557  miRNA         ENST00000408734        hsa-mir-1302-2
ENSG00000283921  miRNA         ENST00000408365        hsa-mir-1302-2
ENSG00000283801  miRNA         ENST00000408051        hsa-mir-1302-2

The GTF file is downloaded from: ftp://ftp.ensembl.org/pub/release-90/gtf/homo_sapiens/

Could anyone please tell me why? I want to get the sequence length of each miRNA, and it seems that I should not simply add up the lengths of all these transcripts.

mirna ensembl mirbase • 908 views
VHahaut (Belgium) wrote (24 months ago):

These are all paralogues:

http://www.ensembl.org/Homo_sapiens/Gene/Compara_Paralog?db=core;g=ENSG00000284332;r=1:30366-30503;t=ENST00000607096


Thanks for your help! Really appreciated!

(written 24 months ago by alcs417)
Emily_Ensembl (EMBL-EBI) wrote (24 months ago):

miRNAs tend to be replicated across the genome. miRBase works per sequence, so each sequence has one miRNA ID, whereas in Ensembl we work in terms of the genome: if one sequence occurs multiple times in the genome, each occurrence is assigned its own Ensembl ID.
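To make the one-to-many relationship concrete, here is a minimal sketch in base R that counts how many distinct Ensembl gene IDs share one miRBase ID. It uses only the four rows quoted in the question; the object names `mirna` and `loci_per_id` are made up for illustration.

```r
# The four rows quoted in the question (Ensembl release 90 GTF).
mirna <- data.frame(
  ensembl_gene_id = c("ENSG00000284332", "ENSG00000284557",
                      "ENSG00000283921", "ENSG00000283801"),
  ensembl_transcript_id = c("ENST00000607096", "ENST00000408734",
                            "ENST00000408365", "ENST00000408051"),
  mirbase_id = rep("hsa-mir-1302-2", 4),
  stringsAsFactors = FALSE
)

# Count the distinct Ensembl gene IDs (genomic copies) behind each miRBase ID.
loci_per_id <- tapply(mirna$ensembl_gene_id, mirna$mirbase_id,
                      function(ids) length(unique(ids)))

print(loci_per_id)
```

A count above 1 means the miRBase ID is annotated at several places in the genome, so the lengths of those transcripts should be looked at per Ensembl ID rather than summed.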


Thanks very much for your help. Is there an attribute that can tell the miRNA I want apart from its paralogues when using the getBM() function from the R package biomaRt? More precisely, for each of the four miRNAs hsa-mir-1302-2, hsa-mir-1302-9, hsa-mir-1302-10 and hsa-mir-1302-11, getBM() returns the same set of rows, covering all four miRNAs, which is quite tedious to untangle. The R code is as follows:

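library(biomaRt)

mart <- useMart(biomart="ensembl", dataset="hsapiens_gene_ensembl")

# No filter is given, so this returns these attributes for every gene in the dataset
miRNA_res <- getBM(attributes=c("ensembl_gene_id", "gene_biotype", "ensembl_transcript_id", "mirbase_id"), mart=mart)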

The following four commands all return the same results:

miRNA_res[miRNA_res$mirbase_id=="hsa-mir-1302-2",]
miRNA_res[miRNA_res$mirbase_id=="hsa-mir-1302-9",]
miRNA_res[miRNA_res$mirbase_id=="hsa-mir-1302-10",]
miRNA_res[miRNA_res$mirbase_id=="hsa-mir-1302-11",]

Any hint would be really appreciated.

(written 24 months ago by alcs417, modified 24 months ago)

The gene names for miRNAs are taken from the miRBase IDs, so if you filter by gene name with, say, MIR1302-2, you'll get just that one.
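One way this could look with biomaRt is sketched below. It is only a sketch: it needs the Bioconductor package biomaRt and network access to Ensembl, `mirna_by_gene_name` is a hypothetical helper name, and `external_gene_name` is assumed to be the attribute and filter that hold the gene name in this dataset (listFilters(mart) and listAttributes(mart) show what is actually available).

```r
# Hypothetical helper, not part of biomaRt: query Ensembl by gene name
# instead of subsetting a full table by miRBase ID.
mirna_by_gene_name <- function(gene_names) {
  mart <- biomaRt::useMart(biomart = "ensembl",
                           dataset = "hsapiens_gene_ensembl")
  biomaRt::getBM(
    attributes = c("ensembl_gene_id", "external_gene_name", "gene_biotype",
                   "ensembl_transcript_id", "mirbase_id"),
    filters = "external_gene_name",  # assumption: gene-name filter for this dataset
    values = gene_names,
    mart = mart
  )
}

# Example call (contacts the Ensembl BioMart server, so it is not run here):
# mirna_by_gene_name("MIR1302-2")
```

Passing a vector such as c("MIR1302-2", "MIR1302-9") should return the rows for each requested gene name in a single query.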

(written 24 months ago by Emily_Ensembl)

Thanks for the hint! It perfectly solved my problem!

(written 24 months ago by alcs417)