Question: How to find all miRNA locations on GRCh37?
Aurelie MLB (United Kingdom) wrote, 5.7 years ago:

Hello,

I am trying to get the locations of miRNAs on the human genome assembly GRCh37 (UCSC hg19).

I tried to do this with Bioconductor. If I understand the documentation correctly, this should work:

library("TxDb.Hsapiens.UCSC.hg19.knownGene")
library(mirbase.db)
microRNAs(TxDb.Hsapiens.UCSC.hg19.knownGene)

But I get an empty result:

 

GRanges with 0 ranges and 1 metadata column:
   seqnames    ranges strand |    mirna_id
      <Rle> <IRanges>  <Rle> | <character>
  ---
  seqlengths:
        chr1      chr2 ... chrUn_gl000249
   249250621 243199373 ...          38502

 

I investigated and found that:

- For TxDb.Hsapiens.UCSC.hg19.knownGene, the metadata shows "miRBase build ID: GRCh37".
- For mirbase.db, supportedMiRBaseBuildValues() returns "Homo sapiens GRCh37.p5".

I suspect these two build identifiers are incompatible, and I do not know how to make this work. Would you have any idea?

If not, would you know where I could get the locations of the microRNAs on GRCh37?

Many thanks!

 

 

 

Pablo Marin-Garcia (Spain) wrote, 5.6 years ago:

You can get them from miRBase. The current release is on GRCh38, but the previous releases on GRCh37 are still available, in both GFF2 and GFF3 format (the GFF3 file also contains rows for the mature regions).

ftp://mirbase.org/pub/mirbase/20/genomes/hsa.gff2

##gff-version 2
##date 2013-05-24
#
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs                 miRBase v20
# genome-build-id           GRCh37.p5
# genome-build-accession    NCBI_Assembly:GCA_000001405.6
#
chr1 . miRNA 17369 17436 . - . ACC="MI0022705"; ID="hsa-mir-6859-1";
chr1 . miRNA 30366 30503 . + . ACC="MI0006363"; ID="hsa-mir-1302-2";
chr1 . miRNA 567705 567793 . - . ACC="MI0022558"; ID="hsa-mir-6723";
chr1 . miRNA 1102484 1102578 . + . ACC="MI0000342"; ID="hsa-mir-200b";

ftp://mirbase.org/pub/mirbase/20/genomes/hsa.gff3  # with both the gene (hairpin) and the mature coordinates (MIMAT IDs)

##gff-version 3
##date 2013-10-1
#
# Chromosomal coordinates of Homo sapiens microRNAs
# microRNAs:               miRBase v20
# genome-build-id:         GRCh37.p5
# genome-build-accession:  NCBI_Assembly:GCA_000001405.6
#
# Hairpin precursor sequences have type "miRNA_primary_transcript". 
# Note, these sequences do not represent the full primary transcript, 
# rather a predicted stem-loop portion that includes the precursor 
# miRNA. Mature sequences have type "miRNA".
#
chr1	.	miRNA_primary_transcript	17369	17436	.	-	.	ID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1
chr1	.	miRNA	17409	17431	.	-	.	ID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705
chr1	.	miRNA	17369	17391	.	-	.	ID=MIMAT0027619;Alias=MIMAT0027619;Name=hsa-miR-6859-3p;Derives_from=MI0022705
chr1	.	miRNA_primary_transcript	30366	30503	.	+	.	ID=MI0006363;Alias=MI0006363;Name=hsa-mir-1302-2
chr1	.	miRNA	30438	30458	.	+	.	ID=MIMAT0005890;Alias=MIMAT0005890;Name=hsa-miR-1302;Derives_from=MI0006363
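To get just the hairpin locations out of the GFF3 file, something like the following awk filter may be enough. It is only a sketch: the function name hairpin_table is made up for illustration, and it assumes the file is tab-separated with a Name= attribute in column 9, as in the excerpt above. Coordinates are left 1-based, as GFF3 stores them.

```shell
#!/bin/sh
# Hypothetical helper (not part of miRBase or any package): reads a
# miRBase-style GFF3 on stdin and prints chrom, start, end, strand, name
# for each hairpin row. Coordinates stay 1-based inclusive, as in GFF3.
hairpin_table() {
  awk -F '\t' '
    /^#/ { next }                              # skip header/comment lines
    $3 == "miRNA_primary_transcript" {         # keep hairpin precursors only
      name = ""
      n = split($9, attrs, ";")                # attributes are ;-separated
      for (i = 1; i <= n; i++) {
        if (attrs[i] ~ /^Name=/) { name = substr(attrs[i], 6) }
      }
      printf "%s\t%s\t%s\t%s\t%s\n", $1, $4, $5, $7, name
    }'
}

# Two rows copied from the hsa.gff3 excerpt above (one hairpin, one mature).
sample=$(printf 'chr1\t.\tmiRNA_primary_transcript\t17369\t17436\t.\t-\t.\tID=MI0022705;Alias=MI0022705;Name=hsa-mir-6859-1\nchr1\t.\tmiRNA\t17409\t17431\t.\t-\t.\tID=MIMAT0027618;Alias=MIMAT0027618;Name=hsa-miR-6859-5p;Derives_from=MI0022705\n')
printf '%s\n' "$sample" | hairpin_table
```

On the downloaded file this would be run as: hairpin_table < hsa.gff3 > hsa_hairpins.tsv. To list the mature miRNAs instead, change the type test to "miRNA".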

 

You can also obtain them programmatically with BioMart from the stable GRCh37 Ensembl site.

XML query:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE Query>
<Query  virtualSchemaName = "default" formatter = "TSV" header = "0" uniqueRows = "0" count = "" datasetConfigVersion = "0.6" >
			
	<Dataset name = "hsapiens_gene_ensembl" interface = "default" >
		<Filter name = "biotype" value = "miRNA"/>
		<Attribute name = "ensembl_gene_id" />
		<Attribute name = "ensembl_transcript_id" />
		<Attribute name = "external_gene_id" />
		<Attribute name = "external_gene_db" />
		<Attribute name = "chromosome_name" />
		<Attribute name = "start_position" />
		<Attribute name = "end_position" />
		<Attribute name = "strand" />
		<Attribute name = "gene_biotype" />
		<Attribute name = "transcript_biotype" />
	</Dataset>
</Query>

Save it as biomart_mirna.xml and retrieve the results with:

curl --data-urlencode query@biomart_mirna.xml http://grch37.ensembl.org/biomart/martservice/results > mirna_genes.tsv

The results differ slightly from the miRBase files, though.
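To make the BioMart table easier to compare with the miRBase coordinates, it can be reshaped along these lines. This is a sketch under assumptions: the function name biomart_to_table is made up, the TSV columns are assumed to come back in the same order as the Attribute elements in the query, and the sample row is a made-up placeholder, not real Ensembl output. Ensembl writes chromosome names without the "chr" prefix and strand as 1 or -1, so both are converted.

```shell
#!/bin/sh
# Hypothetical helper: reshape the BioMart TSV to chrom, start, end, strand,
# gene name. Assumes columns follow the <Attribute> order of the query:
# gene id, transcript id, gene name, gene db, chromosome, start, end, strand, ...
biomart_to_table() {
  awk -F '\t' '
    NF >= 8 {
      strand = ($8 == "1") ? "+" : "-"   # Ensembl reports strand as 1 or -1
      # Prefix "chr" to mimic UCSC-style names; this does not fix special
      # cases such as the mitochondrial chromosome or patch scaffolds.
      printf "chr%s\t%s\t%s\t%s\t%s\n", $5, $6, $7, strand, $3
    }'
}

# Made-up placeholder row, NOT real Ensembl output; only the layout matters.
row=$(printf 'GENE_ID\tTRANSCRIPT_ID\tGENE_NAME\tGENE_DB\t1\t100\t200\t-1\tmiRNA\tmiRNA')
printf '%s\n' "$row" | biomart_to_table
```

On the real output this would be run as: biomart_to_table < mirna_genes.tsv > mirna_genes_table.tsv.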

 

 


Aurelie MLB commented, 5.6 years ago:

Hi Pablo. Thanks a lot for this!