Question: Which UTR sequence should be used to predict miRNA targets?
5.2 years ago, hua.peng1314 wrote:

Dear all, I am a newcomer to bioinformatics and am trying to work with some miRNA data. I have a couple of questions. I downloaded the human 3'UTR sequences from BioMart. Since a gene can have several transcripts, I am not sure whether to use all of them or only the shortest UTR. Also, some genes are on the reverse strand: should I reverse the UTR sequences for those? Thanks for your replies.

utr mirna target • 2.2k views
modified 5.2 years ago by IV • written 5.2 years ago by hua.peng1314

5.2 years ago, IV wrote:


This is an actual issue in miRNA research and is often overlooked.

The proper thing to do is to identify which isoforms (primary and/or secondary) are expressed in your cell line or tissue and use the exact 3'UTR and CDS of those isoforms for target identification.

In many cases this is very difficult.

If you have the supporting NGS data (some really good RNA-Seq or RNA-Seq + PolyA-Seq or 3P-Seq) then this type of investigation can be supported.

If you are working with a really small number of genes, this is also manageable.

If not, then people usually take the dominant transcript from the available annotation (e.g. ENSEMBL) or the transcript with the longest 3'UTR.

Most target prediction programs follow one of these two approaches.
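As a rough illustration of the "longest 3'UTR" fallback, here is a minimal Python sketch. It assumes a BioMart FASTA export whose headers begin with `gene_id|transcript_id|...`, and that transcripts without an annotated UTR are written as `Sequence unavailable`. The function names `read_fasta` and `longest_utr_per_gene` are made up for this example, and the IDs in the usage lines are placeholders.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, seq = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(seq)
            header, seq = line[1:], []
        else:
            seq.append(line)
    if header is not None:
        yield header, "".join(seq)


def longest_utr_per_gene(records):
    """Return {gene_id: (transcript_id, utr_sequence)} keeping the longest UTR."""
    best = {}
    for header, seq in records:
        # Assumption: records with no annotated UTR carry this placeholder text.
        if seq.lower().startswith("sequence unavailable"):
            continue
        # Assumption: the first two header fields are gene ID and transcript ID.
        gene_id, transcript_id = header.split("|")[:2]
        if gene_id not in best or len(seq) > len(best[gene_id][1]):
            best[gene_id] = (transcript_id, seq)
    return best


# Toy input with placeholder IDs; a real run would pass an open file handle.
example = [
    ">GENE1|TX1|A", "ACGT",
    ">GENE1|TX2|A", "ACGTAC", "GT",
    ">GENE2|TX3|B", "Sequence unavailable",
]
result = longest_utr_per_gene(read_fasta(example))
```

Ties are resolved in favour of the first transcript seen. If expression data are available, replacing the length comparison with an expression-based one would be closer to the "proper" approach described above.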



written 5.2 years ago by IV

Dear IV, I really appreciate your help. I don't know much about miRNA, so I just downloaded some miRNA data sequenced on an Illumina 2000 from the SRA database to practice with. Thanks for your patient reply. I have another question: some genes are on the reverse strand, and their UTR sequences are given 3'->5' too. Do I need to reverse the UTR sequence? Best regards, hua.peng

written 5.2 years ago by hua.peng1314

Can you tell me a bit more about what you are trying to accomplish?

We do a lot of miRNA-related analysis in the lab, and I might be able to help you more if I have more information on this.

I usually use local genome files, but from what I recall, ENSEMBL returns the correct UTR sequence regardless of strand: it automatically reverse-complements the UTR sequence if the gene is on the (-) strand. However, if you download the coordinates and use them to extract the sequence from the genome, then you have to reverse-complement it yourself.
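For the do-it-yourself case, a minimal Python sketch could look like the following. It assumes the chromosome is already loaded as a plain string and that coordinates are 1-based and inclusive, as in Ensembl. The function names and the toy sequence are invented for this example.

```python
# Translation table for complementing DNA; N stays N, case is preserved.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def utr_from_coordinates(chrom_seq, start, end, strand):
    """Extract a region given 1-based inclusive coordinates.

    For the (-) strand the forward-strand slice is reverse-complemented
    so that the result reads 5'->3' along the transcript.
    """
    fragment = chrom_seq[start - 1:end]
    if strand == "-":
        return reverse_complement(fragment)
    return fragment


# Toy chromosome, not real data.
chrom = "AAACCCGGGT"
plus_utr = utr_from_coordinates(chrom, 4, 7, "+")
minus_utr = utr_from_coordinates(chrom, 4, 7, "-")
```

Taking only the complement, or only the reverse, of a (-) strand region gives a sequence that does not exist in the transcript, so seed matches found on it would be meaningless.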


modified 5.2 years ago • written 5.2 years ago by IV

Thanks a lot. I downloaded the UTR sequences from Ensembl BioMart, and the header information looks like this: ">ENSG00000001630|ENST00000003100|CYP51A1|91741465|91742978|91741465|91763844". The UTR start is the same as the transcript start, so I thought the sequence was just the complement of the UTR, not the reverse complement. Now I know I was wrong: the UTR sequence is automatically reverse-complemented. By the way, a kind friend told me that the transcript with the smallest transcript ID for a gene is the dominant transcript. Is that right? Thanks again for your kindness and patience. It really helps me a lot. Best wishes
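The coordinates in such a header can be used to work out the strand. The sketch below is hedged: it assumes the four numeric fields are, in order, 3'UTR start, 3'UTR end, transcript start and transcript end (which matches the description above but depends on the attributes selected in BioMart), and that the 3'UTR is a single contiguous block. The function name `infer_strand` is made up for this example.

```python
def infer_strand(header):
    """Guess the strand from where the 3'UTR sits within the transcript.

    Genomic coordinates always run along the forward strand, so a 3'UTR at
    the low-coordinate end of the transcript implies a (-) strand gene.
    Returns "+", "-", or None when the layout is ambiguous.
    """
    fields = header.lstrip(">").split("|")
    # Assumed field order: gene, transcript, name, UTR start, UTR end,
    # transcript start, transcript end.
    utr_start, utr_end, tx_start, tx_end = (int(f) for f in fields[3:7])
    if utr_start == tx_start and utr_end != tx_end:
        return "-"
    if utr_end == tx_end and utr_start != tx_start:
        return "+"
    return None


header = ">ENSG00000001630|ENST00000003100|CYP51A1|91741465|91742978|91741465|91763844"
strand = infer_strand(header)
```

For this header the 3'UTR starts exactly where the transcript starts, which places it at the low-coordinate end and therefore points to the (-) strand.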

written 5.2 years ago by hua.peng1314

From what I know (and from what the relevant ENSEMBL help page states), transcript numbers also indicate the level of curation: Gold transcripts start with 0, Ensembl transcripts start with 2, and Vega/Havana transcripts start with 6. Ensembl suggests starting with the CCDS and gold transcripts and also cross-checking your transcripts against EST and other expression data, in order to identify the transcripts expressed in your specific tissue or cell line, since primary transcripts are not constant between different cell types and tissues. ENSEMBL offers a wealth of external identifiers that make such a task possible.

modified 5.2 years ago • written 5.2 years ago by IV