Is the expression value of a miRNA precursor equivalent to that of its corresponding mature miRNA form?
8.7 years ago
jack

I have miRNAs and their regulatory target information from TargetScan. All the miRNA IDs (miRBase IDs) from there belong to the mature miRNA form.

For my analysis I also need to include the expression levels of these miRNAs. I'm using TCGA data, but the problem is that all miRNA IDs in the TCGA RNA-Seq samples are for miRNA precursors.

Now my questions are:

1. Is the expression value of a miRNA precursor equivalent to that of its corresponding mature miRNA form?
2. What should I do with the miRNAs that represent the stem loop?

For example, in TargetScan I have: hsa-miR-18a

but in the TCGA miRNA expression file I have:

hsa-mir-18a     : precursor
hsa-mir-18a-1  : stem loop
hsa-mir-18a-2  : stem loop


Basically, miRNA IDs without a capital R represent the miRNA precursor.

We can go further - the (lack of) capitalisation of "mir" tells us we're talking about the miRNA precursor (here)
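As a rough illustration of that naming convention, here is a minimal Python sketch that labels an ID by the capitalisation of its second field. The function name `classify_mirna_id` is made up for this example, and the rule only covers IDs of the form `species-mir-...` or `species-miR-...`; families such as let-7, where capitalisation carries no information, are reported as "unknown".

```python
def classify_mirna_id(mirna_id: str) -> str:
    """Guess whether a miRBase-style ID names a precursor or a mature miRNA.

    Assumption: the ID looks like '<species>-<mir|miR>-<number...>'.
    Anything else (for example the let-7 family) returns 'unknown'.
    """
    parts = mirna_id.split("-")
    if len(parts) < 3:
        return "unknown"
    tag = parts[1]
    if tag == "miR":
        return "mature"      # capital R: mature miRNA
    if tag == "mir":
        return "precursor"   # lower-case r: hairpin / stem-loop
    return "unknown"


if __name__ == "__main__":
    for mirna_id in ["hsa-miR-18a", "hsa-mir-18a", "hsa-let-7a-1"]:
        print(mirna_id, classify_mirna_id(mirna_id))
```

This only reads the name; it says nothing about whether a given file actually follows the convention, so it is worth spot-checking a few IDs against miRBase.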

RNA-Seq TCGA mirna next-gen • 2.1k views
8.7 years ago
pld

No, the precursor and the mature miRNAs are different things. Imagine the case where all of the enzymes for RNAi are now gone, your precursor miRNAs would still be present. I don't think it is safe to infer anything about mature miRNA product expression levels from precursors.

What is the alternative solution for this scenario?

I think that the TCGA file and miRBase might have different naming conventions, that is TCGA does not seem to follow the "R" syntax. This seems to be the source of confusion.

In your OP you have a precursor and two mature products for mir-18a (the "stem loop" products). These would correspond to your miR-18a entries in miRBase. I'd use BLAT or something similar to map the TCGA IDs to the miRBase IDs.

Thanks. You mean that hsa-mir-18a-1 and hsa-mir-18a-2, which are stem loops, are mature products and correspond to miR-18a in TargetScan?

That is what I think is going on, but it'd be better to check with some sequence alignment just to make sure.
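A very simple version of that check, short of a real aligner such as BLAT, is to test whether the mature sequence occurs inside the hairpin sequence and on which half it sits. The sketch below assumes exact matches only, and the sequences in it are invented toy strings, not real miRNA sequences; in practice you would load the real ones from the miRBase hairpin and mature FASTA downloads. The function names are hypothetical.

```python
def normalise(seq: str) -> str:
    """Upper-case the sequence and write it in the RNA alphabet (T -> U)."""
    return seq.upper().replace("T", "U")


def locate_mature(hairpin: str, mature: str):
    """Return (start, end, arm) of the first exact match of mature in hairpin.

    Coordinates are 0-based, end-exclusive. The arm is guessed from which
    half of the hairpin the match midpoint falls in (an assumption, not a
    miRBase annotation). Returns None when there is no exact match.
    """
    h, m = normalise(hairpin), normalise(mature)
    start = h.find(m)
    if start == -1:
        return None
    end = start + len(m)
    arm = "5p" if (start + end) / 2 < len(h) / 2 else "3p"
    return start, end, arm


if __name__ == "__main__":
    # Toy sequences, made up purely for illustration.
    toy_hairpin = "AAAACCCCGGGGUUUUUUUUACACACACGGGG"
    print(locate_mature(toy_hairpin, "CCCCGGGG"))
    print(locate_mature(toy_hairpin, "ACACACAC"))
    print(locate_mature(toy_hairpin, "GAGAGA"))
```

If a TCGA ID's sequence contains the TargetScan miRNA's sequence rather than equalling it, the TCGA entry is a hairpin, not a mature product.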

I checked, and they are not. E.g., for hsa-mir-3179 in TargetScan there are hsa-mir-3179-1, hsa-mir-3179-2 and hsa-mir-3179-3 in TCGA; I checked them in miRBase, and they are stem loops, which are precursors.

Maybe you're not mapping your reads against the correct database? If I understand correctly TCGA is genome sequences, so it will provide you with miRNA genes (which are transcribed into miRNA precursors), while miRBase will provide you with predicted/validated mature miRNAs:

Each entry in the miRBase Sequence database represents a predicted hairpin portion of a miRNA transcript (termed mir in the database), with information on the location and sequence of the mature miRNA sequence (termed miR). Both hairpin and mature sequences are available for searching and browsing

I think TCGA would be better if you wanted to see if there are mutations in known miRNAs, while miRBase is better for determining which mature miRNAs you have. Generally the reads in miRNA-Seq will be longer than the mature miRNA, so the references you align/map/count your reads against matter. I think this is what Chirag was getting at.

Basically, I'm building a miRNA-target regulatory network. For this reason, I need the expression levels of miRNAs from TCGA. The IDs in TCGA for miRNAs are miRBase IDs, the same as in TargetScan, but I have the problem I mentioned in the OP.

8.7 years ago
Chirag Nepal

A single pre-miRNA can have both of its arms as mature arms (denoted by -5p and -3p). The solution is to get read counts for the mature miRNA, as miRNA-Seq reads are often 18-24 nt long.

Can you elaborate on your method?

I just have expression level for miRNA precursors.

For example, take hsa-miR-1. If you download the data from miRBase, you can download either the mature miRNA coordinates or the pre-miRNA hairpins (which contain both arms and the overhang sequences). Download both and get read counts for both. You will see the difference. In cases where you see a major difference, upload your data (wig file) to a genome browser and see where and how the reads are distributed along the pre-miRNA hairpins.
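To make the comparison concrete, here is a minimal Python sketch that counts reads against a hairpin interval and against each mature arm. All coordinates and reads are toy values invented for this example, and the rule that a read must lie entirely inside the interval is an assumption; real pipelines often tolerate a few nucleotides of overhang. The function name `count_contained` is hypothetical.

```python
def count_contained(reads, interval):
    """Count reads lying entirely inside interval.

    Reads and interval are (start, end) pairs, 0-based and end-exclusive,
    assumed to be on the same chromosome and strand.
    """
    iv_start, iv_end = interval
    return sum(1 for start, end in reads if iv_start <= start and end <= iv_end)


if __name__ == "__main__":
    # Toy annotation: one hairpin with a 5p and a 3p mature arm.
    hairpin = (100, 180)
    arm_5p = (110, 132)
    arm_3p = (150, 172)

    # Toy aligned reads (start, end).
    reads = [(110, 132), (111, 132), (110, 131),  # on the 5p arm
             (150, 171),                          # on the 3p arm
             (135, 150),                          # loop region only
             (90, 112)]                           # overhangs the hairpin

    print("hairpin:", count_contained(reads, hairpin))
    print("5p arm: ", count_contained(reads, arm_5p))
    print("3p arm: ", count_contained(reads, arm_3p))
```

With these toy reads the hairpin count is larger than either arm's count, which is the kind of difference worth inspecting in a genome browser.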

