Hi all,

I have downloaded the mRNA and microRNA NGS data for BRCA from TCGA.
Now I have the expression levels of miRNAs in different samples.
My question is: which kind of microRNA has been quantified? Is it all mature miRNA, or are miRNA precursors included as well? Did they sequence miRNA that was size-selected by gel electrophoresis (to make sure that everything has the same length, as expected for mature miRNA)?
But when I look at the miRNA IDs, there is a problem:
For example, they report an expression level for hsa-mir-135a-2. When I search for it in miRBase, it is a stem-loop, and its mature form in miRBase is hsa-miR-135a-5p. So now I am having real trouble understanding which type of miRNA the expression levels refer to.

Could someone clarify this?

I believe that each mature transcript (e.g. hsa-miR-135a-5p, hsa-miR-135a-3p) is identified by a miRBase accession with the prefix MIMAT. I think that to get the mature transcript expression, you need to parse the TCGA miRNASeq isoform quantification files (ending with .mirbase21.isoforms.quantification.txt).
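To tell the two kinds of identifier apart programmatically, here is a minimal sketch. It assumes the usual miRBase conventions: mature accessions start with MIMAT and hairpin (stem-loop) accessions start with MI, while names use "miR" for mature sequences and "mir" for hairpins. The function name `classify_mirbase_id` is my own, and the accessions in the test are only format examples. Names such as hsa-let-7a-1 do not follow the mir/miR capitalisation, so the sketch reports them as "unknown" rather than guessing.

```python
import re

# Accession patterns (assumed miRBase convention):
#   MIMAT + digits -> mature sequence
#   MI + digits    -> hairpin / stem-loop
MATURE_ACCESSION = re.compile(r"^MIMAT\d+$")
HAIRPIN_ACCESSION = re.compile(r"^MI\d+$")


def classify_mirbase_id(identifier):
    """Return 'mature', 'stem-loop', or 'unknown' for a miRBase name or accession."""
    if MATURE_ACCESSION.match(identifier):
        return "mature"
    if HAIRPIN_ACCESSION.match(identifier):
        return "stem-loop"
    # Name-based check: capital R marks a mature miRNA, lowercase r a hairpin.
    if "-miR-" in identifier:
        return "mature"
    if "-mir-" in identifier:
        return "stem-loop"
    # let-7 family and other names carry no mir/miR marker.
    return "unknown"
```

With this, the ID from the question (hsa-mir-135a-2) comes out as a stem-loop and hsa-miR-135a-5p as mature.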

I think the correct way to process the isoform data is to take the max or sum of all counts associated with each mature transcript ID. Here is my script: https://github.com/teng-gao/genomics_utils/blob/master/README.md#process-tcga-mirnaseq-isoform-quantifications
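In case the link goes stale, the idea can be sketched roughly like this. This is not the linked script, just a minimal illustration under some assumptions: the file is tab-separated with a header, it has columns named `read_count` and `miRNA_region`, and mature isoforms are marked in `miRNA_region` as `mature,MIMAT...` (other rows say things like `precursor` or `stemloop`). Please check these against your own files. The function name `mature_counts` is hypothetical.

```python
import csv
from collections import defaultdict


def mature_counts(lines, how="sum"):
    """Combine isoform read counts per mature miRBase accession.

    lines: an open file or any iterable of tab-separated text lines,
           header first (assumed columns: read_count, miRNA_region).
    how:   "sum" or "max", the two options mentioned above.
    """
    if how not in ("sum", "max"):
        raise ValueError("how must be 'sum' or 'max'")

    counts_by_accession = defaultdict(list)
    for row in csv.DictReader(lines, delimiter="\t"):
        region = row["miRNA_region"]
        # Keep only isoforms annotated as mature; skip precursor etc.
        if not region.startswith("mature,"):
            continue
        accession = region.split(",", 1)[1]
        counts_by_accession[accession].append(int(row["read_count"]))

    combine = sum if how == "sum" else max
    return {acc: combine(vals) for acc, vals in counts_by_accession.items()}


# Usage sketch (the file name is a placeholder):
# with open("sample.mirbase21.isoforms.quantification.txt") as handle:
#     counts = mature_counts(handle, how="sum")
```

Summing treats all isoforms of a mature miRNA as one species, while max keeps only the dominant isoform; which one is appropriate depends on your analysis.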

Please correct me if I'm wrong!

Please read through the experimental protocol for your miRNA dataset to determine what type of miRNA is captured. I would guess that they are using random-hexamer priming on RNA isolated from patient tissue after rRNA depletion. This technique would not allow for differentiation between phosphorylated and non-phosphorylated forms. Keep in mind that miRNA-seq assays all RNAs at once, so they are not selecting a particular miRNA to focus on or cutting out a size range from a gel.

Thanks Yving, I looked at the metadata file. They describe the protocol like this:

Ligation of linkers and reverse transcription of small RNAs
"PCR with sequencing primers, size fractionation"
Sequencing on Illumina
Alignment of
Read counts per mirna isoform
Normalized expression per mirna gene

SDRF Files

So what should I do? Is the conclusion that the expression levels in the TCGA data cover all miRNAs (miRNA stem-loops, miRNA precursors, and mature miRNAs)?

Hi! Did you solve this issue? Thank you!