Question: Question about Isoform expression (miRNA) data from TCGA
15 months ago by
Vasu510 wrote:

I have downloaded the miRNA isoform expression quantification data (mirbase21.isoforms.quantification.txt) from TCGA. The data look like this:

miRNA_ID      read_count   miRNA_region
hsa-let-7b    2            precursor
hsa-let-7b    1            precursor
hsa-let-7b    1            precursor
hsa-let-7b    58           MIMAT0000063
hsa-let-7b    173          MIMAT0000063
hsa-let-7b    5723         MIMAT0000063
hsa-let-7b    26947        MIMAT0000063
hsa-let-7b    1            stemloop
hsa-let-7b    1            MIMAT0004482
hsa-let-7b    2            MIMAT0004482
hsa-let-7b    129          MIMAT0004482
hsa-let-7b    401          MIMAT0004482

Based on miRBaseConverter and information from miRTarBase, I have the names of the mature 3p and 5p forms with their accessions:

miRNAName_v21     Accession
hsa-let-7b-5p    MIMAT0000063
hsa-let-7b-3p    MIMAT0004482

So, based on the above information, I can sum the counts for the mature 3p and 5p forms. I can do the same for the main precursor, but what are stemloops? Should I include them with the precursor, or exclude them from the analysis?


15 months ago by
Sheffield, UK
i.sudbery10k wrote:

The maturation of a miRNA is a multi-step process. The extra material at the 3' and 5' ends is removed first, and then the loop, which according to the TCGA pipeline description is defined for the purposes of this pipeline as:

  1. stemloop, from 1 to 6 bases outside the mature strand, between the mature and star strand

In the TCGA pipeline, reads are classified hierarchically. Priority is given to reads mapping to the mature miRNA, so if a read aligns to something else as well as the mature, it will be counted as mature (that actually isn't the way I would do it, but never mind). Next come reads mapping to the precursor, and then reads mapping to the stemloop.

This explains the high counts for the mature but not for the precursor, despite the mature being contained within the precursor.
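To make the priority rule concrete, here is a toy sketch of hierarchical assignment. It is not the TCGA pipeline's code; the function name `classify_read`, the idea of passing a set of overlapping annotation labels, and the fallback label "unannotated" are all assumptions made for illustration.

```python
# Priority order as described above: mature first, then precursor, then stemloop.
PRIORITY = ["mature", "precursor", "stemloop"]


def classify_read(overlapping_annotations):
    """Return the highest-priority category a read overlaps.

    `overlapping_annotations` is a hypothetical set of category labels
    that the read's alignment overlaps.
    """
    for category in PRIORITY:
        if category in overlapping_annotations:
            return category
    # Hypothetical fallback label for reads overlapping none of the categories.
    return "unannotated"


# A read lying inside the mature strand also lies inside the precursor,
# but it is counted once, as mature.
print(classify_read({"mature", "precursor"}))
# A read overlapping only the precursor flank is counted as precursor.
print(classify_read({"precursor"}))
```

Under a rule like this, only reads that miss the mature strand entirely can end up in the precursor or stemloop rows, which is consistent with those rows having very low counts.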

As you are probably interested in the biologically active molecule, I would ignore everything other than the mature.
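A minimal sketch of that approach in Python, using the rows and the accession table from the question. It assumes the simplified three-column layout shown there; in the downloaded file the region field may instead look like "mature,MIMAT0000063", so the sketch takes the last comma-separated token. The function name `sum_mature_counts` is made up for this example.

```python
from collections import defaultdict

# Rows as shown in the question: (miRNA_ID, read_count, miRNA_region).
rows = [
    ("hsa-let-7b", 2, "precursor"),
    ("hsa-let-7b", 1, "precursor"),
    ("hsa-let-7b", 1, "precursor"),
    ("hsa-let-7b", 58, "MIMAT0000063"),
    ("hsa-let-7b", 173, "MIMAT0000063"),
    ("hsa-let-7b", 5723, "MIMAT0000063"),
    ("hsa-let-7b", 26947, "MIMAT0000063"),
    ("hsa-let-7b", 1, "stemloop"),
    ("hsa-let-7b", 1, "MIMAT0004482"),
    ("hsa-let-7b", 2, "MIMAT0004482"),
    ("hsa-let-7b", 129, "MIMAT0004482"),
    ("hsa-let-7b", 401, "MIMAT0004482"),
]

# Accession -> mature name, from the miRBaseConverter table in the question.
accession_to_name = {
    "MIMAT0000063": "hsa-let-7b-5p",
    "MIMAT0004482": "hsa-let-7b-3p",
}


def sum_mature_counts(rows, accession_to_name):
    """Sum read counts per mature miRNA, skipping precursor and stemloop rows."""
    totals = defaultdict(int)
    for _mirna_id, read_count, region in rows:
        # Assumption: the region is either "MIMAT..." or "mature,MIMAT...".
        accession = region.split(",")[-1].strip()
        if not accession.startswith("MIMAT"):
            continue  # precursor, stemloop, or anything else that is not mature
        # Fall back to the accession itself if no name is known for it.
        name = accession_to_name.get(accession, accession)
        totals[name] += int(read_count)
    return dict(totals)


mature_totals = sum_mature_counts(rows, accession_to_name)
print(mature_totals)
# {'hsa-let-7b-5p': 32901, 'hsa-let-7b-3p': 533}
```

For a real file, the same function could be fed rows read with the `csv` module using a tab delimiter, picking out the read_count and miRNA_region columns by header name.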


Thanks a lot, sudbery. So I will now sum the counts for the mature 3p, sum the counts for the mature 5p, and exclude the precursor and stemloop from the analysis.

written 15 months ago by Vasu510

If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.

written 15 months ago by GenoMax95k