Question: miRNA TCGA gene expression does not provide 3p or 5p information
2.2 years ago, byun.seyoun wrote:

I was trying to find target genes based on TCGA level 3 miRNA gene expression. However, TCGA does not provide 3p/5p information. Without it, how can I predict target genes? I have also been looking at the miRNA isoform expression files to see whether I could get the 3p/5p information from them. I just want to know the expression of the 3p or 5p arm. However, there are many rows per miRNA, each with different coordinates. For example:

miRNA_ID      isoform_coords               read_count  reads_per_million_miRNA_mapped  cross-mapped  miRNA_region
hsa-let-7a-1  hg19:9:96938224-96938244:+            2                        0.450031  N             precursor
hsa-let-7a-1  hg19:9:96938243-96938265:+            2                        0.450031  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938243-96938266:+           23                        5.175355  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938263:+           75                       16.876156  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938264:+         2037                      458.356397  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938265:+         4664                     1049.471889  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938266:+        28056                     6313.032443  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938267:+          249                       56.028838  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938268:+            6                        1.350092  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938244-96938269:+            1                        0.225015  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938245-96938266:+           20                        4.500308  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938247-96938266:+            5                        1.125077  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938248-96938265:+            1                        0.225015  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938248-96938266:+            2                        0.450031  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938248-96938267:+            1                        0.225015  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938249-96938270:+            1                        0.225015  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938249-96938271:+            1                        0.225015  N             mature,MIMAT0000062
hsa-let-7a-1  hg19:9:96938295-96938315:+            5                        1.125077  N             mature,MIMAT0004481
hsa-let-7a-1  hg19:9:96938295-96938316:+           11                        2.475170  N             mature,MIMAT0004481
hsa-let-7a-1  hg19:9:96938295-96938317:+            7                        1.575108  N             mature,MIMAT0004481
hsa-let-7a-1  hg19:9:96938296-96938316:+            1                        0.225015  N             mature,MIMAT0004481
hsa-let-7a-1  hg19:9:96938296-96938317:+            1                        0.225015  N             mature,MIMAT0004481
hsa-let-7a-1  hg19:9:96938296-96938318:+            1                        0.225015  N             mature,MIMAT0004481

I wonder whether I can combine all the mature,MIMAT0004481 and mature,MIMAT0000062 expression values (reads per million miRNA mapped) and divide by read_count, so that I have just one value each for hsa-let-7a-1-5p and hsa-let-7a-3p.

Thanks

2.2 years ago, abe wrote:

The TCGA files ending with mirnas.quantification.txt are for precursor miRNA only, so they won't include arm-level info. TCGA files ending with mirbase21.isoforms.quantification.txt will provide arm-level info as you have already found. If you want to do target prediction with the mature miRNA but not each isoform, then you'll need to manipulate the data in mirbase21.isoforms.quantification.txt.

Your main question is a bit unclear, so I'm interpreting it as: how can I reduce the multitude of values (read counts or rpm) for hsa-let-7a-1-5p and hsa-let-7a-3p to a single value per arm?

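If you just divide the rpm by the read count as suggested, you only recover the per-read scaling factor (1,000,000 divided by the sample's total miRNA-mapped reads, about 0.225 in your example), which is the same for every row and is not an expression value. Instead, follow this: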

"The sum of read counts by MIMAT accession (by MIMATxxxxxxxx in the miRNA_region column), excluding reads with other annotations, will result in the counts for each mature strand. (Note: In this case, the reads_per_million_miRNA_mapped (rpm) column is not typically summed to represent the rpm value for a MIMAT ID. Instead, the rpm for a MIMAT is generated from the summed counts for a MIMAT divided by the total summed MIMAT read counts for a sample multiplied by 1,000,000.)" Ref https://github.com/bcgsc/mirna

Associated publication

Chu, A., Robertson, G., Brooks, D., Mungall, A. J., Birol, I., Coope, R., Ma, Y., Jones, S., … Marra, M. A. (2015). Large-scale profiling of microRNAs for The Cancer Genome Atlas. Nucleic acids research, 44(1), e3. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4705681/
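
The procedure quoted above can be sketched in a few lines of Python. This is only a minimal illustration, not the BCGSC pipeline's own code: the function name summarize_by_mimat is made up for this example, and it assumes the isoform file is tab-delimited with the column header shown in the question. It sums read_count per MIMAT accession, skips rows with other annotations (such as precursor), and derives rpm from the summed counts rather than by summing the rpm column.

```python
import csv
import io
import re
from collections import defaultdict

# Rows annotated as "mature,MIMATxxxxxxx" are kept; everything else is skipped.
MIMAT_PATTERN = re.compile(r"^mature,(MIMAT\d+)$")


def summarize_by_mimat(handle):
    """Sum read_count per MIMAT accession and derive an rpm per MIMAT.

    `handle` is any iterable of lines from a tab-delimited isoform
    quantification file (assumed layout: the header shown in the question).
    """
    counts = defaultdict(int)
    for row in csv.DictReader(handle, delimiter="\t"):
        match = MIMAT_PATTERN.match(row["miRNA_region"].strip())
        if match is None:
            continue  # precursor or other non-mature annotation
        counts[match.group(1)] += int(row["read_count"])

    # rpm = summed counts for a MIMAT / total summed MIMAT counts * 1,000,000
    total = sum(counts.values())
    rpm = {m: c / total * 1_000_000 for m, c in counts.items()} if total else {}
    return dict(counts), rpm


# A few rows copied from the question, rebuilt as tab-delimited text.
SAMPLE_ROWS = [
    ("miRNA_ID", "isoform_coords", "read_count",
     "reads_per_million_miRNA_mapped", "cross-mapped", "miRNA_region"),
    ("hsa-let-7a-1", "hg19:9:96938224-96938244:+", "2", "0.450031", "N", "precursor"),
    ("hsa-let-7a-1", "hg19:9:96938244-96938264:+", "2037", "458.356397", "N", "mature,MIMAT0000062"),
    ("hsa-let-7a-1", "hg19:9:96938244-96938265:+", "4664", "1049.471889", "N", "mature,MIMAT0000062"),
    ("hsa-let-7a-1", "hg19:9:96938244-96938266:+", "28056", "6313.032443", "N", "mature,MIMAT0000062"),
    ("hsa-let-7a-1", "hg19:9:96938295-96938316:+", "11", "2.475170", "N", "mature,MIMAT0004481"),
    ("hsa-let-7a-1", "hg19:9:96938296-96938316:+", "1", "0.225015", "N", "mature,MIMAT0004481"),
]
sample_text = "\n".join("\t".join(row) for row in SAMPLE_ROWS)

counts, rpm = summarize_by_mimat(io.StringIO(sample_text))
print(counts)  # {'MIMAT0000062': 34757, 'MIMAT0004481': 12}
```

On a real file you would pass the open file handle instead of the sample text. The counts are summed by MIMAT across the whole file, so a mature miRNA that is annotated under more than one precursor ends up as a single value. The rpm values from this toy subset are not meaningful, because the denominator has to be the total of all MIMAT counts in the sample.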
