Question: Relative microRNA comparison from TCGA data?
TJ (United States) wrote 5.9 years ago:

I have a conceptual question that I was hoping someone could answer.

Can I say that microRNA A is expressed x-fold higher than microRNA B directly from the TCGA miRseq data? Can I do this after normalizing the data? Does it matter whether I use RSEM or RPKM values? It seems to me that the comparison should be legitimate in any case, since microRNAs are approximately the same length, but maybe I am overlooking something.

For example, I am following a paper published in Nature Communications entitled "Identification of a pan-cancer oncogenic microRNA superfamily anchored by a central core seed motif". The authors download the data and collapse the isoform reads into a single read count. They say they used reads per million microRNAs mapped, which expresses each microRNA read count as a fraction of the total microRNA population. The authors then apply upper quartile normalization, which they say is important because a subset of microRNAs (miR-143 in particular) contributes so heavily to the total read count. In the text, the authors appear to use the resulting values to compare microRNAs directly with one another.
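To make the two normalization steps described above concrete, here is a minimal Python sketch. It is not the paper's code: the counts are made-up numbers, the function names are hypothetical, and computing the upper quartile over nonzero counts with a target of 1000 is an assumption of this sketch.

```python
from statistics import quantiles

# Hypothetical collapsed read counts for one sample (illustrative numbers only).
counts = {"miR-143": 900000, "miR-21": 60000, "miR-A": 400, "miR-B": 100, "miR-C": 0}


def reads_per_million(counts):
    """Express each count as a fraction of all mapped microRNA reads, times 1e6."""
    total = sum(counts.values())
    return {mir: c * 1e6 / total for mir, c in counts.items()}


def upper_quartile_normalize(counts, target=1000.0):
    """Rescale so the 75th percentile of the nonzero counts equals `target`.

    Using only nonzero counts and target=1000 are assumptions of this sketch.
    """
    nonzero = [c for c in counts.values() if c > 0]
    upper_quartile = quantiles(nonzero, n=4, method="inclusive")[2]
    return {mir: c * target / upper_quartile for mir, c in counts.items()}


rpm = reads_per_million(counts)
uq = upper_quartile_normalize(counts)

# Both methods multiply every microRNA in the sample by one constant,
# so the within-sample ratio miR-A / miR-B is the same either way.
print(rpm["miR-A"] / rpm["miR-B"], uq["miR-A"] / uq["miR-B"])
```

Because each step is a per-sample scaling, the choice of normalization mainly affects comparisons across samples, not the ratio of two microRNAs inside the same sample.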

I definitely want the collapsed isoforms, and I think it makes sense to do the normalization. However, I would like to be able to say that a particular microRNA is expressed x-fold higher than another. Can I do this from the collapsed and normalized data?
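For illustration, the kind of statement I have in mind could be computed roughly as follows. This is only a sketch under stated assumptions: the sample values are invented, `within_sample_log2_ratios` is a hypothetical helper, and summarizing with the median across samples is my choice, not something taken from the paper.

```python
from math import log2
from statistics import median

# Hypothetical normalized values for two microRNAs in three samples.
samples = [
    {"miR-A": 400.0, "miR-B": 100.0},
    {"miR-A": 900.0, "miR-B": 300.0},
    {"miR-A": 50.0, "miR-B": 25.0},
]


def within_sample_log2_ratios(samples, mir_a, mir_b, pseudocount=0.0):
    """Return log2(A / B) for each sample; a pseudocount can guard against zeros."""
    return [
        log2((s[mir_a] + pseudocount) / (s[mir_b] + pseudocount))
        for s in samples
    ]


ratios = within_sample_log2_ratios(samples, "miR-A", "miR-B")
median_fold = 2 ** median(ratios)  # summarize on the log scale, then convert back
print(ratios, median_fold)
```

This only summarizes read abundance. Whether read abundance reflects the true amount of each microRNA equally well is a separate question that the sketch does not address.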

If this has already been answered, I apologize. I could not find it. Thanks.

next-gen R • 2.8k views
Renesh (United States) wrote 5.9 years ago:

After aligning the miRNA-seq reads to the target sequences (a miRNA database), you can calculate the expression of each miRNA under two conditions and report the fold change. It is usually better to normalize your data before computing the fold change. There are several normalization methods; RPKM and FPKM are popular choices.


Thanks PyPerl for answering!

Are you suggesting that I first normalize the data (e.g. by RPKM or upper quartile normalization) and then use an R/Bioconductor package to calculate the fold change from the normalized data? Would I be using the package to compare microRNA A against microRNA B, instead of doing a differential expression analysis between conditions (e.g. cancer versus normal) for a single microRNA? I have done the latter, but I had not thought about doing the former. Thanks again!

Comment written 5.9 years ago by TJ

Get the counts by aligning the query sequences to the reference sequences, then normalize them to RPKM or with any other method of your choice. Then use an R package for differential expression analysis. You can then compare candidate miRNAs by looking at the fold change and the p-value and/or q-value. This is the preferred approach for differential expression analysis between two conditions. You can search online for packages that calculate counts and normalize the data for differential expression.
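As a rough illustration of the between-condition fold change described above, here is a minimal Python sketch for a single miRNA. It is not a substitute for a differential expression package (it produces no p-value or q-value); the values are invented, and `log2_fold_change` and its pseudocount of 1 are assumptions of this sketch.

```python
from math import log2
from statistics import mean

# Hypothetical normalized values of one miRNA in each condition.
tumor = [799.0, 1199.0]
normal = [249.0, 249.0]


def log2_fold_change(group_a, group_b, pseudocount=1.0):
    """log2 of the ratio of group means; the pseudocount avoids division by zero."""
    return log2((mean(group_a) + pseudocount) / (mean(group_b) + pseudocount))


lfc = log2_fold_change(tumor, normal)
print(lfc, 2 ** lfc)  # log2 fold change and the corresponding fold change
```

Note that this compares one miRNA across conditions, which is a different question from comparing two miRNAs within the same samples.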

Comment written 5.9 years ago by Renesh
TJ (United States) wrote 5.9 years ago:

Edited because I moved my request for clarification to a comment.
