Question: What is the relationship between library size and normalization factor?
3.3 years ago by
Deepak Tanwar4.0k
ETH Zürich, Switzerland
Deepak Tanwar4.0k wrote:

I saw this plot

doi.org/10.3389/fgene.2016.00164


Normalization factors for the fruit set RNA-Seq data depending on corresponding library sizes. All three studied normalization methods are carried out with default settings. For all three methods, regression (dashed) lines are estimated from a simple linear regression modeling the relationship between default normalization factors and library sizes. Color key: TMM, RLE, and MRN are respectively colored in green, blue, and red. Key to symbols: Bud, Ant, and Pos stages are respectively drawn with circles, squares, and triangles.


Question: What is the relationship between library size and normalization factor? What does it mean if the regression line has an R^2 of 0.9?

modified 2.1 years ago by elie.maza0 • written 3.3 years ago by Deepak Tanwar4.0k
3.3 years ago by
Santosh Anand5.1k
Santosh Anand5.1k wrote:

Q: What is the relationship between library size and normalization factor?

The answer is right there if you read a bit further:

"Indeed, it is known that TMM normalization factors do not take into account library sizes. This fact is illustrated in Figure 1 by an almost horizontal regression line. On the contrary, RLE and MRN factors are closer to each other, and share a positive correlation with the library size."

Q: What does it mean if the regression line has an R^2 of 0.9?

The R^2 of a regression (a linear regression here) tells you how well the curve (here, a line) fits your data. If all the data points lie exactly on the line, R^2 = 1 (that is, 100%). You can also think of this in terms of correlation, which measures how well one variable can be predicted from another through a linear relationship. In fact, for a simple linear regression with a single predictor, the goodness of fit R^2 is numerically equal to the square of the Pearson correlation coefficient (rho).

R^2 = 0.9 => |rho| (Pearson correlation) = sqrt(0.9) ≈ 0.95

From either number (R^2 or rho), you can conclude that there is a strong linear correlation between the two variables: about 90% of the variance in one is explained by a linear function of the other. By looking at the line (the red or blue line, say), you can easily see that when one variable increases, so does the other (in mathematical terms, the slope of the line is positive). Note that R^2 itself is never negative, so it cannot convey this direction; the direction is given by the sign of the slope or, equivalently, the sign of rho.
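To make the link between R^2 and the Pearson correlation concrete, here is a minimal sketch in plain Python. The library sizes and normalization factors below are invented for illustration and are not taken from the paper; the helper names (`linear_fit`, `r_squared`, `pearson_r`) are likewise my own.

```python
import math

# Invented example data (NOT from the paper): library sizes in millions
# of reads and one normalization factor per sample.
lib_size = [10.0, 12.0, 15.0, 18.0, 20.0, 25.0]
norm_factor = [0.80, 0.95, 0.98, 1.10, 1.05, 1.25]


def linear_fit(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    slope, intercept = linear_fit(x, y)
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


slope, intercept = linear_fit(lib_size, norm_factor)
r2 = r_squared(lib_size, norm_factor)
r = pearson_r(lib_size, norm_factor)

print(f"slope = {slope:.4f}")
print(f"R^2   = {r2:.4f}")
print(f"r     = {r:.4f}")
print(f"r^2   = {r * r:.4f}")  # matches R^2 for a one-predictor linear fit
```

R^2 and the squared correlation agree up to floating-point error, and the direction of the relationship comes from the sign of `slope` (or of `r`), not from R^2.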

written 3.3 years ago by Santosh Anand5.1k

Thank you, Santosh Anand, for your reply. I do understand what you wrote, but what I intended to ask is: what does this mean?

I understand that there is a very good (linear) correlation between the two variables and that one variable can be predicted from the other. What is the biological interpretation?

written 3.3 years ago by Deepak Tanwar4.0k
2.1 years ago by
elie.maza0
elie.maza0 wrote:

Some normalization methods take the library size into account when calculating their normalization factors, and others do not. That is the difference between the RLE and MRN methods on the one hand, and TMM on the other. Nevertheless, the edgeR package (which uses TMM) also takes the library size into account when normalizing, but this does not appear in its "normalization factors".

Finally, the correlation coefficient does not really have a "biological" meaning, only a "statistical" one: it simply shows that some normalization factors are linked to the library size and others are not.
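As a rough illustration of the point about library size, here is a small sketch in plain Python. The count matrix is invented, and `median_of_ratios` is my own simplified version of the median-of-ratios idea behind RLE-type size factors, not the code of any package. It shows that such size factors scale with sequencing depth, while a factor defined relative to the library size (so that effective library size = library size × normalization factor, as in edgeR) no longer does.

```python
import math
from statistics import median

# Invented toy counts: rows are genes, columns are samples.
# Sample 2 is sequenced twice as deep as sample 1, sample 3 three times.
base_expression = [10, 50, 100, 200, 400]
depth = [1, 2, 3]
counts = [[g * d for d in depth] for g in base_expression]


def library_sizes(counts):
    """Total read count per sample (column sums)."""
    return [sum(column) for column in zip(*counts)]


def median_of_ratios(counts):
    """Simplified median-of-ratios size factors (RLE-type).

    For each sample, take the median over genes of
    count / geometric mean of that gene across samples.
    Genes with a zero count in any sample are skipped.
    """
    n_samples = len(counts[0])
    rows = [row for row in counts if all(c > 0 for c in row)]
    geo_means = [
        math.exp(sum(math.log(c) for c in row) / n_samples) for row in rows
    ]
    return [
        median(row[j] / gm for row, gm in zip(rows, geo_means))
        for j in range(n_samples)
    ]


lib = library_sizes(counts)
size_factors = median_of_ratios(counts)

# Factors expressed relative to library size, rescaled so that their
# geometric mean is 1. This is an illustrative construction only.
relative = [sf / n for sf, n in zip(size_factors, lib)]
scale = math.exp(sum(math.log(f) for f in relative) / len(relative))
relative_factors = [f / scale for f in relative]

# Effective library size = library size * relative factor.
effective_lib = [n * f for n, f in zip(lib, relative_factors)]

print("library sizes:    ", lib)
print("size factors:     ", [round(s, 3) for s in size_factors])
print("relative factors: ", [round(f, 3) for f in relative_factors])
print("effective lib:    ", [round(e, 1) for e in effective_lib])
```

In this toy case the samples differ only in depth, so the size factors are exactly proportional to the library sizes and the relative factors are all equal. With real data there are composition differences as well, which is what the relative factors are meant to capture.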

written 2.1 years ago by elie.maza0