Calculation of Z-score from isoform expression data
7.4 years ago
sumithra.das ▴ 10

Dear all,

I'm using level 3 RSEM-normalized data from GDAC Firehose.

I want to compare the expression of an isoform across the tumor samples of a cancer by calculating Z-scores. My question is: to calculate the Z-score of an isoform (X), what should the reference population be, and how should I calculate the mean and standard deviation of that reference population? Should I use the mean and standard deviation of all isoforms in the library, or the mean and standard deviation of isoform X alone across all tumor samples?

Which formula should I use?

Z_Score = ((expression of X in tumor sample s) - (mean expression of X across all tumor samples)) / (standard deviation of X across all tumor samples)

or

Z_Score = ((expression of X in tumor sample s) - (mean expression of all isoforms (+73K) across all tumor samples)) / (standard deviation of all isoforms across all tumor samples)

Thanks!

sumithra

RNA-Seq • 2.5k views

Why don't you use the scale function in R?

http://stat.ethz.ch/R-manual/R-devel/library/base/html/scale.html


Sorry, but how do scaling and centering help in calculating a Z-score? Please explain.

Thanks


Z-score = scaling and centering
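
To make that equivalence concrete, here is a minimal sketch with made-up numbers (the vector `x` is purely illustrative, not data from the thread). It computes the z-score by hand and with `scale()` and checks that the two agree:

```r
# Made-up expression values for one isoform across five samples
x <- c(2.0, 4.0, 4.0, 6.0, 9.0)

# Manual z-score: subtract the mean (centering),
# then divide by the sample standard deviation (scaling)
z_manual <- (x - mean(x)) / sd(x)

# scale() does both steps in one call; it returns a one-column matrix,
# so as.numeric() is used to get a plain vector back
z_scale <- as.numeric(scale(x))

# TRUE if both approaches give the same values
print(all.equal(z_manual, z_scale))
```

Both `sd()` and `scale()` use the n - 1 denominator, which is why the results match.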


Thank you, but for scaling and centering, what should my reference population be?


What exactly do you mean by reference population?

The scale function works per column. For each column it calculates the mean and the standard deviation, and every value in that column is then centered and scaled with these two values to give z-scores.

If you want row z-scores, you'll have to transpose your matrix first:

transposed_matrix <- t(matrix)
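
As a rough sketch of the full round trip, assuming a layout with isoforms in rows and samples in columns (the matrix `expr` and its names are hypothetical toy data), row z-scores could be computed like this:

```r
# Hypothetical toy matrix: rows = isoforms, columns = tumor samples
expr <- matrix(c( 1,  2,  3,  4,
                 10, 20, 30, 40,
                  5,  5,  6,  8),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("isoX", "isoY", "isoZ"),
                               paste0("sample", 1:4)))

# scale() works on columns, so transpose, scale,
# and transpose back to the original orientation
row_z <- t(scale(t(expr)))

# Each isoform now has mean 0 and standard deviation 1 across samples
print(round(rowMeans(row_z), 10))
print(apply(row_z, 1, sd))
```

Note that isoX and isoY end up with identical z-scores even though their absolute levels differ tenfold, because each row is standardized against its own mean and standard deviation. A row with no variation across samples would give NaN, since its standard deviation is zero.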


Thanks for that. The data I'm working with is normalized within each sample, so to compare the expression of a single isoform (X) across different samples I wanted to calculate Z-scores. To do so, should I use the mean and standard deviation of X only, or the mean and standard deviation of all isoforms (+73K isoforms) across all the samples?
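
The two options can be put side by side on toy data. This is only an illustrative sketch: the matrix `expr`, the isoform names, and the values are all made up, and it is not meant to settle which reference is appropriate for a given analysis.

```r
# Hypothetical toy matrix: rows = isoforms, columns = tumor samples
expr <- matrix(c(  5,   7,   9,  11,
                 100, 300, 200, 400,
                   0,   1,   0,   3),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("isoX", "isoY", "isoZ"),
                               paste0("sample", 1:4)))

x <- expr["isoX", ]

# Option 1: reference = isoform X only, across all samples
z_own <- (x - mean(x)) / sd(x)

# Option 2: reference = every isoform in every sample, pooled together
all_values <- as.vector(expr)
z_pooled <- (x - mean(all_values)) / sd(all_values)

print(round(z_own, 3))
print(round(z_pooled, 3))
```

In this toy example the pooled version gives isoX a negative score in every sample, because the pooled mean is dominated by the highly expressed isoY, while the per-isoform version is centered at zero across samples. Both are linear transformations of the same values, so the ordering of samples is the same in either case.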


Sorry, but I don't exactly understand what your data looks like. What does +73K isoforms mean? More than 73,000 isoforms? Why do you want to include these in your z-scores?


Thanks for your reply, b.nota. I think I understand Z-scores better now.