Hello Dear Community,
Please, I need help with gene expression data from primary tumors and cell lines. In particular, I'm working with gene expression data (mRNA microarray) for glioblastoma.
The cell line data come from https://www.cancerrxgene.org/downloads, which offers 2 options:
- Expression [Raw] Affymetrix Human Genome U219 array data at ArrayExpress (E-MTAB-3610)
- Expression [Preprocessed] RMA normalised basal expression profiles for all the Cell lines. The only values this file contains are z-scores, and its regulation column is defined as: zscore < -2 (under); zscore > 2 (over); otherwise normal.
The primary tumor data come from https://www.cbioportal.org/datasets [Glioblastoma Multiforme (TCGA, Provisional)]. Here I found different types of files, including:
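To make sure I am reading the regulation column correctly, this is the rule as I understand it, written as a small Python sketch (the function name `regulation_label` is my own and does not come from the downloaded files):

```python
def regulation_label(zscore: float) -> str:
    """Map a z-score to the label I believe the regulation column uses."""
    if zscore < -2:
        return "under"
    if zscore > 2:
        return "over"
    # Everything in [-2, 2], including the boundaries, counts as normal.
    return "normal"


labels = [regulation_label(z) for z in (-2.5, 0.3, 2.0, 2.1)]
print(labels)
```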
- Data Expression
- Data Expression Z-Score
Here's my list of questions:
1. When I try to reverse the z-score standardization of the primary tumor dataset "Data Expression Z-Score", I can't obtain the same values as in Data Expression. The same happens the other way around: applying z-score standardization to Data Expression does not reproduce the values in Data Expression Z-Score. Curiously, the empirical mean and variance of the Data Expression Z-Score dataset are not 0 and 1. As far as I know, the z-score subtracts from each feature (gene) its mean and divides by its standard deviation. Any ideas?
2. The cell line gene expression dataset "Expression [Preprocessed] RMA normalised" uses a normalization called RMA. Is this normalization standard for every microarray gene expression dataset? Can I assume that the primary tumor data (in particular the Data Expression Z-Score dataset) already have that normalization?
3. If I assume that cell lines represent well enough what is happening in primary tumors, what would be the appropriate way to append the cell line data to the primary tumor gene expression data? Which dataset with which one? I was thinking of undoing the standardization of the cell line gene expression data with RMA normalization (which reports a "z-score") and appending the result to the raw primary tumor dataset Data Expression, but I can't unless I am able to do (1).
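For question (1), this is roughly the round trip I am attempting, as a minimal Python sketch on made-up numbers. The values and the names `zscore` and `inverse_zscore` are only illustrative, and I am assuming the mean and standard deviation are computed per gene across all samples, which may not be how the portal computes them:

```python
from statistics import mean, pstdev


def zscore(values):
    """Standardize one gene's expression values across samples."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation (an assumption)
    if sigma == 0:
        raise ValueError("constant expression: z-score is undefined")
    return [(v - mu) / sigma for v in values], mu, sigma


def inverse_zscore(z, mu, sigma):
    """Undo the standardization, given the mean and standard deviation used."""
    return [x * sigma + mu for x in z]


# Made-up expression values for a single gene in four samples.
expr = [5.1, 6.3, 7.0, 8.4]
z, mu, sigma = zscore(expr)
restored = inverse_zscore(z, mu, sigma)

print(z)
print(restored)
```

As far as I can tell, the inverse step needs the exact mean and standard deviation that were used in the forward step, since they cannot be recovered from the z-scores alone.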
Thanks so much!