I am using the TCGA portal to get mRNA expression data for various cancer studies (e.g. lung, liver, thyroid). I have two questions about the data:
- Some cancer studies on TCGA have "mRNA expression RNASeq V2 RSEM" values and corresponding "z-scores". I am confused about what the "mRNA expression z-Scores (RNA Seq V2 RSEM)" data consists of. How are the z-scores calculated, and what do they represent?
- We have been on the lookout for control datasets for the cancer studies on TCGA. Does anyone know of a good place to find control datasets for tissues such as lung, liver, and thyroid (basically all the foregut tissues)? We are working with control data from GTEx, but GTEx provides RPKM values while TCGA provides RSEM or RSEM z-score values, so we have to do a lot of scaling, normalization, and transformation to compare these disparate datasets. We would like to know whether any mRNA expression data (obtained via RNASeq V2 RSEM) exists for controls.
UPDATE: I have posted the second part as a separate question here.