I would like to use the masked copy number segment data from TCGA, available on the Xena browser, and correlate it with gene expression values. Both data sets can be found here: https://xenabrowser.net/datapages/?cohort=GDC%20TCGA%20Liver%20Cancer%20(LIHC)&removeHub=https%3A%2F%2Fxena.treehouse.gi.ucsc.edu%3A443
The Xena web tool itself supports this kind of correlation and gives plausible results: CNV values often do correlate with expression level. On the other hand, I see that in the GISTIC pipeline only genes with a value ≤ -0.3 or ≥ 0.3 are considered deleted or amplified, respectively.
I have seen a few papers that did something similar, but I would like to know whether a statistical test like this makes sense and, if so, which expression normalization would be best.
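
To make the question concrete, here is a minimal sketch of the kind of test I have in mind. It assumes each gene has already been mapped to one segment mean per sample and that expression is on a log2 scale. The numbers are made up for illustration (not real TCGA values), and the helper names `call_cnv`, `average_ranks`, `pearson` and `spearman` are my own, not part of Xena or GISTIC.

```python
import math

# Cutoff taken from the GISTIC-style threshold mentioned above (assumption).
CNV_THRESHOLD = 0.3


def call_cnv(segment_mean, threshold=CNV_THRESHOLD):
    """Label a segment mean as 'loss', 'gain', or 'neutral'."""
    if segment_mean <= -threshold:
        return "loss"
    if segment_mean >= threshold:
        return "gain"
    return "neutral"


def average_ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values for one gene across six samples.
segment_means = [-0.8, -0.35, -0.05, 0.1, 0.4, 0.9]
log2_expression = [2.1, 3.0, 4.2, 4.0, 5.5, 6.3]

rho = spearman(segment_means, log2_expression)
calls = [call_cnv(v) for v in segment_means]

print(f"Spearman rho = {rho:.3f}")
print(calls)
```

I used a rank-based correlation here because, for a single gene, it does not change under any monotone transformation of the expression values such as log2(x + 1). My question is whether this per-gene test on the continuous segment values is reasonable, or whether the values should first be thresholded into loss/neutral/gain as in `call_cnv`.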