I am new here and to bioinformatics, and I would appreciate your help. I am currently looking into correlating gene expression and CNV data from TCGA, most probably for colorectal or ovarian cancer. After some data exploration, I found out that only a small percentage of the samples are from normal tissue. Given that, should DEG identification be done only between paired (tumor - normal) samples, even if the statistical power would be low?

For correlating the two data types, which would be the more meaningful analysis:
1. a correlation between DEGs and amplified/deleted genes, or
2. a correlation between expression and CNV, using all the expression data from tumor samples without taking differential expression into account?
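To make option 2 concrete, here is a minimal sketch of what I have in mind: a per-gene correlation between expression and copy number across the tumor samples that have both measurements. Everything in it is an assumption for illustration only: the gene names, sample IDs, values, and the helper names `pearson` and `per_gene_correlation` are hypothetical, and I used a plain Pearson correlation just to keep it self-contained (a rank-based correlation might be more appropriate for real data).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")
    return cov / sqrt(var_x * var_y)


def per_gene_correlation(expression, copy_number, min_samples=3):
    """Correlate expression with copy number for each gene present in both tables.

    Both inputs map gene -> {sample_id: value}. Only samples measured in
    both tables are used, and genes with too few shared samples are skipped.
    """
    result = {}
    for gene in sorted(set(expression) & set(copy_number)):
        shared = sorted(set(expression[gene]) & set(copy_number[gene]))
        if len(shared) < min_samples:
            continue
        expr_values = [expression[gene][s] for s in shared]
        cn_values = [copy_number[gene][s] for s in shared]
        result[gene] = pearson(expr_values, cn_values)
    return result


# Hypothetical toy data (tumor samples only), not real TCGA values.
expression = {
    "GENE_A": {"S1": 2.0, "S2": 4.0, "S3": 6.0, "S4": 8.0},
    "GENE_B": {"S1": 5.0, "S2": 1.0, "S3": 4.0, "S4": 2.0},
}
copy_number = {
    "GENE_A": {"S1": -1.0, "S2": 0.0, "S3": 1.0, "S4": 2.0},
    "GENE_B": {"S1": 0.0, "S2": 0.0, "S3": 0.0, "S4": 0.0},
}

correlations = per_gene_correlation(expression, copy_number)
for gene, r in correlations.items():
    print(gene, r)
```

In this toy example GENE_A's expression tracks its copy number exactly, while GENE_B has no copy number variation at all, so its correlation is undefined. Option 1 would instead restrict this kind of comparison to genes already called as differentially expressed.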
Thanks for helping!