I wanted to tackle the TCGA CGH (comparative genomic hybridization) data from their Glioblastoma Project. I was surprised that they compare their tumour and normal samples separately to a reference genome! Nevertheless, for a few cases they actually put the tumour and the normal sample on the same chip, which allows a direct comparison of gains and losses between normal and tumour.
So I can get something like this:
barcode                              chromosome  start     stop      num.mark  seg.mean (log2)
tumour:TCGA-06-0238-01A-02D-0311-04  3           85466920  85652956  24        -0.7886
normal:TCGA-06-0238-10A-01D-0311-04  3           85458029  85652956  25        -0.7479
Given their high similarity (the two seg.mean values differ by only about 0.04), I can assume that there is no loss in the tumour compared to the normal CGH.
But I am not quite sure how I should treat the cases where the segments are very different in size, where the difference between the values is bigger, where there is a loss/gain in the normal but not in the tumour, etc.
I am tempted to just map the segments to genes and do a subtraction, but since they are two different experiments, I doubt this is correct. Any suggestions / thoughts?
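To make the subtraction idea concrete, here is a minimal sketch of what I have in mind, applied directly to the two segments quoted above rather than to genes. The names `Segment` and `overlap_diff` are made up for illustration and do not come from any TCGA tool. It only takes the shared interval of the two segments and subtracts the normal seg.mean from the tumour seg.mean; it says nothing about whether that difference is statistically meaningful, which is exactly my doubt.

```python
from typing import NamedTuple, Optional, Tuple


class Segment(NamedTuple):
    """One row of the segment table (hypothetical container for illustration)."""
    chrom: str
    start: int
    stop: int
    num_mark: int
    seg_mean: float  # log2 ratio against the reference genome


def overlap_diff(tumour: Segment, normal: Segment) -> Optional[Tuple[int, int, float]]:
    """Return (start, stop, tumour seg.mean minus normal seg.mean) for the
    interval shared by both segments, or None if they do not overlap."""
    if tumour.chrom != normal.chrom:
        return None
    start = max(tumour.start, normal.start)
    stop = min(tumour.stop, normal.stop)
    if start > stop:
        return None
    return start, stop, tumour.seg_mean - normal.seg_mean


# The two segments from the example above.
tumour = Segment("3", 85466920, 85652956, 24, -0.7886)
normal = Segment("3", 85458029, 85652956, 25, -0.7479)

result = overlap_diff(tumour, normal)
print(result)
```

For this pair the shared interval is 85466920 to 85652956 and the difference is about -0.04 in log2 units, i.e. close to zero. The open question is what to do when the overlap is only partial or the difference is large.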