Question: TCGA's CNV (SNP array) level 3 data, confusing descriptions
Mdeng510 wrote:

Once again I am struggling to reconcile TCGA's data with its description. As the title says, this is about CNV data from SNP arrays.

The wiki page "SNP array-based data" says:

Level 3 data describes regions of the genome that seem to have segmental duplications or deletions in the tumor compared to the normal sample for the patient

To me this sounds like the relative CNV values have been calculated for paired (tumor and control/normal) samples. But then I ask myself: why are normal samples available at all? In the case of somatic mutations, the tumor and matched normal files do not differ (read C: Tcga: "Tumor, Matched Normal" Vs. "Normal, Matched Tumor"), but here they do. Why are matched normal samples available if they have already been used in the calculations, and why do case and control differ?

Zhenyu Zhang270 wrote:

I don't quite understand your question. My understanding is that TCGA genotypes both the tumor and the normal sample of the same patient, and that some of the somatic CNV segmentations are controlled against the paired normal (germline) CNV.

Exactly. And for those that are controlled against a normal sample, the analysis pipeline is not clear to me.

Let's say that for one patient the region chr1:1000-1200 has a seqMean of 0.2 in the normal and 0.7 in the tumor sample. What do I get when I download the T and N samples? According to "SNP array-based data", I should get the relative expression, e.g. relativeExpression(N, T) = relativeExpression(0.2, 0.7). But then why do they still provide the control? And to come back to the somatic SNP example: they solved this issue by providing two identical files for one patient, so there the T and N files are the same.

Long story short: what data do I get in each file?

David Wang0 wrote:

I also wonder why CNV data for normal samples are available, since they have already been used for normalization in the level 3 data. If I want to identify copy number gain or loss in the tumor, should I compare the CNV data from both tumor and normal? Thanks.


UT MD Anderson Cancer Center
roelverhaak0 wrote:

Level 3 copy number data in TCGA is the tumor relative to the normal, so T-N.

written 5.9 years ago by roelverhaak0
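
To make the "T-N" reading concrete, here is a minimal sketch using the hypothetical region and seqMean values from the question (chr1:1000-1200, 0.2 in the normal, 0.7 in the tumor). The `Segment` class and the `tumor_minus_normal` function are made up for illustration and are not part of any TCGA tool. The sketch also assumes that both values are on the same (e.g. log2-ratio) scale, so that a plain subtraction is meaningful. It does not claim to reproduce the actual TCGA pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One copy number segment (hypothetical structure, for illustration only)."""
    chrom: str
    start: int
    end: int
    seg_mean: float  # assumption: tumor and normal values share the same scale


def tumor_minus_normal(tumor: Segment, normal: Segment) -> Segment:
    """Return the tumor segment with the normal value subtracted (T - N)."""
    same_region = (tumor.chrom, tumor.start, tumor.end) == (
        normal.chrom,
        normal.start,
        normal.end,
    )
    if not same_region:
        raise ValueError("tumor and normal segments must cover the same region")
    return Segment(
        tumor.chrom, tumor.start, tumor.end, tumor.seg_mean - normal.seg_mean
    )


# Values taken from the example in the question; they are not real data.
normal = Segment("chr1", 1000, 1200, 0.2)
tumor = Segment("chr1", 1000, 1200, 0.7)

relative = tumor_minus_normal(tumor, normal)
print(round(relative.seg_mean, 3))  # 0.5
```

Under these assumptions, a tumor-relative-to-normal value for the example region would be 0.5. Real segment boundaries usually differ between tumor and normal, which this sketch deliberately ignores by requiring identical regions.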
