Question

TCGA's CNV (SNP Array) level 3 data, confusing descriptions

9.7 years ago

Mdeng ▴ 520

Hello everyone,

Once again I am struggling to reconcile TCGA's data with its documentation. As the title says, this is about CNV data from SNP arrays.

The wiki page "SNP array-based data" says:

Level 3 data describes regions of the genome that seem to have segmental duplications or deletions in the tumor compared to the normal sample for the patient

To me this sounds like relative CNV values have been calculated for paired (tumor and control/normal) samples. But then I ask myself: why are normal samples available at all? In the case of somatic mutations, the tumor and matched normal files do not differ (read here), but for CNV they do. Why are matched normal samples available if they have already been used in the calculation, and why do the case and control files differ?

With all the best,
Mario

cnv level3 tcga unpaired paired • 3.9k views

updated 3.1 years ago by Ram 43k • written 9.7 years ago by Mdeng ▴ 520

Ram · Answer 1 · 2014-08-20

9.7 years ago

Zhenyu Zhang ★ 1.2k

I don't quite understand your question. My understanding is that TCGA genotypes both the tumor and the normal sample of the same patient, and that some of the somatic CNV segmentations are controlled against the paired normal's germline CNV.

updated 3.1 years ago by Ram 43k • written 9.7 years ago by Zhenyu Zhang ★ 1.2k

Exactly. And for those that are controlled with a normal sample, the analysis pipeline is not clear to me.

Let's say that for one patient the region chr1:1000-1200 has a seqMean of 0.2 in the normal sample and 0.7 in the tumor sample. What do I get when I download the T and N samples? According to "SNP array-based data", I should get the relative expression, e.g. relativeExpression(N, T) = relativeExpression(0.2, 0.7). But then, why do they still provide the control? To come back to the somatic SNP example: there they solved this issue by providing two identical files per patient, so the T and N files are the same.

Long story short: what data do I get in each file?

updated 3.1 years ago by Ram 43k • written 9.7 years ago by Mdeng ▴ 520

Ram · Answer 2 · 2014-10-27

9.5 years ago

David Wang • 0

I also wonder why CNV data are available for normal samples, since they have already been used for normalization in the level 3 data. If I want to identify copy number gain or loss in the tumor, should I compare the CNV data from both tumor and normal? Thanks.

updated 3.1 years ago by Ram 43k • written 9.5 years ago by David Wang • 0

Ram · Answer 3 · 2014-10-28

9.5 years ago

roelverhaak • 0

Level 3 copy number data in TCGA is the tumor relative to the normal, so T-N.
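
To make "T-N" concrete, here is a minimal Python sketch using the hypothetical region from the question above (chr1:1000-1200, 0.2 in the normal, 0.7 in the tumor). It takes "T-N" literally as a subtraction, which only makes sense if the segment means are on a log scale; that is an assumption, as is the idea that tumor and normal segments share identical boundaries. The `Segment` class and `tumor_minus_normal` function are illustrative names, not part of any TCGA tool or file format.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One copy-number segment (hypothetical structure for illustration)."""
    chrom: str
    start: int
    end: int
    seg_mean: float  # assumed to be a log-scale segment mean


def tumor_minus_normal(tumor: Segment, normal: Segment) -> float:
    """Return T - N for two segments that cover exactly the same region."""
    same_region = (tumor.chrom, tumor.start, tumor.end) == (
        normal.chrom, normal.start, normal.end)
    if not same_region:
        raise ValueError("tumor and normal segments must cover the same region")
    return tumor.seg_mean - normal.seg_mean


# Values taken from the example in the question, not from real TCGA data.
normal = Segment("chr1", 1000, 1200, 0.2)
tumor = Segment("chr1", 1000, 1200, 0.7)

relative = tumor_minus_normal(tumor, normal)
print(f"T - N = {relative:.2f}")  # prints "T - N = 0.50"
```

Under these assumptions, a positive value suggests a gain in the tumor relative to the normal and a negative value a loss. Real segmentations rarely share breakpoints between tumor and normal, so an actual comparison would first need to intersect the two sets of segments.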

updated 3.1 years ago by Ram 43k • written 9.5 years ago by roelverhaak • 0