Question: TCGA Lack Of Controls - Workarounds?
6.2 years ago by
European Union
dirigible2012 wrote:

I would like to analyse methylation differences in cancer cells using the TCGA data, but when I browse the metadata I see that only one of the sets (bladder cancer) has matched normals, and only for a handful of cases at that.

Has anybody else run into this problem, and how did you solve it? Is it valid to use normal tissue from another data source?

Thank you,


tcga methylation • 2.8k views
modified 6.2 years ago by Chris Miller • written 6.2 years ago by dirigible2012

How about a pooled normal? Try to select a set of normals that is as similar as possible to the tumor (in a biological or computational sense, depending on your purpose). Without matched normals, though, this is still not ideal.

written 6.2 years ago by xb420

Pooled normal would be great, and there's actually a data category in TCGA for that, but very few tumour types actually have their corresponding normals. I suppose the problem is what Chris is saying below - very few people are going to consent to having even tiny bits of healthy tissue removed! (Although I'm finding this so annoying I'm tempted to volunteer...)

written 6.2 years ago by dirigible2012
6.2 years ago by
Chris Miller
Washington University in St. Louis, MO
Chris Miller wrote:

It's often difficult to get appropriate matched normals for tumor methylation or expression data, as they'd have to be tissue-matched. If you're working with glioblastoma, you can't take a chunk of a patient's healthy brain. The same goes for blood cancers: it's really difficult to separate leukemic cells from non-leukemic cells and get a matched normal from the same patient. (For genomic DNA calls, you can just use non-proximal blood or skin.)

Your best bet is to find healthy samples of that tissue type from another source. I'd start with GEO, and would definitely consider pooling the normals to help smooth out differences specific to only one of your normal samples. Also beware of batch effects, since it's likely that different facilities generated the data.
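
To make the pooling idea concrete, here is a minimal sketch in plain Python. It assumes each sample has already been reduced to a mapping from probe ID to beta value; the sample names, probe IDs, and the helper names `pool_normals` and `delta_beta` are made up for illustration and don't come from TCGA, GEO, or any library. It takes the per-probe median across the normals, so a single odd sample can't dominate, and then subtracts that pooled profile from a tumor sample.

```python
from statistics import median


def pool_normals(normals):
    """Collapse several normal samples into one reference profile.

    `normals` maps sample ID -> {probe ID -> beta value in [0, 1]}.
    Only probes measured in every sample are kept, and the per-probe
    median is used so that one unusual sample cannot dominate.
    """
    if not normals:
        raise ValueError("need at least one normal sample")
    profiles = list(normals.values())
    shared = set(profiles[0])
    for profile in profiles[1:]:
        shared &= set(profile)
    return {probe: median(p[probe] for p in profiles) for probe in sorted(shared)}


def delta_beta(tumor, pooled):
    """Tumor minus pooled-normal beta for every probe present in both."""
    return {probe: tumor[probe] - pooled[probe] for probe in pooled if probe in tumor}


# Toy data (hypothetical values): normal_C is an outlier at cg0001
# and is missing cg0003 entirely.
normals = {
    "normal_A": {"cg0001": 0.10, "cg0002": 0.80, "cg0003": 0.50},
    "normal_B": {"cg0001": 0.12, "cg0002": 0.78, "cg0003": 0.52},
    "normal_C": {"cg0001": 0.60, "cg0002": 0.82},
}
tumor = {"cg0001": 0.70, "cg0002": 0.30, "cg0003": 0.55}

pooled = pool_normals(normals)
diff = delta_beta(tumor, pooled)

for probe, value in diff.items():
    print(f"{probe}\tpooled={pooled[probe]:.2f}\tdelta={value:+.2f}")
```

Note that this does nothing about batch effects. If the normals come from a different facility or platform than the tumors, the differences would need to be checked and corrected for separately before reading anything into the deltas.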

written 6.2 years ago by Chris Miller