Question

What Sample Purity Is Needed For An Accurate Gene Expression Assay

12.2 years ago

Obi Griffith 20k

I realize this question probably has no single correct answer, but I am curious whether anyone has relevant experience or can offer a general rule of thumb.

Suppose you have a relatively simple RT-PCR-based assay of expression levels, normalized against some housekeeping genes and used to calculate a diagnostic or prognostic score related to patient outcome. A good example would be Oncotype's 21-gene test, which predicts the probability of disease recurrence in breast cancer. Such a test typically operates on thin slices from a block of tumor tissue that has been fixed and preserved (e.g., FFPE), so you can expect a certain amount of non-tumor (normal, stromal, etc.) tissue in the sample. How high does the tumor purity have to be to expect usable data? At what point does the noise added by contaminating non-tumor RNA prevent a useful result? Is anyone aware of studies or experiments investigating this issue?
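To make the concern concrete, here is a toy sketch of how dilution by normal tissue could pull a normalized expression value toward the normal level. It rests on simplifying assumptions that are mine, not from any published assay: each tissue contributes RNA in proportion to its fraction of the sample, mixing is linear on the unlogged expression scale, and the housekeeping reference is expressed equally in tumor and normal tissue. The function names and all numeric values are hypothetical.

```python
import math


def mixed_expression(tumor, normal, purity):
    """Linear-scale expression of a sample that is `purity` tumor, the rest normal.

    Assumes (hypothetically) that RNA contribution is proportional to
    tissue fraction and that mixing is linear on the unlogged scale.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must be between 0 and 1")
    return purity * tumor + (1.0 - purity) * normal


def log2_normalized(target, reference):
    """Log2 ratio of a target gene to a housekeeping reference gene."""
    return math.log2(target / reference)


# Hypothetical values: target gene is 8-fold higher in tumor than in normal,
# and the housekeeping reference is identical in both tissues.
tumor_target, normal_target = 8.0, 1.0
reference = 1.0

for purity in (1.0, 0.9, 0.8, 0.7, 0.6):
    observed = mixed_expression(tumor_target, normal_target, purity)
    print(f"purity {purity:.0%}: log2 ratio {log2_normalized(observed, reference):.2f}")
```

A real multi-gene score combines many such values, so the direction of the net shift depends on which genes are up or down in normal tissue; this sketch only shows the single-gene dilution effect.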

gene expression • 3.5k views

Updated 12.2 years ago by Neilfws 49k • written 12.2 years ago by Obi Griffith 20k

score 1 · Answer 1 · 2013-04-16

12.2 years ago

Neilfws 49k

Quite a few studies include non-tumor contamination in their statistical models. A web search with appropriate keywords should return useful results.

Here's one: Systematic Bias in Genomic Classification Due to Contaminating Non-neoplastic Tissue in Breast Tumor Samples.

Thanks for this. To summarize how it addresses my question: the authors looked at the effect of increasing normal-tissue contamination on Oncotype, Mammaprint and PAM50 in 55 breast cancer samples with matched normals. PAM50 showed a predictable, consistent direction of bias that could be corrected for, whereas Oncotype and Mammaprint showed an unpredictable, inconsistent direction of bias. In other words, for those two tests, increasing amounts of normal contamination caused misclassifications in both directions (better or worse risk of relapse). Take Figure 1D (Oncotype) as an illustration: at 90% tumor purity only 3/55 (5.5%) samples are misclassified, but this gets steadily worse, rising to 9.1% at 80% purity and 12.7% at 70% purity, then jumping to 21.8% at 60% purity. Keep in mind that these are misclassifications relative to the Oncotype calls, and those calls themselves only have a certain accuracy. I suspect that below 60% purity, the number of misclassifications makes the overall performance of the test not much better than random assignment. The authors state that "Both of these assays (Mammaprint and Oncotype), in their commercial forms, have implemented quality control strategies to account for tumor nuclei content. These strategies appear important given that these assays misclassify a large number of non-neoplastic specimens as more aggressive tumor types".
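As a quick sanity check on those percentages, here is a small sketch that converts them back to sample counts out of 55. Only the 3/55 figure is stated explicitly above; the other counts are inferred by rounding, so treat them as my assumption rather than numbers taken from the paper. The names `reported_pct` and `implied_count` are just for illustration.

```python
N_SAMPLES = 55

# Misclassification rates (percent) at each tumor purity (percent),
# as quoted from Figure 1D (Oncotype) in the comment above.
reported_pct = {90: 5.5, 80: 9.1, 70: 12.7, 60: 21.8}


def implied_count(pct, n=N_SAMPLES):
    """Nearest whole number of samples corresponding to a percentage of n."""
    return round(pct / 100 * n)


for purity, pct in reported_pct.items():
    k = implied_count(pct)
    # Recompute the percentage from the inferred count to confirm it
    # rounds back to the quoted value.
    print(f"{purity}% purity: ~{k}/{N_SAMPLES} misclassified ({100 * k / N_SAMPLES:.1f}%)")
```

Each inferred count rounds back to the quoted percentage, so the figures are at least internally consistent with a 55-sample cohort.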

12.2 years ago by Obi Griffith 20k