Question: Prediction versus Association: how to evaluate associated traits?
4.9 years ago, Iryna Nikolayeva wrote:

I wonder how to evaluate the performance of tools that associate a variable (for example a SNP or a gene) with a phenotype. In a few papers [1,2], the authors ask whether variables that have been significantly associated with a phenotype improve prediction of that phenotype. Is this a correct way to evaluate the associated traits? What does it mean if an associated trait doesn't improve prediction of the phenotype?

I also have a few related questions:
1) Is there a difference in the "properties" of variables that are good for prediction and those that are good for association (example: variability between patients)?

2) Why would we sometimes go for techniques that associate a variable (example: SNP, gene) to an outcome variable (example: phenotype), rather than a technique that improves prediction?

Thank you very much in advance for your responses!


[1] Dufresne, L. et al. (2014). Pathway analysis for genetic association studies: to do, or not to do? That is the question. BMC Proceedings. doi:10.1186/1753-6561-8-S1-S103

[2] Staiger, C., Cadot, S., Kooter, R., Dittrich, M., Müller, T., Klau, G. W., & Wessels, L. F. A. (2012). A critical evaluation of network and pathway-based classifiers for outcome prediction in breast cancer. PLoS ONE, 7(4), e34796. doi:10.1371/journal.pone.0034796

4.9 years ago, Devon Ryan (Freiburg, Germany) wrote:

If the association isn't predictive, then either (1) the effect is so small that you have to wonder how relevant it is, (2) the prediction method isn't appropriate for how the variant actually leads to the phenotype, or (3) it's just a spurious finding.
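Case (1) can be illustrated with a small simulation. This is only a sketch under assumed values: the function names, the minor allele frequency of 0.3, the per-allele log-odds of 0.1 and the sample size of 100,000 are all made up for illustration and do not come from either paper. With a large enough sample, a SNP with a tiny effect is strongly associated with case/control status, yet using it as the sole predictor gives an AUC only slightly above 0.5.

```python
import math
import random


def simulate_snp(n, maf=0.3, log_odds_per_allele=0.1, seed=1):
    """Simulate allele counts (0/1/2) and case/control status for n people.

    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    genotypes, status = [], []
    for _ in range(n):
        g = int(rng.random() < maf) + int(rng.random() < maf)
        p_case = 1.0 / (1.0 + math.exp(-log_odds_per_allele * g))
        genotypes.append(g)
        status.append(1 if rng.random() < p_case else 0)
    return genotypes, status


def association_p_value(genotypes, status):
    """Two-sided z-test for a difference in mean allele count, cases vs controls."""
    cases = [g for g, y in zip(genotypes, status) if y == 1]
    controls = [g for g, y in zip(genotypes, status) if y == 0]

    def mean_and_var(xs):
        m = sum(xs) / len(xs)
        return m, sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

    m1, v1 = mean_and_var(cases)
    m0, v0 = mean_and_var(controls)
    z = (m1 - m0) / math.sqrt(v1 / len(cases) + v0 / len(controls))
    return math.erfc(abs(z) / math.sqrt(2))


def auc(scores, status):
    """Chance that a random case outscores a random control (ties count 0.5)."""
    case_counts, control_counts = {}, {}
    for s, y in zip(scores, status):
        target = case_counts if y == 1 else control_counts
        target[s] = target.get(s, 0) + 1
    n_cases = sum(case_counts.values())
    n_controls = sum(control_counts.values())
    wins = 0.0
    for s_case, c in case_counts.items():
        for s_ctrl, k in control_counts.items():
            if s_case > s_ctrl:
                wins += c * k
            elif s_case == s_ctrl:
                wins += 0.5 * c * k
    return wins / (n_cases * n_controls)


genotypes, status = simulate_snp(n=100_000)
p_value = association_p_value(genotypes, status)
snp_auc = auc(genotypes, status)
print(f"association p-value: {p_value:.2e}")
print(f"AUC using the SNP alone: {snp_auc:.3f}")
```

The z-test on mean allele count is a simple stand-in for whatever association test a real study would use; the point is only that significance scales with sample size while discrimination does not.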

1) To my mind no, but I would defer to others here.

2) To test how predictive a finding is, you need a separate dataset (or an initial dataset large enough that you can subsample it and still have enough power). This can be a deal-breaker. There's also the fact that we usually don't care about the prediction part. In many disease cases, patients already have the disease, so there's nothing to predict. Rather, if we can find things that are associated with the disease, then we can develop a treatment that targets that change and will hopefully alter the patient's phenotype. Of course, if you want to do screening or to determine response to a treatment, then prediction is highly relevant.
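Why a separate dataset matters can be sketched with purely null data. In the hypothetical setup below (400 samples, 500 SNPs, a coin-flip phenotype, all names and numbers assumed for illustration), no SNP has any real effect. Picking the SNP that looks best in one half of the data still gives an optimistic AUC in that half, while the same SNP evaluated on the held-out half should fall back toward 0.5.

```python
import random

rng = random.Random(7)
N_SAMPLES, N_SNPS, MAF = 400, 500, 0.3  # illustrative assumptions

# Null data: the phenotype is a coin flip, independent of every SNP.
labels = [rng.randint(0, 1) for _ in range(N_SAMPLES)]
snps = [
    [int(rng.random() < MAF) + int(rng.random() < MAF) for _ in range(N_SAMPLES)]
    for _ in range(N_SNPS)
]

half = N_SAMPLES // 2
discovery, held_out = slice(0, half), slice(half, N_SAMPLES)


def auc(scores, status):
    """Chance that a random case outscores a random control (ties count 0.5)."""
    cases = [s for s, y in zip(scores, status) if y == 1]
    controls = [s for s, y in zip(scores, status) if y == 0]
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Select the SNP (and direction of effect) that looks best in the discovery half.
best_index, best_flipped, discovery_auc = 0, False, -1.0
for j, snp in enumerate(snps):
    a = auc(snp[discovery], labels[discovery])
    flipped = a < 0.5  # the allele appears "protective" in the discovery half
    oriented = 1.0 - a if flipped else a
    if oriented > discovery_auc:
        best_index, best_flipped, discovery_auc = j, flipped, oriented

# Evaluate the same SNP, with the same direction, on the held-out half.
raw = auc(snps[best_index][held_out], labels[held_out])
held_out_auc = 1.0 - raw if best_flipped else raw

print(f"selected SNP index: {best_index}")
print(f"AUC in discovery half: {discovery_auc:.3f}")
print(f"AUC in held-out half:  {held_out_auc:.3f}")
```

The gap between the two AUCs is selection bias alone, which is why in-sample performance of an associated variable says little about how predictive it is.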
