[statistics question] What's the right process to find and use interactions between variables?
7.6 years ago
moxu ▴ 510

Suppose you want to fit a linear regression, y = x1 + ... + x6, and you think there are interactions among the predictors. You could test every possible interaction by exhaustive trial and error, but that would be multiple testing and thus not a good idea, right?

What would be the right procedure to find and use such interactions?

Thank you!

R • 949 views
7.6 years ago
Steven Lakin ★ 1.8k

You will have to examine how each interaction affects the predicted variable; as far as I know, there isn't another way to see whether interactions exist. This is part of the punishing aspect of classical statistics: the more thorough and curious you are about different aspects of your experiment, the more tests you run, and the more heavily your p-values have to be corrected for multiple testing.
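To make the multiple-testing cost concrete, here is a minimal sketch in plain Python. With six predictors there are 15 pairwise interaction terms, so 15 tests. The raw p-values below are made-up numbers for illustration only, and the Holm step-down correction is just one common choice that I am assuming here; it is not something prescribed in this thread.

```python
from itertools import combinations


def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


predictors = ["x1", "x2", "x3", "x4", "x5", "x6"]
pairs = list(combinations(predictors, 2))  # all pairwise interaction candidates

# Hypothetical raw p-values, one per interaction term (illustrative only).
raw_p = [0.001, 0.004, 0.03, 0.20, 0.45, 0.51, 0.08, 0.62,
         0.74, 0.33, 0.91, 0.12, 0.27, 0.58, 0.83]

adjusted = holm_adjust(raw_p)
significant = [pair for pair, p in zip(pairs, adjusted) if p < 0.05]

for pair, raw, adj in zip(pairs, raw_p, adjusted):
    print(f"{pair[0]}:{pair[1]}  raw={raw:.3f}  holm={adj:.3f}")
```

Three of the raw p-values are below 0.05 in this toy input, but fewer survive the correction, which is the "punishing" effect described above.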

If you're inclined, Bayesian statistical approaches to the generalized linear model are the way to go for investigating many combinations of variables, since your intentions as the experimenter have no effect on the posterior distribution, and there is only one posterior distribution, which you can examine in as many ways as you like. At the bottom of this page is a link to Kruschke's scripts for JAGS and Stan (Markov chain Monte Carlo samplers), which include procedures for Bayesian GLMs.
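As a toy illustration of "one posterior, examined in many ways", here is a sketch that deliberately swaps MCMC for a closed-form conjugate normal posterior. It assumes a single interaction coefficient, a known noise standard deviation, and a zero-mean normal prior; the data, `noise_sd`, and `prior_sd` are all hypothetical values chosen for illustration. A real analysis with JAGS or Stan would estimate every coefficient and the noise jointly.

```python
from statistics import NormalDist


def posterior_for_slope(z, y, noise_sd, prior_sd):
    """Conjugate normal posterior for beta in y = beta * z + noise.

    Assumes noise ~ N(0, noise_sd^2) with noise_sd known,
    and a prior beta ~ N(0, prior_sd^2).
    """
    precision = 1.0 / prior_sd ** 2 + sum(v * v for v in z) / noise_sd ** 2
    mean = (sum(v * w for v, w in zip(z, y)) / noise_sd ** 2) / precision
    return NormalDist(mu=mean, sigma=precision ** -0.5)


# Hypothetical toy data: two predictors coded as -1/+1 and a made-up response.
x1 = [-1.0, -1.0, 1.0, 1.0, -1.0, 1.0]
x2 = [-1.0, 1.0, -1.0, 1.0, 1.0, -1.0]
y = [0.9, -1.2, -0.8, 1.1, -1.0, -0.7]

z = [a * b for a, b in zip(x1, x2)]  # the interaction regressor x1 * x2
post = posterior_for_slope(z, y, noise_sd=0.5, prior_sd=10.0)

# The same posterior answers several different questions.
prob_positive = 1.0 - post.cdf(0.0)                            # P(beta > 0 | data)
prob_large = 1.0 - post.cdf(0.5)                               # P(beta > 0.5 | data)
interval_95 = (post.inv_cdf(0.025), post.inv_cdf(0.975))       # central 95% interval

print(post.mean, post.stdev, prob_positive, prob_large, interval_95)
```

Each of the three summaries is read off the same distribution, so asking an extra question does not require refitting or a new correction.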

If you do find an interaction, the effect of one variable on the predicted variable clearly depends on the value of the other, so their effects are not simply additive. Whether you think an interaction is important depends on how it affects the predicted variable and what you're trying to accomplish. If you find an interaction of interest, including it in your model is generally the correct way to proceed.
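Including an interaction usually means adding the product of the two predictors as an extra column in the design matrix. The sketch below uses plain Python, noise-free toy data generated from coefficients I made up, and a small hand-written least-squares solver (the helper names are hypothetical, not from any library) to compare an additive fit with one that includes the x1*x2 term.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_least_squares(rows, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def residual_sum_of_squares(rows, y, coef):
    """Sum of squared differences between y and the fitted values."""
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


# Toy data on a small grid; the true coefficients are assumptions for illustration.
data = [(a, b) for a in [0.0, 1.0, 2.0, 3.0] for b in [0.0, 1.0, 2.0]]
y = [1.0 + 2.0 * a - 1.0 * b + 0.5 * a * b for a, b in data]

additive = [[1.0, a, b] for a, b in data]             # y ~ x1 + x2
with_inter = [[1.0, a, b, a * b] for a, b in data]    # y ~ x1 + x2 + x1:x2

coef_additive = fit_least_squares(additive, y)
coef_inter = fit_least_squares(with_inter, y)

rss_additive = residual_sum_of_squares(additive, y, coef_additive)
rss_inter = residual_sum_of_squares(with_inter, y, coef_inter)

print("with interaction:", coef_inter)
print("RSS additive:", rss_additive, "RSS with interaction:", rss_inter)
```

The additive model leaves structured residuals that the product column absorbs. In practice the data are noisy, so whether the drop in residual error is meaningful still has to be judged with a proper test or model comparison.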

Biostars probably isn't the best place to get this information; check out the statistics Stack Exchange (Cross Validated) for access to people who are very knowledgeable in this area.


Thanks so much, Steven! Very helpful! I will study Bayesian GLMs.

7.6 years ago
LLTommy ★ 1.2k

I think you are simply looking for correlation...?

In statistics, dependence or association is any statistical relationship, whether causal or not, between two random variables or two sets of data. Correlation is any of a broad class of statistical relationships involving dependence, though in common usage it most often refers to the extent to which two variables have a linear relationship with each other.

Taken from Wikipedia, where you can also find some statistical tests you might want to use to check it.


Not really; I'm trying to add interactions among the xi's to the linear regression.
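To show why correlation between predictors is a different question from interaction, here is a small sketch with made-up data on a balanced two-level design. The `pearson` helper is written out by hand for the example.

```python
from statistics import mean


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


# Balanced design: every combination of the two levels appears once.
x1 = [-1.0, -1.0, 1.0, 1.0]
x2 = [-1.0, 1.0, -1.0, 1.0]

# Illustrative response that is driven purely by the interaction.
y = [a * b for a, b in zip(x1, x2)]
product = [a * b for a, b in zip(x1, x2)]

r_x1_x2 = pearson(x1, x2)       # correlation between the predictors
r_y_x1 = pearson(y, x1)         # correlation of y with x1 alone
r_y_x2 = pearson(y, x2)         # correlation of y with x2 alone
r_y_prod = pearson(y, product)  # correlation of y with the product term

print(r_x1_x2, r_y_x1, r_y_x2, r_y_prod)
```

Here the predictors are uncorrelated with each other and each is uncorrelated with y, yet y is fully determined by their product, so a correlation check between the xi's would not reveal the interaction.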
