3.4 years ago
jansha.1997
Hey everyone. I have a dataset of patient data and biochemical test results. It contains numerical variables and categorical variables such as Yes/No. I want to run statistical tests such as ANOVA on this dataset. I have coded the Yes/No values as booleans, but I also have categorical variables with more than two values (mother, father, both, none), and I don't know how to handle those. How do I run numerical tests on them? Please help me out.
What question are you trying to answer? Running some sort of regression on the data using all the predictor variables might be a better approach.
The tests are basically to find the differences between 2 cohorts. I have made 2 subsets of each cohort from the whole data, but I don't know how to recode categorical variables with 3 or more types of values as numbers. For example, normal and abnormal I could recode as 0 and 1, but for values like mother, father, both, I don't know how to recode them numerically to run the tests.
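If you only have two cohorts, you could try a logistic regression with cohort as the outcome and all of your variables as predictors. In R you can leave categorical values as characters or convert them to factors, and the model function will dummy-code them for you. An unordered factor fits values like mother/father/both/none, since they have no natural order.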
Having a unified regression model is generally better anyway, because it tests each variable while holding the other variables constant.
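If you ever need to do the recoding by hand, the usual approach is dummy (indicator) coding rather than numbering the levels 0, 1, 2, 3, which would impose an order that isn't there. Here is a minimal Python sketch of the idea. The variable name `family_history`, the helper `dummy_code`, and the choice of "none" as the reference level are assumptions for illustration, not something taken from your data.

```python
# Minimal sketch of dummy (indicator) coding, standard library only.
# The example values and the reference level "none" are hypothetical.

def dummy_code(values, reference):
    """Return one 0/1 indicator column per level, skipping the reference level."""
    levels = sorted(set(values) - {reference})
    return {level: [1 if v == level else 0 for v in values] for level in levels}

family_history = ["mother", "father", "both", "none", "mother"]
indicators = dummy_code(family_history, reference="none")

for level, column in indicators.items():
    print(level, column)
# both [0, 0, 1, 0, 0]
# father [0, 1, 0, 0, 0]
# mother [1, 0, 0, 0, 1]
```

A variable with four levels becomes three 0/1 columns. The reference level gets no column of its own: a patient with "none" is simply 0 in all three, and each coefficient in the regression is then read relative to that reference.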