6.3 years ago
zizigolu
Hi,
Suppose our phenotypes are in a table named clinical data that looks like this:
## FamID CAD sex age tg hdl ldl
## 10002 10002 1 1 60 NA <NA> <NA>
## 10004 10004 1 2 50 55 23 75
## 10005 10005 1 1 55 105 37 69
## 10007 10007 1 1 52 314 54 108
## 10008 10008 1 1 58 161 40 94
## 10009 10009 1 1 59 171 46 92
Instead of transforming a single variable from the clinical table, for instance hdl as below,
phenoSub$phenotype <- rntransform(as.numeric(phenoSub$hdl), family="gaussian")
how can I use several variables, such as tg, hdl and ldl, at the same time?
I tried the following, but it went wrong:
phenoSub$phenotype <- rntransform(as.numeric(phenoSub$c(colnames(clinical)), family="gaussian")
Thank you
Sorry, what should I use when I have more than 4000 variables, including hdl, ldl and tg, and there is no Y like CAD?
I had assumed that you were aiming to predict coronary artery disease (CAD). The rntransform function is just attempting to normalise your variables so that they can be more reliably used in modelling. Thus, when you run rntransform on hdl as in your question, all that is returned is a normalised version of your hdl variable, which can then be more reliably used in a binomial logistic regression model.
When I run it on my model, it is normalising the residuals from my model.
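To make the distinction between normalising a variable and normalising model residuals concrete, here is a minimal base R sketch. It assumes a rank-based inverse normal transform; `rank_normalise` is a hypothetical stand-in written for illustration, not GenABEL's own code, and may differ from rntransform in detail. The data are simulated.

```r
# Hypothetical stand-in for a rank-based inverse normal transform
# (an assumption for illustration; not taken from the GenABEL source)
rank_normalise <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)
  # Map ranks into (0, 1), then onto standard normal quantiles
  out[ok] <- qnorm((rank(x[ok]) - 0.5) / sum(ok))
  out
}

set.seed(1)
# Simulated data, for illustration only
toy <- data.frame(age = round(rnorm(50, 55, 5)), hdl = rexp(50, 1 / 45))

# (a) Normalise the variable itself
hdl_norm <- rank_normalise(toy$hdl)

# (b) Normalise the residuals of a model of hdl adjusted for age
fit <- lm(hdl ~ age, data = toy)
resid_norm <- rank_normalise(residuals(fit))
```

In (a) the ordering of the original hdl values is preserved; in (b) the ordering is that of hdl after the age effect has been removed.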
Here's what the package says:
Thus, if you have no endpoint, my recommendation is to run rntransform independently on all of your variables of interest (preferably in a loop), and then run whatever regression models you were originally aiming to do in order to test your hypotheses, presumably relating to SNP genotypes.
Further reading: rntransform
Good luck
Sorry, if I have 4914 variables, what would the loop be in this part of the code to rntransform them all one by one?
Whatever I tried, I could not figure it out.
Then newDataFrame will contain your transformed variables. You can then merge this with your genotype data and then test your hypotheses through modelling.
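A loop of the kind described could look roughly like the sketch below. It is not the code originally posted. The names phenoSub and newDataFrame come from this thread, and the toy values are the rows shown in the question. `rank_normalise` is a hypothetical base R stand-in, so the example runs without GenABEL installed.

```r
# Hypothetical stand-in for a rank-based inverse normal transform
# (an assumption for illustration; not GenABEL's own code)
rank_normalise <- function(x) {
  out <- rep(NA_real_, length(x))
  ok <- !is.na(x)
  out[ok] <- qnorm((rank(x[ok]) - 0.5) / sum(ok))
  out
}

# Toy phenotype table using the rows shown in the question
phenoSub <- data.frame(
  FamID = c(10002, 10004, 10005, 10007, 10008, 10009),
  tg    = c(NA, 55, 105, 314, 161, 171),
  hdl   = c(NA, 23, 37, 54, 40, 46),
  ldl   = c(NA, 75, 69, 108, 94, 92)
)

# Every column except the ID is treated as a variable to transform
vars <- setdiff(colnames(phenoSub), "FamID")

newDataFrame <- data.frame(FamID = phenoSub$FamID)
for (v in vars) {
  # as.character() first, in case a column was read in as a factor
  newDataFrame[[v]] <- rank_normalise(as.numeric(as.character(phenoSub[[v]])))
}
```

With GenABEL loaded, the call inside the loop could presumably be swapped for the rntransform call from the question, applied to `phenoSub[[v]]` instead of `phenoSub$hdl`. Note that `phenoSub[[v]]` is used rather than `phenoSub$v`, because `$` does not accept a column name stored in a variable.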
Thank you, it is so kind of you to share your precious time solving people's problems. I will try the code you kindly provided. Let me thank you once again.
No problem, you may also want to take a look at this study, in which rntransform was also used. However, they appeared to have used the normalisation of regression model residuals: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3400731/
Sorry,
Could one use a Gaussian model, or another model, to predict a phenotype, for example cholesterol level, based on microarray data?
Yes, that is possible. In that case, you would use a linear model with cholesterol as the y variable:
Then take the statistically significant genes and put them together into a preliminary 'final' model that would require yet further testing:
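The two steps just described might be sketched in base R as follows. This is an illustration on simulated data, not the code originally posted: the gene names, effect sizes, the Benjamini-Hochberg adjustment and the 0.05 cut-off are all assumptions made for the example.

```r
set.seed(42)
n <- 60

# Simulated expression matrix: samples in rows, genes in columns (hypothetical names)
expr <- matrix(rnorm(n * 20), nrow = n,
               dimnames = list(NULL, paste0("gene", 1:20)))

# Simulated cholesterol, driven by gene1 and gene2 plus noise
cholesterol <- 180 + 15 * expr[, "gene1"] - 10 * expr[, "gene2"] + rnorm(n, sd = 5)
dat <- data.frame(cholesterol = cholesterol, expr)

# Step 1: one linear model per gene, with cholesterol as the y variable
pvals <- sapply(colnames(expr), function(g) {
  fit <- lm(reformulate(g, response = "cholesterol"), data = dat)
  summary(fit)$coefficients[g, "Pr(>|t|)"]
})

# Step 2: adjust for multiple testing (assumed choice: BH) and keep significant genes
padj <- p.adjust(pvals, method = "BH")
sig <- names(padj)[padj < 0.05]
stopifnot(length(sig) > 0)  # reformulate() below needs at least one gene

# Step 3: preliminary 'final' model combining the significant genes
final <- lm(reformulate(sig, response = "cholesterol"), data = dat)
summary(final)
```

Because the genes are selected and the combined model is fitted on the same samples, the fit will look optimistic, which is one reason the model needs further testing, ideally on independent data.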
Hello friend. I also just replied to your new question: A: Predicting phenotype based on the expression of genes