I have 60 individuals randomized to treatment with either a statin or diet. Statins are drugs that inhibit the liver enzyme HMG-CoA reductase, which reduces the synthesis of cholesterol (a major driver of atherosclerosis) and thereby lowers the risk of stroke and myocardial infarction.
Ten genes were quantified at baseline and after 1 year of treatment. For each gene we have a baseline value, a 1 year value, and the fold change (calculated as the 1 year value divided by the baseline value).
The data set looks like this (showing two genes, each with baseline, 1 year and fold):
genes <- read.table(header=TRUE, sep=";", text = "
treatment;IL10_BL;IL10_1Y;IL10_fold;IL6_BL;IL6_1Y;IL6_fold
diet;1.1;1.5;1.4;1.4;1.4;1.1
statin;2.5;3.3;1.3;2.7;3.1;1.1
statin;3.2;4.0;1.3;1.5;1.6;1.1
diet;3.8;4.4;1.2;3.0;2.9;0.9
statin;1.1;3.1;2.8;1.0;1.0;1.0
diet;3.0;6.0;2.0;2.0;1.0;0.5
")
Nothing has been done to the data yet (e.g. no log-transformation).
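To make the fold definition concrete, here is a minimal sketch of how the fold is derived from the raw values, together with the log2 transformation I am considering. It re-enters the six example rows so that it runs on its own. The column name `IL10_log2fold` is only something I made up for illustration, and the log2 step has not been applied to the real data.

```r
# The six example rows from above, raw baseline and 1 year values only.
genes <- data.frame(
  treatment = c("diet", "statin", "statin", "diet", "statin", "diet"),
  IL10_BL   = c(1.1, 2.5, 3.2, 3.8, 1.1, 3.0),
  IL10_1Y   = c(1.5, 3.3, 4.0, 4.4, 3.1, 6.0),
  IL6_BL    = c(1.4, 2.7, 1.5, 3.0, 1.0, 2.0),
  IL6_1Y    = c(1.4, 3.1, 1.6, 2.9, 1.0, 1.0)
)

# Fold as defined in the question: 1 year value divided by baseline value.
genes$IL10_fold <- genes$IL10_1Y / genes$IL10_BL

# Candidate transformation (an assumption, not yet applied to my data):
# the log2 fold is the difference of the log2 values, so it is symmetric
# around 0 (a doubling gives +1, a halving gives -1).
genes$IL10_log2fold <- log2(genes$IL10_1Y) - log2(genes$IL10_BL)

print(genes[, c("treatment", "IL10_fold", "IL10_log2fold")])
```

The fold values in the table above are rounded to one decimal, so the unrounded values computed here differ slightly.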
I have the following questions:
• How would you test whether gene expression after 1 year differs between the diet and statin groups?
• Does the change in gene expression predict other variables (e.g. blood cholesterol)? This would need multiple regression analyses, and I wonder which gene value to use (the baseline, the 1 year, or the fold). Should I in that case log-transform the fold/baseline/1 year value before using it as a predictor?
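For the first question, this is roughly the kind of per-gene analysis I have in mind: a sketch only, assuming that a linear model of the log2 1 year value on treatment, adjusted for the log2 baseline value, is a reasonable approach (I am not sure that it is). The helper names `gene_names` and `p_values` are mine, the Benjamini-Hochberg adjustment is just one possible choice for handling 10 genes, and the six example rows are far too few for the p-values to mean anything.

```r
# The six example rows from above, raw baseline and 1 year values only.
genes <- data.frame(
  treatment = c("diet", "statin", "statin", "diet", "statin", "diet"),
  IL10_BL   = c(1.1, 2.5, 3.2, 3.8, 1.1, 3.0),
  IL10_1Y   = c(1.5, 3.3, 4.0, 4.4, 3.1, 6.0),
  IL6_BL    = c(1.4, 2.7, 1.5, 3.0, 1.0, 2.0),
  IL6_1Y    = c(1.4, 3.1, 1.6, 2.9, 1.0, 1.0)
)

# Gene prefixes; the real data would list all 10 genes here.
gene_names <- c("IL10", "IL6")

# For each gene, fit log2(1 year) ~ treatment + log2(baseline) and
# extract the p-value of the statin vs diet coefficient.
p_values <- sapply(gene_names, function(g) {
  bl  <- log2(genes[[paste0(g, "_BL")]])
  y1  <- log2(genes[[paste0(g, "_1Y")]])
  fit <- lm(y1 ~ treatment + bl, data = genes)
  coef(summary(fit))["treatmentstatin", "Pr(>|t|)"]
})

# Adjust for testing several genes (Benjamini-Hochberg, an assumption).
p_adjusted <- p.adjust(p_values, method = "BH")
print(data.frame(gene = gene_names, p = p_values, p_adj = p_adjusted))
```

Is this a sensible way to set it up, or is there a way to get my wide-format data into limma instead?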
Any help is much appreciated.
I have read the limma user guide, but my data is not set up in the way the limma package expects in the vignette.
Thanks in advance.