I have gene expression data (obtained with microarrays) from 79 individuals. First, I split the data into four groups according to whether a treatment was applied (Treatment or Control) and a baseline condition (C1 or C2), which gave the following groups: C1_Treatment, C2_Treatment, C1_Control, C2_Control. I performed a differential gene expression analysis looking for differences among these groups, and I did not find any effect of other covariates (sex, age, etc.). The model used for this analysis was:
design <- model.matrix(~ 0 + Condition + Treatment + Condition:Treatment, data = data)
To determine whether the covariates had an effect, I first included all potential covariates in the linear model. I then removed the variables one at a time, dropping a variable if the number of genes it influenced was lower than one (that is, no gene) at an FDR-corrected p-value < 0.001.
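As a rough illustration of this screening step, here is a minimal sketch in base R on simulated data. It is not my actual pipeline: it uses a plain per-gene lm() instead of the moderated statistics of a microarray package, the data are simulated so that Age affects the first 20 genes, and the names expr, covars and count_sig_genes are hypothetical.

```r
set.seed(1)
n_samples <- 79
n_genes <- 200

# Simulated covariates (assumption: only Sex and Age, for brevity)
covars <- data.frame(
  Sex = factor(sample(c("Female", "Male"), n_samples, replace = TRUE)),
  Age = runif(n_samples, 20, 70)
)

# Simulated expression matrix: genes in rows, samples in columns
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes)

# Make the first 20 genes depend on Age so there is something to detect
age_effect <- matrix(0.1 * covars$Age, nrow = 20, ncol = n_samples, byrow = TRUE)
expr[1:20, ] <- expr[1:20, ] + age_effect

# Count the genes whose coefficient `coef_name` is significant after
# Benjamini-Hochberg (FDR) correction at threshold `alpha`
count_sig_genes <- function(expr, covars, coef_name, alpha) {
  p <- apply(expr, 1, function(y) {
    fit <- lm(y ~ Sex + Age, data = covars)
    coef(summary(fit))[coef_name, "Pr(>|t|)"]
  })
  sum(p.adjust(p, method = "BH") < alpha)
}

# Same data, same model, two thresholds
n_age_strict <- count_sig_genes(expr, covars, "Age", alpha = 0.001)
n_age_loose  <- count_sig_genes(expr, covars, "Age", alpha = 0.01)
n_sex_strict <- count_sig_genes(expr, covars, "SexMale", alpha = 0.001)

# Rule from the text: drop a covariate if fewer than one gene passes
keep_age <- n_age_strict >= 1
keep_sex <- n_sex_strict >= 1
```

Since the adjusted p-values are the same in both calls, the count at 0.01 can never be smaller than the count at 0.001, so a covariate dropped at the stricter threshold may be kept at the looser one.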
Then, I performed a second analysis, splitting the same data into different groups, because this time I wanted to know whether there were differences between the sexes. The comparison groups were: C1_Female, C1_Male, C2_Female, C2_Male. I selected the model in the same way as in the first analysis (adding the potential covariates to the model), and I found an influence of sex and age on gene expression. Of note, the significance level in this second analysis was an FDR-corrected p-value < 0.01. Therefore, the model used for the second analysis was:
design <- model.matrix(~ 0 + Condition + Sex + Age + Condition:Sex, data = data)
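To make the difference between the two models concrete, the sketch below builds both design matrices on a small made-up data frame and lists their columns. The data frame toy and its Age values are invented for illustration only. The factor levels are assumed to match my group labels.

```r
# Hypothetical toy data: one row per combination of the three factors
toy <- expand.grid(
  Condition = factor(c("C1", "C2")),
  Treatment = factor(c("Control", "Treatment")),
  Sex       = factor(c("Female", "Male"))
)
toy$Age <- c(25, 31, 47, 52, 38, 44, 29, 60)  # made-up ages

# First analysis: Condition, Treatment and their interaction
design1 <- model.matrix(~ 0 + Condition + Treatment + Condition:Treatment,
                        data = toy)

# Second analysis: Condition, Sex, Age and the Condition-by-Sex interaction
design2 <- model.matrix(~ 0 + Condition + Sex + Age + Condition:Sex,
                        data = toy)

# Each column is one coefficient estimated per gene
print(colnames(design1))
print(colnames(design2))
```

The two designs share only the Condition columns. The first has no Sex or Age term and the second has no Treatment term, so each covariate is tested against a different set of adjusting variables in each analysis.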
My question is: are splitting the data into different groups and using different significance levels enough to explain why I found different effects of the covariates in the two analyses?