I have a large, heterogeneous dataset (n ≈ 1000) with gene expression and body weight for each sample. Ages range from 18 to 80 years, and the samples include both men and women. I would like to find genes whose expression correlates with weight regardless of sex and age. Because weight differs between men and women and across some age ranges, I decided to stratify the samples by sex and age group, run Spearman's correlation (using sapply) within each group, and look for genes that are significant in all groups. However, stratifying also reduces the sample size, and therefore the power, of each test. Is there any other way to deal with these confounding factors besides stratification?
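
For reference, the stratified analysis described above can be sketched in R roughly as follows. This is only an illustration under assumptions: the objects `expr` (a genes-by-samples matrix) and `meta` (one row per sample with `sex`, `age`, `weight`) are hypothetical names filled with simulated data, and the age-band cut-offs and the 0.05 threshold are arbitrary choices, not taken from the real analysis.

```r
# Simulated stand-in data (hypothetical): 200 samples, 5 genes.
set.seed(1)
n <- 200
meta <- data.frame(
  sex    = sample(c("F", "M"), n, replace = TRUE),
  age    = sample(18:80, n, replace = TRUE),
  weight = rnorm(n, mean = 75, sd = 12)
)
expr <- matrix(rnorm(5 * n), nrow = 5,
               dimnames = list(paste0("gene", 1:5), NULL))
# Make gene1 track weight so there is something to find.
expr["gene1", ] <- expr["gene1", ] + 0.1 * (meta$weight - 75)

# Strata: sex crossed with coarse age bands (cut-offs are arbitrary).
meta$age_band <- cut(meta$age, breaks = c(17, 40, 60, 80))
strata <- split(seq_len(n),
                interaction(meta$sex, meta$age_band, drop = TRUE))

# Spearman p-value for every gene within one stratum (idx = sample indices).
spearman_p <- function(idx) {
  sapply(rownames(expr), function(g) {
    cor.test(expr[g, idx], meta$weight[idx],
             method = "spearman", exact = FALSE)$p.value
  })
}

# Genes-by-strata matrix of p-values.
pvals <- sapply(strata, spearman_p)

# Genes with an uncorrected p < 0.05 in every stratum.
common <- rownames(pvals)[apply(pvals < 0.05, 1, all)]
```

Each stratum here contains only a fraction of the samples, which is the loss of sample size per test that the question is about.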