I have a feature that was measured in patients from the TCGA BRCA dataset (one value per patient). It is a continuous variable. I want to find a set of genes whose RNA-seq expression is highly correlated with this feature (a so-called "gene signature"). My plan is to use stepwise linear regression to select this set of genes.
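To make the plan concrete, here is a minimal sketch of what I mean by stepwise regression, using base R's `step()` on simulated data rather than the real TCGA matrix. All object names (`expr`, `feature`, `dat`, `gene1` ... `gene20`) are made up for illustration, and the simulated signal is an assumption, not a result. The real expression matrix has far more genes than patients, so I assume the candidate list would have to be pre-filtered before anything like this could run.

```r
# Simulated stand-in data (hypothetical): 100 patients, 20 candidate genes.
set.seed(1)
n <- 100
p <- 20
expr <- matrix(rnorm(n * p), nrow = n, ncol = p,
               dimnames = list(NULL, paste0("gene", seq_len(p))))

# Assumed toy signal: the feature depends on gene3 and gene7 plus noise.
feature <- 2 * expr[, "gene3"] - 1.5 * expr[, "gene7"] + rnorm(n)
dat <- data.frame(feature = feature, expr)

# Start from the intercept-only model and let step() add or drop genes by AIC.
null_model <- lm(feature ~ 1, data = dat)
upper_formula <- reformulate(colnames(expr), response = "feature")
fit <- step(null_model,
            scope = list(lower = ~ 1, upper = upper_formula),
            direction = "both",
            trace = 0)

# Genes retained in the final model (the candidate "signature").
selected <- setdiff(names(coef(fit)), "(Intercept)")
print(selected)
```

In this toy setup the two simulated driver genes should be picked up, possibly along with a few noise genes, since AIC is a fairly permissive criterion.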
My questions are:
Is it suitable to use stepwise linear regression here?
Could you recommend an R package for this?
Are there better approaches for finding such a gene signature?