I have RNA-seq samples from two groups (responders / non-responders). I am interested in generating a predictive gene signature which can separate the two groups.
So now I am looking for R packages that could help me with this task. Could you recommend any?
I've used stepwise regression before, but it is not feasible here because there are so many variables.
I found an answer to a similar question, "Resources for gene signature creation", which suggested either using the differentially expressed genes (DEGs) in lasso-penalized regression, or testing them independently with Cox proportional hazards regression and then picking the top X genes.
- Could someone point me to a paper / R package / workflow where lasso-penalized regression for such a scenario is described?
- I like the idea of testing the DEGs independently with Cox proportional hazards regression and then picking the top X genes. I would then feed those genes into stepwise regression. Does this make sense?
- Do you have an alternative suggestion? Classifiers such as SVM are an option, but this is not my area of expertise...
- I am also wondering what sample size the different approaches require. I'd appreciate input here.
Thank you so much!