I have an RNA-seq cancer dataset to which I applied regularized Cox regression for survival analysis ('Coxnet' from the 'glmnet' R package), obtaining 20 genes associated with survival (genes with coefficients > 0). I would now like to validate these genes in an independent dataset from TCGA, but there is a strong batch effect between the two datasets due to different library protocols (total RNA with ribo-depletion vs. poly-A selection). My idea is to apply surrogate variable analysis to correct for the unwanted variation, but I need some help on how to proceed. How can I combine a survival analysis with 'sva'? And for the validation, should I use 'Coxnet' or 'coxph' from the 'survival' R package?
Thanks for your advice.