I have RNA-seq samples from two groups (responders / non-responders). I am interested in generating a predictive gene signature that can separate the two groups. Based on a previous post, I have decided to use lasso-penalized or elastic net regression.
So, now I am looking to evaluate this signature.
- First, I can do this with a training and test set.
- Second, I would like to test it in independently generated datasets: other RNA-seq datasets, but also qPCR data.
My question now is: how do I do this? The first one is straightforward: just split the data (80% for building the predictive model, 20% for evaluating it) and then make predictions on the test data. But how can I do this for an independently generated dataset? I assume I cannot apply the final model directly to the independent datasets.
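For concreteness, here is a minimal sketch of the 80/20 split I have in mind. It is plain Python; the function name `stratified_split`, the labels, and the sample counts are all made up for illustration. I am also assuming the split should be stratified, so that both sets keep roughly the same responder / non-responder ratio.

```python
import random
from collections import defaultdict


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Return (train_idx, test_idx) with a similar class ratio in both sets."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        # Hold out about test_fraction of each class, at least one sample.
        n_test = max(1, round(len(indices) * test_fraction))
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])
    return sorted(train_idx), sorted(test_idx)


# Hypothetical cohort: 30 responders and 20 non-responders.
labels = ["responder"] * 30 + ["non_responder"] * 20
train_idx, test_idx = stratified_split(labels)

# The lasso / elastic net model would be built on train_idx only,
# and the samples in test_idx are used only for evaluation.
print(len(train_idx), len(test_idx))
```

The model itself is left out here, since my question is about the evaluation design rather than the fitting step.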
Thank you for your help/input!