I have TCGA breast cancer RNAseq V2 data and I would like to find the subtypes using the PAM50 gene set. After reading multiple posts, I'm still confused about the process.
It seems that the following call is recommended for my problem:
PAM50Preds<-intrinsic.cluster.predict(sbt.model=pam50, data=dataset, annot=dannot, do.mapping=TRUE, verbose=TRUE)
However, I have the following questions.
- Is the pam50 model trained on microarray data, and does it therefore need to be refitted for RNA-seq data?
- Because I have new data, do I need to fit the model with intrinsic.cluster before running the prediction?
Basically, I want to know whether I can simply plug in the pam50 model from genefu and predict on my data, or whether some step needs to come first.