I am a PhD student in biochemistry, and I am learning about gene expression signatures. My lab generated a 36-gene mouse signature; all of these genes are highly expressed. I am interested in identifying "mouse-like" human samples in a large set of primary breast tumors.
Could someone please give me general guidelines on how to apply a gene signature? I can write code, but I don't understand the principles (I am reading up on them, though). Is the signature applied based on the gene names together with their fold changes, or on the names alone? I am sorry for asking such a basic question, but I am still learning this aspect of bioinformatics. I read that a naive Bayes classifier might be a good idea. Alternatively, would it make sense to rank the samples by how strongly they express the signature and use bootstrap resampling to assess that ranking?
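To make the ranking idea concrete, here is a minimal sketch of what I have in mind. It is only an illustration under several assumptions, not an established method: the mouse genes have already been mapped to human orthologs, the expression matrix is on a log scale, and the per-sample score is simply the mean z-score of the signature genes (reasonable only because all 36 genes go in the same direction). The function names, gene labels, and toy numbers are all made up for the example; the bootstrap resamples the signature genes with replacement to see how stable each sample's score is.

```python
import numpy as np


def signature_scores(expr, gene_names, signature):
    """Mean per-gene z-score over the signature genes, one score per sample.

    expr is a (n_genes, n_samples) array of log-scale expression values.
    """
    index = {g: i for i, g in enumerate(gene_names)}
    rows = [index[g] for g in signature if g in index]
    if not rows:
        raise ValueError("no signature genes found in the expression matrix")
    sub = expr[rows, :]
    mean = sub.mean(axis=1, keepdims=True)
    std = sub.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid dividing by zero for constant genes
    z = (sub - mean) / std  # standardise each gene across samples
    return z.mean(axis=0)


def bootstrap_scores(expr, gene_names, signature, n_boot=200, seed=0):
    """Recompute the scores on signature genes resampled with replacement."""
    rng = np.random.default_rng(seed)
    known = set(gene_names)
    present = [g for g in signature if g in known]
    boots = np.empty((n_boot, expr.shape[1]))
    for b in range(n_boot):
        resampled = rng.choice(present, size=len(present), replace=True)
        boots[b] = signature_scores(expr, gene_names, list(resampled))
    return boots


# Toy data (hypothetical): 6 genes x 4 samples, 3-gene signature.
gene_names = ["G1", "G2", "G3", "G4", "G5", "G6"]
signature = ["G1", "G2", "G3"]
expr = np.array([
    [9.0, 5.0, 5.0, 5.0],
    [8.0, 4.0, 4.0, 4.0],
    [9.0, 6.0, 6.0, 6.0],
    [5.0, 5.0, 5.0, 5.0],
    [3.0, 7.0, 2.0, 6.0],
    [4.0, 4.0, 8.0, 1.0],
])

scores = signature_scores(expr, gene_names, signature)
ranking = np.argsort(-scores)  # sample indices, most "signature-like" first
boots = bootstrap_scores(expr, gene_names, signature)
# 95% percentile interval of each sample's score across bootstrap replicates
lower, upper = np.percentile(boots, [2.5, 97.5], axis=0)
```

Is something along these lines a sensible way to rank the tumors, or does applying a signature normally also require the fold changes from the original mouse experiment?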
I would also greatly appreciate being pointed to an earlier post or tutorial.