Hi all, I'm a wet-lab biologist who's a bit of a novice when it comes to computational algorithms. I'd liken my level to that of a code monkey: able to run tools made by other people, but not competent at building new things or combining them.
I am wondering if someone can point me in the right direction for this problem I'm having: I have RNA-seq data from multiple cell types, and have generated a list of all the transcription factors (TFs) expressed in these cell types. I'd like to make a sort of classifier/decision tree of transcription factors that in combination specify each cell type. I've tried standard decision trees (using WEKA), which work quite well at giving me a decision tree. However, I'd like an exhaustive list. A standard decision tree gives only one possible classifier, whereas I'd like a list of, say, 30 genes that in combination form a unique fingerprint for each cell type. I accept that the list will be very large, depending on the threshold that I set for a TF to be "expressed".
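To make the goal concrete, the brute-force version of this search can be sketched in Python. This is only an illustration under assumptions, not a recommended tool: the cell type names, TF names, expression values, and the threshold of 1.0 are all made up, and the helper functions `binarize` and `fingerprint_sets` are hypothetical. It calls each TF "expressed" or not at a threshold, then lists every TF combination of a given size whose on/off pattern is different in every cell type.

```python
from itertools import combinations

def binarize(expr, threshold):
    """Turn {cell_type: {tf: value}} into {cell_type: {tf: True/False}}."""
    return {
        cell_type: {tf: value >= threshold for tf, value in tfs.items()}
        for cell_type, tfs in expr.items()
    }

def fingerprint_sets(expr, threshold, size):
    """Return every TF combination of the given size whose on/off
    pattern is distinct for every cell type (brute-force enumeration)."""
    calls = binarize(expr, threshold)
    cell_types = sorted(calls)
    tfs = sorted(calls[cell_types[0]])
    hits = []
    for combo in combinations(tfs, size):
        # One on/off tuple per cell type; a set drops duplicate patterns.
        patterns = {tuple(calls[ct][tf] for tf in combo) for ct in cell_types}
        if len(patterns) == len(cell_types):
            hits.append(combo)
    return hits

# Made-up toy expression values (hypothetical, for illustration only).
toy = {
    "cell_type_1": {"TF_A": 12.0, "TF_B": 0.2, "TF_C": 8.0},
    "cell_type_2": {"TF_A": 0.1,  "TF_B": 9.5, "TF_C": 7.0},
    "cell_type_3": {"TF_A": 0.3,  "TF_B": 0.4, "TF_C": 5.0},
}

pairs = fingerprint_sets(toy, threshold=1.0, size=2)
print(pairs)  # [('TF_A', 'TF_B')]
```

In the toy data, TF_C is above threshold everywhere, so only the pair (TF_A, TF_B) separates all three cell types. Exhaustive enumeration like this is only feasible for small combination sizes: the number of combinations grows combinatorially with the number of TFs, so sets of around 30 genes would need a smarter search than this loop.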
Can anyone suggest what kind of algorithm (or, even better, tool!) I could use for this?
Much appreciated!
Brian