Hello everyone,
I have some questions about RNA-seq clustering and perturbation scores. My goal is to practice detecting disease mechanisms in cancer cell lines.
In my first method, I collected RNA-seq expression data (covering all genes of the cancer cell lines) from the CCLE database. I then grouped the genes into modules using WGCNA, which gave me gene sets (modules) with similar expression patterns. After that, I correlated each module eigengene with the expression data of the cancer cell lines; from this step I can infer which mechanism is representative of each cancer cell line.
The second method (is it actually different from the first?) is to cluster the cancer cell lines by their mutated genes (individual genes, without grouping them into gene sets) using the expression data, with cancer cell lines on the y-axis and mutated genes on the x-axis. This tells me which mutated genes have high or low expression in each cell line, and I then calculate a perturbation score for each cancer cell line to identify its representative mechanism.
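A minimal Python sketch of the second method, on made-up toy data, might look like the following. The gene names, expression values, the z-score cutoff of 0 for high/low calls, and the choice of average-linkage clustering are all assumptions for illustration only. The perturbation score is left out because its formula is not given here.

```python
from itertools import combinations
from math import dist
from statistics import mean, pstdev

# Hypothetical toy data: rows are cell lines, columns are mutated genes.
genes = ["TP53", "KRAS", "PTEN"]
expr = {
    "line_A": [8.0, 2.0, 5.0],
    "line_B": [7.5, 2.5, 5.5],
    "line_C": [1.0, 9.0, 4.0],
    "line_D": [1.5, 8.0, 4.5],
}

def zscore_columns(matrix):
    """Standardize each gene (column) across cell lines (rows)."""
    cols = list(zip(*matrix.values()))
    # Fall back to a standard deviation of 1.0 for constant genes.
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return {name: [(v - m) / s for v, (m, s) in zip(row, stats)]
            for name, row in matrix.items()}

def cluster(profiles, k):
    """Average-linkage agglomerative clustering into k groups."""
    clusters = [[name] for name in profiles]

    def linkage(a, b):
        # Mean Euclidean distance between all members of two clusters.
        return mean(dist(profiles[x], profiles[y]) for x in a for y in b)

    while len(clusters) > k:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
    return [sorted(c) for c in clusters]

z = zscore_columns(expr)
# High/low call per mutated gene in each cell line (assumed cutoff: z > 0).
calls = {name: ["high" if v > 0 else "low" for v in row]
         for name, row in z.items()}
groups = sorted(cluster(z, k=2))
print(calls)
print(groups)
```

In this sketch each gene is treated individually, so the clusters depend only on the mutated genes that were selected, not on any co-expression modules.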
For the first method, my reasoning is as follows. The expression data of the cancer cell lines contain a very large number of genes. If I group the genes whose RNA-seq expression patterns are highly correlated or connected with each other, I expect them to have similar mutations as well (this is why I chose the first method), and I will then detect the mutated genes within each gene set.
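The idea behind the first method can be sketched in Python as follows. This is not WGCNA itself: grouping genes by a hard Pearson correlation cutoff (0.8 here) is a simplified stand-in for its network construction, and the gene names, values, and cutoff are hypothetical. The module eigengene is computed as the first principal component of the standardized module expression, which is how WGCNA defines it, here obtained by power iteration. The sketch assumes no gene has constant expression.

```python
from math import sqrt
from statistics import mean, pstdev

# Hypothetical toy data: one list of expression values per gene,
# ordered by cell line (line_A, line_B, line_C, line_D).
cell_lines = ["line_A", "line_B", "line_C", "line_D"]
expr = {
    "gene1": [1.0, 2.0, 8.0, 9.0],
    "gene2": [1.5, 2.5, 7.0, 9.5],
    "gene3": [9.0, 8.0, 2.0, 1.0],
    "gene4": [5.0, 1.0, 6.0, 2.0],
}

def standardize(values):
    """Center and scale one gene's expression across cell lines."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]

def pearson(x, y):
    """Pearson correlation as the mean product of z-scores."""
    return mean(a * b for a, b in zip(standardize(x), standardize(y)))

def modules_by_threshold(expr, cutoff=0.8):
    """Group genes linked by correlation >= cutoff (connected components)."""
    modules = []
    for gene in expr:
        linked = [m for m in modules
                  if any(pearson(expr[gene], expr[g]) >= cutoff for g in m)]
        merged = [gene] + [g for m in linked for g in m]
        modules = [m for m in modules if m not in linked] + [merged]
    return [sorted(m) for m in modules]

def eigengene(module, expr, iterations=100):
    """First principal component across cell lines, by power iteration."""
    z = [standardize(expr[g]) for g in module]  # genes x cell lines
    # Start from the mean profile so the sign follows average expression.
    u = [mean(col) for col in zip(*z)]
    for _ in range(iterations):
        loadings = [sum(a * b for a, b in zip(row, u)) for row in z]
        u = [sum(w * row[j] for w, row in zip(loadings, z))
             for j in range(len(u))]
        norm = sqrt(sum(v * v for v in u))
        u = [v / norm for v in u]
    return u

modules = modules_by_threshold(expr)
me = eigengene(modules[0], expr)
for line, value in zip(cell_lines, me):
    print(line, round(value, 3))
```

The eigengene has one value per cell line, so a strongly positive or negative value indicates that the whole module is high or low in that cell line. Mutation data do not enter this calculation at all, so whether co-expressed genes also share mutations would have to be checked separately.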
I would like to know whether these two methods are actually different. Thank you very much in advance for your suggestions.