I have 6 folders. Each one contains 7 RNA-Seq datasets from a specific type of cancer and 7 datasets from the matching normal tissue (healthy controls for that type of cancer), for a total of 84 datasets.
I want to investigate my gene of interest in each type of cancer.
I believe the best strategy is to merge the FPKM columns and run correlation analyses of my gene of interest (GOI) versus all other genes, separately in the cancer and the normal tissues. As a result, I will have 12 correlation tables: one for each of the 6 cancer types and one for each matching control tissue. Then I can investigate the pathways related to the genes most correlated with my GOI and write my biological interpretation.
Am I heading in the right direction? Is this a good strategy to answer my question? Will my results be strong enough for a publication?
I could compare the cancer and control datasets to identify the DEGs and the pathways they activate, but my gene of interest doesn't show a strong expression pattern, and I'm afraid it won't appear in the list of statistically significant DEGs. That's why I prefer the correlation analysis approach.