Hi,
I am new to bioinformatics and have a background in neuroimaging, where we often use a baseline to build our models. That is, all the relationships between variables (correlation or other measures) are inferred from the difference in activity between our condition of interest and the baseline. Although these types of differential models are the gold standard in my field, I've heard that studies of differential co-expression in RNA-seq are controversial. Can anyone explain why (difficulty of choosing a baseline, etc.) and/or point me to publications that discuss the topic?
Thank you very much!
Sandrine
Hi! Today I was reading about co-expression networks, but it really depends on how your experiment is designed and what tools you have at your disposal. For example, WGCNA seems to work really well; however, it seems you need quite a large number of samples to run a meaningful analysis. On using differentially expressed genes, here is what is written in their FAQ:
I do not know if this is the case for all the tools, but it is definitely something to keep in mind. Cheers :)
Thank you @biofalconch for your answer! Indeed, I can understand their point. However, don't you think that you may get a lot of correlations that happen "by chance" if you are not controlling for random noise (from a baseline)? I guess when you correlate the modules with a disease, for instance, a lot of the genes in the module could be false positives... or is my reasoning flawed?
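To make the "by chance" worry concrete, here is a minimal sketch (not taken from any WGCNA material) that counts how often pairs of pure-noise "genes" look strongly correlated. The function name `chance_correlation_rate`, the gene count, and the 0.8 threshold are all hypothetical choices for illustration.

```python
import numpy as np

def chance_correlation_rate(n_samples, n_genes=200, threshold=0.8, seed=0):
    """Fraction of gene pairs with |Pearson r| > threshold in pure noise.

    Assumption: every "gene" is independent Gaussian noise, so any
    strong correlation found here is spurious by construction.
    """
    rng = np.random.default_rng(seed)
    expr = rng.normal(size=(n_genes, n_samples))   # genes x samples
    corr = np.corrcoef(expr)                       # gene-by-gene correlation matrix
    upper = corr[np.triu_indices(n_genes, k=1)]    # each pair counted once
    return float(np.mean(np.abs(upper) > threshold))

rate_small = chance_correlation_rate(n_samples=6)   # few samples
rate_large = chance_correlation_rate(n_samples=50)  # many samples
print(f"n=6:  {rate_small:.4f} of pairs exceed |r| > 0.8")
print(f"n=50: {rate_large:.4f} of pairs exceed |r| > 0.8")
```

With only a handful of samples, a noticeable share of unrelated pairs passes the threshold, while with many samples almost none do. This is one way to see why sample size matters so much for correlation-based methods.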
Yes! It may be bad to leave the whole dataset unfiltered, and the first part of the same FAQ question addresses this (I probably shouldn't have left it out). Here it is:
So what I got from this is "filter at your own risk"
WGCNA is indeed fundamentally based on correlation; that's how it initially identifies modules. Once a module is identified, it is summarized by singular value decomposition (essentially PCA) in order to derive the loading of each gene on that module. WGCNA is really great in certain situations.
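For anyone who wants to see what "summarizing a module by SVD" looks like in practice, here is a rough NumPy sketch. It is not the WGCNA code (WGCNA is an R package); the simulated module, the variable names, and the noise level are all assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_samples = 30, 40

# Hypothetical module: every gene follows one shared latent profile plus noise.
latent = rng.normal(size=n_samples)
weights = rng.uniform(0.5, 1.5, size=n_genes)
expr = np.outer(weights, latent) + 0.3 * rng.normal(size=(n_genes, n_samples))

# Standardize each gene across samples (mean 0, standard deviation 1).
z = (expr - expr.mean(axis=1, keepdims=True)) / expr.std(axis=1, keepdims=True)

# SVD of the genes x samples matrix: z = u @ diag(s) @ vt
u, s, vt = np.linalg.svd(z, full_matrices=False)

eigengene = vt[0]        # first right singular vector: one value per sample
loadings = u[:, 0]       # first left singular vector: one value per gene
var_explained = s[0] ** 2 / np.sum(s ** 2)

# Correlation of each gene with the eigengene (a "module membership"-style score).
membership = np.array([np.corrcoef(gene, eigengene)[0, 1] for gene in z])

print(f"variance explained by first component: {var_explained:.2f}")
print(f"|corr(eigengene, latent profile)|: {abs(np.corrcoef(eigengene, latent)[0, 1]):.2f}")
```

Note that the sign of a singular vector is arbitrary, so the eigengene may come out flipped relative to the underlying profile; that is why the last line reports an absolute correlation.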