The best single metric for a correlation matrix
2.5 years ago

Hi, I am working with human microbiome and host gene expression data from two disease conditions, and I want to investigate the correlation between bacterial abundance and host (human) gene expression.

For example, I have a correlation table similar to Fig.3A in this paper (below).

https://genomemedicine.biomedcentral.com/articles/10.1186/s13073-020-0710-2

Each cell of this heatmap represents the correlation between the expression of one host gene and the abundance of one matched taxon across samples (Spearman's rho or Pearson's r).
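For concreteness, a matrix of this kind could be built roughly as follows. This is only a minimal sketch on simulated data: the object names (`expr`, `abund`, `cor_mat`), the dimensions, and the random values are all hypothetical, and rows are assumed to be the same samples in both tables.

```r
set.seed(1)
n_samples <- 50

# Hypothetical data: rows are samples, columns are genes or taxa.
expr <- matrix(rnorm(n_samples * 4), nrow = n_samples,
               dimnames = list(NULL, paste0("gene", 1:4)))
abund <- matrix(rnorm(n_samples * 3), nrow = n_samples,
                dimnames = list(NULL, paste0("taxon", 1:3)))

# With two matrices, cor() correlates every column of the first
# with every column of the second, giving a genes x taxa matrix.
cor_mat <- cor(expr, abund, method = "spearman")
dim(cor_mat)
```

Each entry `cor_mat[i, j]` is then the Spearman correlation between gene `i` and taxon `j` across the samples.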

I have two questions:

(1) What is the best metric (or score) summarizing all correlations in this matrix?

(2) How can I statistically compare two matrices (from different disease conditions) using this metric (score)?

Any suggestions? Thanks in advance!

correlation • 818 views
2.5 years ago
n,n ▴ 360

I don't know if I am understanding correctly, but it doesn't seem like you need to summarize the matrices. If you just want to see whether the correlations in the cells differ between the two groups, you can treat the two matrices as two samples of observations and run a routine statistical test such as the Wilcoxon test. Something like this in R:

wilcox.test(as.vector(matrix1), as.vector(matrix2))
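A self-contained version of that call, as a sketch on made-up numbers: `matrix1` and `matrix2` here are filled with random values standing in for the two correlation matrices, so the result itself is meaningless. If both matrices cover the same gene-taxon pairs in the same order, `paired = TRUE` might be the more natural choice, but that is an assumption about the data.

```r
set.seed(42)

# Hypothetical stand-ins for the two correlation matrices (4 genes x 5 taxa).
matrix1 <- matrix(runif(20, min = -1, max = 1), nrow = 4)
matrix2 <- matrix(runif(20, min = -1, max = 1), nrow = 4)

# Flatten each matrix to a vector of correlations and compare the two
# distributions with an unpaired Wilcoxon rank-sum test.
res <- wilcox.test(as.vector(matrix1), as.vector(matrix2))
res$p.value

# Variant for matched cells (same gene-taxon pair in both matrices).
res_paired <- wilcox.test(as.vector(matrix1), as.vector(matrix2), paired = TRUE)
res_paired$p.value
```

Note that cells sharing a gene or a taxon are not independent observations, so the p-value is probably best read as a rough guide.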

Thanks for your suggestion! One of my concerns is that the two data sets (disease conditions) have different numbers of patient samples (say, n=100 and n=300). Is it OK to compare Spearman's rho values from the two data sets in this setting?
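To make the concern concrete: for a single gene-taxon pair, one standard way to take unequal sample sizes into account is Fisher's z-transformation, which weights each correlation by its own n. The sketch below assumes independent samples and Pearson-type correlations (for Spearman's rho the standard error is only approximate); the helper name `compare_cor` and the example values are hypothetical.

```r
# Compare two correlations from independent samples of sizes n1 and n2
# using Fisher's z-transformation (atanh).
compare_cor <- function(r1, n1, r2, n2) {
  # Standard error of the difference between the two transformed values.
  se <- sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
  z <- (atanh(r1) - atanh(r2)) / se
  # Two-sided p-value from the standard normal distribution.
  p <- 2 * pnorm(-abs(z))
  c(z = z, p.value = p)
}

# Hypothetical correlations for one cell in the two conditions.
compare_cor(r1 = 0.40, n1 = 100, r2 = 0.25, n2 = 300)
```

This works per cell, so applying it across a whole matrix would also need a multiple-testing correction.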
