Pearson correlation with different row numbers across two datasets?
10 months ago
Faith ▴ 40

I'm trying to compute a Pearson correlation in R using rcorr() or cor().

My code is as simple as this:

correlation_results_tf <- rcorr(as.matrix(tf_numerical), as.matrix(p_numerical), type = 'pearson')
Error in cbind(x, y) : number of rows of matrices must match (see arg 2)


or

cor(as.matrix(tf_numerical), as.matrix(p_numerical))


Same here: incompatible dimensions.

The reason is that tf_numerical, which is subset from the original data frame, has 1K rows, while p_numerical has 500 rows. Is there any way to see the correlation between all the gene samples of tf and all the gene samples of p?

I would appreciate the help!

pearson-correlation • 1.1k views

For correlation, you need a 'like for like' comparison, so the two datasets must share either:

• the same number of samples
• the same number of genes / variables
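
To make the 'like for like' requirement concrete, here is a minimal sketch in base R. The matrices tf and p are simulated stand-ins (hypothetical data: 1000 and 500 genes measured over the same 20 samples), not the poster's actual tf_numerical and p_numerical. cor() pairs observations row by row, so the shared dimension has to be the rows. Transposing gene-by-sample matrices puts the samples in rows and the genes in columns.

```r
set.seed(1)
n_samples <- 20

# Hypothetical data: genes in rows, the same 20 samples in columns
tf <- matrix(rnorm(1000 * n_samples), nrow = 1000,
             dimnames = list(paste0("tf", 1:1000), paste0("s", 1:n_samples)))
p  <- matrix(rnorm(500 * n_samples), nrow = 500,
             dimnames = list(paste0("p", 1:500), paste0("s", 1:n_samples)))

# As-is, the row counts differ (1000 vs 500), so cor() raises an error
failed <- inherits(try(cor(tf, p), silent = TRUE), "try-error")

# After transposing, both matrices have 20 rows (the shared samples),
# and every tf gene is correlated with every p gene
r <- cor(t(tf), t(p), method = "pearson")
dim(r)  # 1000 rows (tf genes) by 500 columns (p genes)
```

This only makes sense if the columns of both matrices really are the same samples in the same order; that is an assumption of the sketch and worth checking on real data.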

So I did find a workaround. I transposed both data frames, so the dimension they have in common (previously the columns) is now used as the row dimension.

The problem now is that rcorr() just concatenates both data frames and computes correlations among all columns, even between columns within the same data frame.

I used cor() and it worked perfectly, correlating one data frame with the other. However, when I try to get the p-values with cor.test(), I keep getting an error saying that x and y are not the same length!


cor() and cor.test() behave differently. cor.test() can only correlate one vector against another of the same length; you then extract the p-value via cor.test(x, y)$p.value. cor(), by contrast, can correlate a vector against a matrix, or the columns of one matrix against the columns of another. If you wish to run cor.test() over a matrix, you need to call it once per pair of columns, either in a for loop that selects the columns on each iteration or via lapply().
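
As a sketch of the loop approach, assume two numeric matrices x and y with samples in rows and genes in columns (simulated here, so the data and column names are hypothetical). cor_pvalues is an illustrative helper name, not a function from any package. It calls cor.test() once per column pair and stores the p-value in a matrix laid out like the output of cor(x, y).

```r
set.seed(1)

# Hypothetical data: 20 shared samples in rows, genes in columns
x <- matrix(rnorm(20 * 5), nrow = 20, dimnames = list(NULL, paste0("tf", 1:5)))
y <- matrix(rnorm(20 * 3), nrow = 20, dimnames = list(NULL, paste0("p", 1:3)))

# Illustrative helper: p-value for every column of x against every column of y
cor_pvalues <- function(x, y, method = "pearson") {
  stopifnot(nrow(x) == nrow(y))  # cor.test() needs vectors of equal length
  pvals <- matrix(NA_real_, nrow = ncol(x), ncol = ncol(y),
                  dimnames = list(colnames(x), colnames(y)))
  for (i in seq_len(ncol(x))) {
    for (j in seq_len(ncol(y))) {
      pvals[i, j] <- cor.test(x[, i], y[, j], method = method)$p.value
    }
  }
  pvals
}

r_mat <- cor(x, y)          # correlation coefficients, 5 x 3
p_mat <- cor_pvalues(x, y)  # matching p-values, 5 x 3
```

With 1K by 500 genes this means 500,000 separate tests, so the loop may take a while, and it is probably worth adjusting the resulting p-values for multiple testing, for example with p.adjust().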