Question: Filtering genes by Pearson correlation
mannoulag1 wrote, 2.1 years ago:

Hi biostars,

I computed Pearson correlations on my data (an expression matrix) and kept only the correlations above 0.8 in absolute value. How can I obtain the sub-matrix of the expression matrix that contains only these highly correlated genes? Thank you.

data<-t(matrix)
cor = cor(data, use="pairwise.complete.obs", method="pearson")
cor<-cor[abs(cor)>0.8]
Tags: rna-seq, R, cor()
Jean-Karim Heriche (EMBL Heidelberg, Germany) wrote, 2.1 years ago:

Extract the indices of the genes of interest with which() and the arr.ind option, e.g.

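# idx is a two-column matrix of (row, col) positions in the correlation matrix
idx <- which(abs(cor) > 0.8, arr.ind = TRUE)
# genes are the rows of the original (untransposed) matrix, not of data
correlated.genes <- matrix[idx, ]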
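
To make the output of which() with arr.ind = TRUE concrete, here is a minimal sketch on a made-up 3 x 3 correlation matrix. The name cor.mat, the gene labels g1 to g3 and the values are assumptions for illustration only. It also shows that the diagonal always passes the threshold, so a gene with no correlated partner (g3 here) is still returned unless the diagonal is excluded.

```r
# Toy correlation matrix for three hypothetical genes (values are made up).
cor.mat <- matrix(c(1.0, 0.9, 0.1,
                    0.9, 1.0, 0.2,
                    0.1, 0.2, 1.0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("g1", "g2", "g3"), c("g1", "g2", "g3")))

# One row per matching cell; the columns "row" and "col" hold its position.
idx <- which(abs(cor.mat) > 0.8, arr.ind = TRUE)
print(idx)

# The diagonal (each gene with itself) is always 1, so g3 shows up as well.
genes.hit <- rownames(cor.mat)[unique(idx[, "row"])]
print(genes.hit)
```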

Thank you, Jean-Karim. I did this:

# cor is symmetric, so we can keep only half of the index pairs (the upper triangle)
idx <- which((abs(cor) > 0.8) & upper.tri(cor), arr.ind = TRUE)
correlated.genes <- matrix[idx, ]

Do I then have to remove the duplicated genes from 'correlated.genes'?

Reply written 2.1 years ago by mannoulag1

That was just to give you a quick pointer. I think what you want is the set of unique indices, something like:

idx <- which( (abs(cor) > 0.8) & (upper.tri(cor)), arr.ind=TRUE)
idx <- unique(c(idx[, 1], idx[, 2]))
correlated.genes <- matrix[idx, ]
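
The whole approach can be sketched end to end on a small made-up expression matrix with genes in rows and samples in columns. The names expr, cor.mat, pairs and keep, and all the values, are assumptions for illustration. The drop = FALSE argument is an addition that keeps the result a matrix even when a single gene is selected.

```r
# Hypothetical expression matrix: 4 genes (rows) x 6 samples (columns).
expr <- rbind(
  geneA = c(1, 2, 3, 4, 5, 6),
  geneB = c(2, 4, 6, 8, 10, 13),  # tracks geneA closely
  geneC = c(6, 5, 4, 3, 2, 1),    # anti-correlated with geneA
  geneD = c(3, 3, 1, 6, 2, 5)     # no strong relationship to the others
)

# cor() correlates columns, so transpose to get gene-by-gene correlations.
cor.mat <- cor(t(expr), use = "pairwise.complete.obs", method = "pearson")

# Upper triangle only: skips the diagonal and the mirrored (j, i) pairs.
pairs <- which(abs(cor.mat) > 0.8 & upper.tri(cor.mat), arr.ind = TRUE)

# A gene can occur in several pairs, so collapse to unique gene indices.
keep <- sort(unique(c(pairs[, 1], pairs[, 2])))

# Subset the original matrix; drop = FALSE preserves the matrix shape.
correlated.genes <- expr[keep, , drop = FALSE]
print(rownames(correlated.genes))
```

Because abs() is applied, strongly anti-correlated genes such as geneC are kept too; drop the abs() call if only positive correlations are wanted.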
Reply written 2.1 years ago by Jean-Karim Heriche

Thank you, Jean-Karim.

Reply written 2.1 years ago by mannoulag1