Question: Correlated/coexpressed genes, given the name of a gene
Deepak Tanwar (3.8k), ETH Zürich, Switzerland, wrote 3.2 years ago:

I have a gene expression dataset and want to find all the genes that are highly correlated with a given gene, i.e. that have the same expression pattern.

For example, an R package for this would take an expression data matrix as one input and a gene of interest as another, and return a list of genes that have the same expression pattern as that gene or are highly correlated with it.

I know I could write an R function that computes the correlations and keeps only the genes with a correlation value above 0.5. But it takes a very long time when the data has more than 20,000 rows and 1,000 columns.

correlation R • 1.9k views
modified 3.2 years ago by TriS (3.3k) • written 3.2 years ago by Deepak Tanwar (3.8k)

Could you share your code so we can see why it is so slow? I gave it a try on my computer with a toy example (20,000 rows and 1,000 columns) and it took only a few seconds. You should be able to do this in R without any problem.
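
A minimal sketch of what such a toy example could look like, assuming a numeric matrix with genes in rows and samples in columns. The names `expr`, `correlated_genes`, `goi` and `cutoff` are made up for illustration, and the matrix is kept small here (raise `n_genes` and `n_samples` to mimic the real data). It computes every correlation in a single `cor()` call instead of a loop, and it returns correlation values only, no p-values.

```r
set.seed(1)
n_genes <- 2000
n_samples <- 200
# Toy expression matrix: genes in rows, samples in columns (made-up names)
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)),
                               paste0("s", seq_len(n_samples))))
# Plant one gene that tracks gene1 so the filter has something to find
expr["gene2", ] <- expr["gene1", ] + rnorm(n_samples, sd = 0.5)

# Hypothetical helper: correlate one gene against all rows at once
correlated_genes <- function(expr, goi, cutoff = 0.5, method = "spearman") {
  # cor() works column-wise, so transpose to put genes in columns
  r <- cor(t(expr), expr[goi, ], method = method)[, 1]
  r <- r[names(r) != goi]                      # drop the gene of interest itself
  keep <- !is.na(r) & abs(r) > cutoff
  out <- data.frame(Gene = rep(goi, sum(keep)), Cor_Gene = names(r)[keep],
                    Correlation_Value = unname(r[keep]),
                    stringsAsFactors = FALSE)
  out[order(-abs(out$Correlation_Value)), ]    # strongest correlations first
}

hits <- correlated_genes(expr, "gene1")
```

With purely random data, only the planted gene should pass the 0.5 cutoff.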

written 3.2 years ago by Philippe (1.8k)

high_correlation_pval_table <- NULL
# iterate through all genes
for(i in 1:nrow(log2_BRCA)){
  # correlation (stored as `ct` so the result does not reuse the name of the cor() function)
  ct <- cor.test(as.numeric(log2_BRCA[which(substr(rownames(log2_BRCA), start = 1, stop = 6) == "SLC5A5"),]),
                 as.numeric(log2_BRCA[i,]), method = "spearman", alternative = "two.sided")
  est <- as.numeric(ct$estimate)
  # conditions: skip weak (< 0.5) and perfect (== 1, e.g. the gene itself) correlations;
  # the third condition was garbled in the post and is assumed here to be an NA check
  if(is.na(est) || abs(est) < 0.5 || abs(est) == 1){
    next
  }
  high_correlation_pval_table <- rbind(high_correlation_pval_table, data.frame(Gene = "SLC5A5", Cor_Gene = rownames(log2_BRCA)[i],
                                                                                 Correlation_Value = est,
                                                                                 p_value = as.numeric(ct$p.value),
                                                                                 stringsAsFactors = FALSE, check.names = FALSE))
}

written 3.2 years ago by Deepak Tanwar (3.8k)

Deepak, could you help me as well?

Suppose I have already downloaded GSE63706 and normalized it, so I now have a normalized text file. I also have a list of probe sets of interest from this array (as a text file). I want a heatmap showing the expression pattern of these probe sets in this array. For example, the array covers 4 varieties, different tissues (rind and flesh) and different phases (0, 10, 20, 30, 40 and 50 days after harvesting).

modified 2.8 years ago • written 2.8 years ago by F (3.0k)

A heatmap is not a problem at all. There is an R package called pheatmap, and it offers a very easy way to show the groups mentioned above alongside the heatmap; these are called the annotations of a heatmap. Check out this Biostars post: How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?

It includes code as well. You could use any heatmap package, and there are many: pheatmap, heatmap, heatmap3, heatmap.3, heatmap.2, whatever you prefer. Read the Biostars post mentioned above carefully, try out the example code, and then adapt it to your data.
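
As a rough sketch of the idea, here is a version with base R's `heatmap()` and its `ColSideColors` argument. Everything in it is a stand-in: the sample sheet, the random matrix `expr` and the probe names in `my_probes` are invented, so replace them with the normalized table and the probe set list read from your own files. pheatmap takes a similar sample table as a data frame through its `annotation_col` argument.

```r
set.seed(1)
# Hypothetical sample sheet: 2 tissues x 3 time points (stand-in for the real design)
samples <- expand.grid(tissue = c("rind", "flesh"), day = c(0, 10, 20),
                       stringsAsFactors = FALSE)
rownames(samples) <- paste(samples$tissue, samples$day, sep = "_")

# Stand-in for the normalized expression table (probe sets in rows)
expr <- matrix(rnorm(50 * nrow(samples)), nrow = 50,
               dimnames = list(paste0("probe", 1:50), rownames(samples)))
my_probes <- c("probe3", "probe7", "probe12", "probe30")  # probe sets of interest

# Keep only the probe sets of interest
sub <- expr[rownames(expr) %in% my_probes, , drop = FALSE]

# One colour per sample, chosen by tissue, for the side bar above the columns
tissue_cols <- unname(c(rind = "darkgreen", flesh = "orange")[samples$tissue])

heatmap(sub, ColSideColors = tissue_cols, scale = "row", margins = c(8, 6))
```

Base `heatmap()` draws a single colour bar per side, which is why the packages listed above are handy when variety, tissue and phase should all be shown at once.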

written 2.8 years ago by Deepak Tanwar (3.8k)

Thank you very much.

written 2.8 years ago by F (3.0k)
TriS (3.3k), Buffalo, United States, wrote 3.2 years ago:

The loop below is pretty fast with 1,500 columns and 30,000 rows:


mat <- replicate(1500, rnorm(30000)) 
gene <- mat[sample(1:30000,1),]

resp <- c()
est <- c()
for (i in 1:30000){
  x <- cor.test(gene, mat[i,])
  # keep the p-value and estimate only for strongly correlated rows
  if (abs(x$estimate[[1]]) > 0.5){
    resp <- c(resp, x$p.value)
    est <- c(est, x$estimate[[1]])
  }
}

 user  system elapsed 
   7.69    0.02    7.71  
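
If the loop is still too slow, the same estimates and p-values can be obtained without calling `cor.test()` once per row. The sketch below is only an illustration under two assumptions: Pearson correlation (the `cor.test()` default used above) and no missing values. It derives the two-sided p-value from the usual t statistic, t = r * sqrt((n - 2) / (1 - r^2)) with n - 2 degrees of freedom. The matrix is smaller than above so it runs instantly; `tstat`, `pval` and `keep` are names introduced here.

```r
set.seed(42)
mat  <- replicate(150, rnorm(3000))   # 3,000 rows x 150 columns
gene <- mat[1, ]                      # gene of interest (row 1 of the same matrix)

n <- ncol(mat)
# Pearson r of every row against `gene`, computed in one call
r <- cor(t(mat), gene)[, 1]

# t statistic and two-sided p-value for a Pearson correlation
tstat <- r * sqrt((n - 2) / (1 - r^2))
pval  <- 2 * pt(-abs(tstat), df = n - 2)

# same filter as in the loop
keep <- which(abs(r) > 0.5)
est  <- r[keep]
resp <- pval[keep]
```

Spearman p-values are computed differently by `cor.test()`, so this shortcut should not be assumed to reproduce them.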
modified 3.2 years ago • written 3.2 years ago by TriS (3.3k)