Question: Correlated/co-expressed genes, given the name of a gene


Deepak Tanwar • **3.9k** wrote:

Hi,

I have a gene expression dataset and I want to find all the genes that are highly correlated with a given gene, i.e. that have the same expression pattern.

For example, an R package for this would take the expression data matrix as one input and the gene of interest as another, and return a list of genes that have the same expression pattern or are highly correlated.

I know that I could write an R function for this, filter the genes by their correlation values, and keep those with a correlation above 0.5. But it takes a very long time when the data has more than 20,000 rows and 1,000 columns.

modified 3.9 years ago by TriS • **3.5k** • written 3.9 years ago by Deepak Tanwar • **3.9k**

Could you share your code so we can see why it is so slow? I gave it a try on my computer with a toy example (20,000 rows and 1,000 columns) and it only took a few seconds. You should be able to do this in R without any problem.
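For reference, a toy example along these lines could be sketched as below. This is only a sketch under stated assumptions: the matrix is random data, the gene names are made up, and `top_correlated` is a hypothetical helper, not a function from any package. The idea is that `cor()` accepts a vector and a matrix, so all correlations against the gene of interest come out of a single call instead of a loop.

```r
set.seed(1)

# Toy dimensions (assumption); increase to 20000 x 1000 to mimic the real data
n_genes   <- 2000
n_samples <- 200

# Random expression matrix with made-up gene names as row names
expr <- matrix(rnorm(n_genes * n_samples), nrow = n_genes,
               dimnames = list(paste0("gene", seq_len(n_genes)), NULL))

# Plant one gene that follows the pattern of gene1
expr["gene2", ] <- expr["gene1", ] + rnorm(n_samples, sd = 0.5)

# Hypothetical helper: correlate one gene against every row of the matrix
top_correlated <- function(expr, gene, cutoff = 0.5, method = "spearman") {
  stopifnot(gene %in% rownames(expr))
  # cor(vector, matrix) correlates the vector with each column of the matrix,
  # so the genes have to be in columns: hence t(expr)
  r <- cor(expr[gene, ], t(expr), method = method)[1, ]
  r <- r[names(r) != gene]                  # drop the gene itself
  keep <- !is.na(r) & abs(r) >= cutoff      # drop missing and weak correlations
  r <- r[keep]
  r[order(abs(r), decreasing = TRUE)]       # strongest correlations first
}

hits <- top_correlated(expr, "gene1")
```

`hits` is a named numeric vector, so `names(hits)` gives the list of correlated genes and the values give the correlation coefficients.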

high_correlation_pval_table <- NULL

# expression values of the gene of interest, extracted once instead of on every iteration
target <- as.numeric(log2_BRCA[which(substr(rownames(log2_BRCA), start = 1, stop = 6) == "SLC5A5"), ])

# iterate through all genes
for (i in seq_len(nrow(log2_BRCA))) {
  # Spearman correlation between the gene of interest and gene i
  res <- cor.test(target, as.numeric(log2_BRCA[i, ]), method = "spearman", alternative = "two.sided")
  rho <- as.numeric(res$estimate)

  # skip missing, weak (|rho| < 0.5) and perfect (|rho| == 1, e.g. the gene itself) correlations
  if (is.na(rho) || abs(rho) < 0.5 || abs(rho) == 1) {
    next
  }

  high_correlation_pval_table <- rbind(high_correlation_pval_table,
                                       data.frame(Gene = "SLC5A5",
                                                  Cor_Gene = rownames(log2_BRCA)[i],
                                                  Correlation_Value = rho,
                                                  p_value = as.numeric(res$p.value),
                                                  stringsAsFactors = FALSE, check.names = FALSE))
}
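The loop above calls `cor.test` once per gene and grows the data frame with `rbind` on every hit, which are the likely reasons it is slow. A loop-free sketch that builds the same kind of table is shown below. Note the assumptions: `correlation_table` and the toy matrix are hypothetical, and the p-values come from the usual t approximation for a correlation coefficient, so they can differ slightly from `cor.test`, which may use an exact test for Spearman on small samples without ties.

```r
# Hypothetical helper: same columns as the loop version, but vectorized
correlation_table <- function(expr, target, label = target, cutoff = 0.5) {
  n <- ncol(expr)                                   # number of samples
  # Spearman correlation of the target row against every row at once
  rho <- cor(expr[target, ], t(expr), method = "spearman")[1, ]
  # t approximation for the two-sided p-value (an approximation, see above)
  t_stat <- rho * sqrt((n - 2) / (1 - rho^2))
  p <- 2 * pt(-abs(t_stat), df = n - 2)
  # same filters as the loop: drop NA, weak and perfect correlations,
  # and drop the target gene itself explicitly
  keep <- !is.na(rho) & abs(rho) >= cutoff & abs(rho) != 1 &
    rownames(expr) != target
  data.frame(Gene = rep(label, sum(keep)),
             Cor_Gene = rownames(expr)[keep],
             Correlation_Value = unname(rho[keep]),
             p_value = unname(p[keep]),
             stringsAsFactors = FALSE, check.names = FALSE)
}

# Toy data (assumption): 500 genes x 100 samples, with g2 made to follow g1
set.seed(42)
expr <- matrix(rnorm(500 * 100), nrow = 500,
               dimnames = list(paste0("g", 1:500), NULL))
expr["g2", ] <- expr["g1", ] + rnorm(100, sd = 0.5)

tab <- correlation_table(expr, "g1")
```

With the real data, the call would take the matrix and the full row name of the gene of interest, and `label` can be set to a shorter display name such as "SLC5A5".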

Deepak, could you help me with something?

Suppose I have already downloaded GSE63706 and normalized it, so I now have a normalized text file. I also have a list of probe sets of interest from this array (as a text file). I want a heat map showing the expression pattern of these probe sets in this array. For example, the array covers 4 varieties, different tissues (rind and flesh), and several phases (0, 10, 20, 30, 40 and 50 days after harvesting).

A heatmap is not a problem at all. There is an R package called pheatmap, and it has a very easy way to show the groups you mention alongside the heatmap. These are called the annotations of a heatmap. Check out this thread: How Do I Draw A Heatmap In R With Both A Color Key And Multiple Color Side Bars?

There is code there as well. You could use any heatmap package; there are many: pheatmap, heatmap, heatmap3, heatmap.3, heatmap.2, whichever you prefer. Read the Biostars thread mentioned above carefully, try out the example code, and then adapt it to your data.
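As a rough illustration of what annotations look like with pheatmap, here is a minimal sketch. Everything in it is made up for the example: the sample design, the probe names and the random expression values are assumptions, and with real data the matrix and the probe list would be read from the normalized text file and the probe-set file instead. The pheatmap call is guarded so the rest of the code still runs if the package is not installed.

```r
set.seed(1)

# Hypothetical sample design: 2 varieties x 2 tissues x 3 time points
design <- expand.grid(Variety = c("V1", "V2"),
                      Tissue  = c("rind", "flesh"),
                      Days    = c("0", "10", "20"),
                      stringsAsFactors = TRUE)
rownames(design) <- paste0("sample", seq_len(nrow(design)))

# Toy normalized expression matrix: probes in rows, samples in columns
expr <- matrix(rnorm(30 * nrow(design)), nrow = 30,
               dimnames = list(paste0("probe", 1:30), rownames(design)))

# Probe sets of interest (made-up IDs)
probes_of_interest <- c("probe3", "probe7", "probe12", "probe20")
sub <- expr[rownames(expr) %in% probes_of_interest, , drop = FALSE]

# annotation_col expects a data frame whose row names match the
# column names of the matrix; each column becomes one annotation bar
if (requireNamespace("pheatmap", quietly = TRUE)) {
  pheatmap::pheatmap(sub, annotation_col = design, scale = "row")
}
```

Each column of `design` (variety, tissue, days after harvesting) is drawn as a colored bar above the heatmap, which is how the groups can be shown next to the expression pattern.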

Thank you very much.
