A function for having a multiway intersection
3.9 years ago

Hi,

I have several time points, and I also have the differentially expressed genes for each possible combination of these time points. How can I write code to extract the common genes from each possible pairwise combination of these lists?

R Venn diagram intersection • 1.3k views

Hello jivarajivaraj!

It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/4533

This is typically not recommended as it runs the risk of annoying people in both communities.

3.9 years ago

In R, just use the intersect() function as in

intersect(2h, c(4h, 6h, ...))


where xh is replaced by the corresponding gene set.

Edit: I realized that you may actually mean 2h vs 4h, then 2h vs 6h, and so on. In that case, just iterate intersect() over all relevant combinations.
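One way to iterate intersect() over every pair is sketched below. It assumes the gene sets are stored in a named list of character vectors; the names h2, h4, h6 and the gene IDs are made-up toy values, not data from this thread.

```r
# Hypothetical toy gene sets; replace with your own character vectors.
sets <- list(
  h2 = c("geneA", "geneB", "geneC", "geneD"),
  h4 = c("geneB", "geneC", "geneE"),
  h6 = c("geneC", "geneD", "geneE", "geneF")
)

# All unordered pairs of set names, e.g. c("h2", "h4").
pairs <- combn(names(sets), 2, simplify = FALSE)

# Intersect the two sets of each pair.
common <- lapply(pairs, function(p) intersect(sets[[p[1]]], sets[[p[2]]]))

# Label each result with the pair it came from, e.g. "h2_vs_h4".
names(common) <- vapply(pairs, paste, character(1), collapse = "_vs_")

print(common)
```

With 8 time points this gives 28 pairwise intersections, each retrievable by name, for example common$h2_vs_h4.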


Thank you. The numbers of genes in these lists are not equal. For example, there are 630 common genes between h2 and h4, but I have 1143 genes in the h2 list and 768 genes in the h4 list, so h2 vs h4 would be 630/1143 and h4 vs h2 would be 630/768. It would be great to have a matrix or plot of all possible pairwise combinations of my 8 time points.

intersect(2h, c(4h, 6h, ...))

list()


What you're looking for is essentially a matrix of similarity between sets. A typical measure is the Jaccard index, which can be computed like this:

jaccard_index <- function(x, y) {
  # number of elements shared by both sets
  n_common <- length(intersect(x, y))
  # size of the union = |x| + |y| - shared (assumes x and y hold no duplicates)
  n_common / (length(x) + length(y) - n_common)
}
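For reference, the function above computes the standard Jaccard index. As a worked check, the formula can be applied to the counts quoted earlier in the thread (630 shared genes, 1143 genes in h2, 768 genes in h4):

```latex
J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{|A \cap B|}{|A| + |B| - |A \cap B|}
\qquad
J(h2, h4) = \frac{630}{1143 + 768 - 630} = \frac{630}{1281} \approx 0.492
```

Note that this measure is symmetric, so J(h2, h4) = J(h4, h2), unlike the two ratios 630/1143 and 630/768.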


You can easily compute the similarity matrix with for loops:

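# h2, h4, ..., h16 stand for the character vectors of gene IDs at each time point
sets <- list(h2, h4, h6, h8, h10, h12, h14, h16)
S <- matrix(NA, nrow = length(sets), ncol = length(sets))
for (i in seq_along(sets)) {
  for (j in i:length(sets)) { # matrix is symmetric so only compute the upper triangular part
    S[i, j] <- jaccard_index(sets[[i]], sets[[j]]) # [[ ]] extracts the vector, [ ] would return a list
    S[j, i] <- S[i, j]
  }
}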
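If the asymmetric ratios from the question are wanted instead (shared genes divided by the size of the row set), a matrix can be sketched as follows. This is only an illustration: overlap_fraction is a hypothetical helper name, and the two sets are synthetic IDs built to reproduce the counts quoted above (1143, 768, and 630 shared).

```r
# Synthetic sets sized to match the counts in the thread:
# h2 has 1143 IDs, h4 has 768 IDs, and they share 630 (g514 to g1143).
sets <- list(
  h2 = paste0("g", 1:1143),
  h4 = paste0("g", 514:1281)
)

# Hypothetical helper: entry [i, j] is |set_i intersect set_j| / |set_i|.
# An empty set_i would give NaN for its row.
overlap_fraction <- function(sets) {
  n <- length(sets)
  M <- matrix(NA_real_, nrow = n, ncol = n,
              dimnames = list(names(sets), names(sets)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      shared <- length(intersect(sets[[i]], sets[[j]]))
      M[i, j] <- shared / length(sets[[i]])
    }
  }
  M
}

M <- overlap_fraction(sets)
print(round(M, 3))
```

Here M["h2", "h4"] is 630/1143 and M["h4", "h2"] is 630/768, so the matrix is not symmetric. For a quick plot, base R's heatmap(M, Rowv = NA, Colv = NA, scale = "none") is one option.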


Sorry, it says:

Error: object 'i' not found


I have a character vector for each time point; I put them in a list and ran your function.

> str(sets)
List of 9

>
