Create Gene Network analysis in R
8.8 years ago

Hey everyone!

I have a data frame in R whose columns are gene names and whose rows are isolate names. Each value is either 1 or 0: 1 means the gene is present in the isolate, 0 means it is not. So, all in all, the data frame is a presence/absence table showing which isolate has which genes.

Now I want to do a network analysis to see which genes are most likely to co-occur, assess the strength of their connection and so on.

In R, I have tried the following:

library(igraph)
library(network)
library(sna)
library(ndtv)

Genematrix <- data.matrix(df)
g <- network(Genematrix, directed=FALSE)
summary(g)
plot(g)

What I get from this is a network object with 236 vertices. But what I actually want is the genes as vertices (21 columns), so I can see the clusters and connections between them. In many tutorials I have seen that I need an edge list and a node list. I think the edge list is what I get from g <- network(Genematrix, directed=FALSE), but I don't know how to get the node list.

Can anyone explain how to solve my problem and what I have to do to get the network I want?

8.8 years ago
russhh 5.7k

Your columns are the gene names, so what you want is a gene-gene adjacency matrix to feed into igraph/network etc. You could also do this with an incidence matrix (rows = vertices, cols = edges). Your current data frame looks almost like an incidence matrix, but it isn't one: some columns may have more than 2 non-zero entries, so they don't represent edges, and the vertices you want in your graph are in the columns rather than the rows.

The following should convert your dataframe into an adjacency matrix (the edge between vertices u and v being weighted by the number of isolates where they were both 1)

m <- as.matrix(df) # note that df is also the name of a function in R (the F-distribution density in the stats package), so it isn't a very good variable name
adj.m <- t(m) %*% m
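
To see what that matrix product computes, here is a minimal toy example in base R. The isolate and gene names and the values are made up for illustration and are not your data.

```r
# Toy presence/absence matrix: 4 isolates (rows) x 3 genes (columns).
# All names and values are hypothetical.
m <- matrix(c(1, 1, 0,
              1, 1, 1,
              0, 1, 1,
              1, 0, 0),
            nrow = 4, byrow = TRUE,
            dimnames = list(paste0("iso", 1:4),
                            c("geneA", "geneB", "geneC")))

# Entry [u, v] is the number of isolates in which genes u and v are both 1.
adj.m <- t(m) %*% m
adj.m
#       geneA geneB geneC
# geneA     3     2     1
# geneB     2     3     2
# geneC     1     2     2
```

The result is a symmetric genes-by-genes matrix, and each diagonal entry is simply the number of isolates carrying that gene (the same as colSums(m)).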

Note that the diagonal entries of adj.m just count the isolates that carry each gene. They are of no value in your analysis (in a graph they would show up as self-loops), so I think you can do something like the following to get rid of them

diag(adj.m) <- 0
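
Once the diagonal is zeroed, the matrix can be handed to igraph to get the genes as vertices, together with a node list and a weighted edge list. This is only a sketch: the small adj.m below is a hypothetical stand-in for your real matrix, and the calls are prefixed with igraph:: because packages loaded together (network, sna, igraph) can mask each other's function names.

```r
library(igraph)

# Hypothetical stand-in for the real co-occurrence matrix (diagonal already 0).
genes <- c("geneA", "geneB", "geneC")
adj.m <- matrix(c(0, 2, 1,
                  2, 0, 2,
                  1, 2, 0),
                nrow = 3, byrow = TRUE,
                dimnames = list(genes, genes))

# Undirected graph; non-zero entries become edges with a 'weight' attribute.
g <- igraph::graph_from_adjacency_matrix(adj.m, mode = "undirected",
                                         weighted = TRUE, diag = FALSE)

# Node list (gene names) and edge list (columns from, to, weight).
nodes <- igraph::V(g)$name
edges <- igraph::as_data_frame(g, what = "edges")

# Draw thicker lines for gene pairs that co-occur in more isolates.
plot(g, edge.width = igraph::E(g)$weight)
```

Pairs of genes that never co-occur get a 0 in adj.m and therefore no edge at all, so the edge list only contains pairs seen together in at least one isolate.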

Russ

Works, thank you! I'll read up on the why :-)
