Question: Co-Occurrence Network Graph & Statistics
4 days ago by
biohacker_tobe20 wrote:

I am trying to make a co-occurrence network graph from my presence/absence data of genes per genome, but I am unsure how to go about it. I'm hoping to end up with something like the first image below.

Each gene would be linked to another gene if both are present in the same genomes, possibly with a larger circle indicating a gene that occurs more frequently. I originally tried the widyr and tidygraph packages, but I am unsure whether my data is compatible with them (see second image), as it has the BGCs as rows and the individual genomes as columns. I am examining the presence/absence pattern of each gene pair to determine whether it represents a coincident relationship; basically, whether gene i and gene j are observed together or apart in the input genomes more often than would be expected by chance.

1) Are there any suggestions on what packages/code I could use that would work with my data set, or how I could adapt my data set to work with these packages?

2) Are there any statistical tests you would recommend for establishing whether or not a coincident relationship exists?

[Image: co-occurrence network example]

Example of dataset

# Example of data set
# rows = genes
# cols = genomes
set.seed(2222)
df <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), 5)
colnames(df) <- letters[1:10]

[Image: binary data table example]

Thanks in advance

dataframe networks R • 87 views
3 days ago by
EMBL Heidelberg, Germany
Jean-Karim Heriche21k wrote:

To address question 1, I would suggest using the R igraph package. There's an excellent tutorial here. Starting from a square binary matrix A that can be treated as the adjacency matrix of the graph, you can do something like:

library(igraph)
G <- graph_from_adjacency_matrix(A)
plot(G)

In your case, however, you have a bipartite graph (genes on one side, genomes on the other). Your matrix is not square, so it is not an adjacency matrix, but it can be treated as an incidence matrix. You can either expand it to a full adjacency matrix and use the above, or you can do:

# In newer igraph releases this function is named graph_from_biadjacency_matrix()
G <- graph_from_incidence_matrix(A)

Then you just need to style the graph to your liking.
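
If what you want is a gene-to-gene network rather than the bipartite gene/genome one, one common option (not the only one) is to project the incidence matrix onto the genes by multiplying it with its transpose. The sketch below assumes the toy matrix from the question; the gene names and the vertex/edge scaling factors are made up for illustration.

```r
# Toy presence/absence matrix from the question: rows = genes, cols = genomes
set.seed(2222)
A <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), 5)
rownames(A) <- paste0("gene", 1:5)   # hypothetical gene names
colnames(A) <- letters[1:10]

# Gene x gene co-occurrence: entry [i, j] = number of genomes containing both genes
co <- (A * 1) %*% t(A * 1)
gene_freq <- diag(co)                # number of genomes in which each gene is present
diag(co) <- 0                        # drop self-links

# Plot only if igraph is installed
if (requireNamespace("igraph", quietly = TRUE)) {
  G <- igraph::graph_from_adjacency_matrix(co, mode = "undirected", weighted = TRUE)
  plot(G,
       vertex.size = 10 + 3 * gene_freq,     # larger circle = more frequent gene
       edge.width  = igraph::E(G)$weight)    # thicker edge = more shared genomes
}
```

The result is a square, symmetric matrix with no NAs, so it can be passed to graph_from_adjacency_matrix() without the non-square error.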

EDIT: Re-reading the question, I see you mean co-occurrence in question 2. There are a number of R packages from different fields that can do co-occurrence analysis from binary matrices such as: EcoSimR from ecology (see the co-occurrence analysis vignette) or quanteda from text analysis (tutorial).
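
If you just want a p-value per gene pair, a simple alternative to the packages above is a pairwise Fisher's exact test in base R. This is only a sketch on the toy matrix from the question, with made-up gene names; the two-sided test flags pairs that occur together or apart more often than expected, and the Benjamini-Hochberg correction is one assumed choice for handling the many pairs tested.

```r
# Toy data from the question: rows = genes, cols = genomes
set.seed(2222)
A <- matrix(sample(c(TRUE, FALSE), 50, replace = TRUE), 5)
rownames(A) <- paste0("gene", 1:5)   # hypothetical gene names

pairs <- t(combn(rownames(A), 2))    # every unordered gene pair, one per row
res <- data.frame(gene_i = pairs[, 1], gene_j = pairs[, 2])

# Fisher's exact test on the 2x2 presence/absence table of each pair
res$p_value <- apply(pairs, 1, function(p) {
  tab <- table(factor(A[p[1], ], levels = c(FALSE, TRUE)),
               factor(A[p[2], ], levels = c(FALSE, TRUE)))
  fisher.test(tab)$p.value
})

# Many pairs are tested, so adjust for multiple comparisons
res$p_adj <- p.adjust(res$p_value, method = "BH")
res[order(res$p_value), ]
```

Note that the test works directly on the rows of the non-square matrix, so no conversion to a square matrix is needed for this step.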


As you mentioned, my binary matrix is non-square. Is it possible to change it into a square matrix? The first method you provided gave me an error when I tried to run it:

Error in graph.adjacency.dense(adjmatrix, mode = mode, weighted = weighted,  : 
  At structure_generators.c:274 : Non-square matrix, Non-square matrix

The second line of code, for the incidence matrix, ran without issues apart from a small warning message.

For question 2 of my post, do you think it's possible to obtain something like a p-value statistic, i.e. a value representing the association between GCFs in this case?

I will definitely go through the tutorials provided for the statistical analysis. However, is it possible to conduct tests of this kind on non-square matrices, or would it be necessary to convert them to square ones? If they are converted, the conversion will most likely introduce NA values; can these be replaced with some other value?

modified 3 days ago • written 3 days ago by biohacker_tobe20