Question: Building Gene Regulatory Networks From Literature
8.6 years ago by
Diana840 wrote:

Hi everyone!!

Does anyone know of any available software that would build gene regulatory networks from literature alone?


gene software network text • 4.3k views

I have heard of groups recently that manually or semi-manually encode regulatory networks from literature. There are a couple of software packages and format specifications for it. But maybe you are asking about automated inference of regulatory networks from high-throughput gene expression or proteomics assays.

Comment written 8.6 years ago by 141341254653464453.5k
8.6 years ago by
Casey Bergman18k
Athens, GA, USA
Casey Bergman18k wrote:

There was some work done on this by Saric et al. a few years back that was implemented in STRING:

Large-scale extraction of gene regulation for model organisms in an ontological context.

Extraction of regulatory gene/protein networks from Medline.

The RegulonDB team also did some work in this area:

Automatic reconstruction of a bacterial regulatory network using Natural Language Processing.

Unfortunately, as is the case with many text mining papers, no software was made publicly available for either of these systems, but you could use some of the resources to reconstruct a related system for yourself.

8.6 years ago by
Rochester, NY USA
Alex Paciorkowski3.4k wrote:

Diana, the issue with building any gene regulatory network based solely on literature is the amount of hand curation that still must go into your dataset. Otherwise, the resulting network may have little biological meaning.

Sometimes what is reported in the literature is true only in a specific developmental context. For example, transcription factor 1 may turn on transcription factor 2 only during weeks 4-10 of development of the organism in your area of interest, and otherwise the two don't interact. That could be a problem if some of the papers your algorithm mines include data from the adult end of the organism's lifespan, when the genes don't interact. Other times the literature is wrong, or when data on a gene was published no one knew there were actually three closely related genes, not just one. (I just finished a project where this was the case, so a lot of the old expression data on gene FOO is a mix of what we now know are the later-discovered genes FOOA, FOOB, and their cousin FOOC.) And so on.

Any network reconstruction project most wisely begins with a phase of expert curation of the dataset to be analyzed, to make sure you are comparing apples with apples, and oranges elsewhere. After that, manually checking your algorithm output ("Is our algorithm finding known interactions that we know to be true? If not, why not?") is also important. Otherwise you end up with an indigestible hairball of dubious biological relevance, or end up including things like "RNA polymerase" as a critical hub... Of course, having a last phase with wet-lab biological validation of at least the key interactions in your network is also important.

I guess my main message is that network reconstruction is more than just a question of software, it's a fairly complex undertaking by a team with various areas of expertise.

Eric Davidson wrote an elegant book (The Regulatory Genome) on gene regulatory networks, and the amount of downstream validation of the predictions in that work is truly impressive.
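
To make the "is our algorithm finding known interactions?" check and the "RNA polymerase as a critical hub" warning a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the function names, the gene names, the toy edge lists, and the 0.5 hub cut-off are hypothetical, and edges are simply (regulator, target) pairs. It is not a substitute for the expert curation described above, only a quick sanity check you might run on a mined network.

```python
def check_against_known(predicted, known):
    """Compare mined edges with a hand-curated list of trusted interactions.

    Both arguments are iterables of (regulator, target) pairs.
    Returns the fraction of known edges recovered and the ones missed.
    """
    predicted, known = set(predicted), set(known)
    missed = known - predicted
    recall = len(predicted & known) / len(known) if known else 0.0
    return recall, sorted(missed)


def flag_hubs(predicted, max_fraction=0.5):
    """List genes connected to more than max_fraction of all other genes.

    The 0.5 threshold is an arbitrary illustrative choice; such hubs
    (e.g. general transcription machinery) deserve a manual look.
    """
    neighbours = {}
    for regulator, target in predicted:
        neighbours.setdefault(regulator, set()).add(target)
        neighbours.setdefault(target, set()).add(regulator)
    n_nodes = len(neighbours)
    if n_nodes < 2:
        return []
    return sorted(
        gene
        for gene, nb in neighbours.items()
        if len(nb - {gene}) / (n_nodes - 1) > max_fraction
    )


# Hypothetical toy data: edges mined from text vs. a curated reference set.
predicted = [
    ("TF1", "TF2"),
    ("RNAPOL", "TF1"),
    ("RNAPOL", "TF2"),
    ("RNAPOL", "GENE3"),
    ("RNAPOL", "GENE4"),
]
known = [("TF1", "TF2"), ("TF2", "GENE3")]

recall, missed = check_against_known(predicted, known)
hubs = flag_hubs(predicted)
print(recall, missed, hubs)
```

On this toy input the missed edge and the flagged hub are exactly the things a curator would then go and inspect by hand.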

8.6 years ago by
United Kingdom
Duff660 wrote:

Hi Diana

You could look at the Agilent Literature Search plugin in Cytoscape. This takes a list of gene IDs and constructs a network using co-citation, I believe. If you're using Cytoscape you can carry out all sorts of network comparisons, overlaps, etc.
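
For anyone unsure what "co-citation" means in practice, here is a rough Python sketch of the general idea: two genes get an edge if they are mentioned in the same paper, weighted by how many papers mention both. This is only an illustration of the concept, not how the Agilent plugin is actually implemented; the function name, the `min_papers` parameter, and the toy paper-to-gene mapping are all made up for the example.

```python
from collections import Counter
from itertools import combinations


def cocitation_edges(papers, genes_of_interest, min_papers=1):
    """Count, for each gene pair, how many papers mention both genes.

    papers: dict mapping a paper ID to the list of genes it mentions
            (in a real setting this would come from a text-mining step).
    Returns {(gene_a, gene_b): n_papers} for pairs seen in at least
    min_papers papers; each pair is stored in alphabetical order.
    """
    wanted = set(genes_of_interest)
    counts = Counter()
    for genes in papers.values():
        mentioned = sorted(wanted & set(genes))
        for a, b in combinations(mentioned, 2):
            counts[(a, b)] += 1
    return {pair: n for pair, n in counts.items() if n >= min_papers}


# Hypothetical toy data: which genes each paper mentions.
papers = {
    "paper1": ["FOO", "BAR", "BAZ"],
    "paper2": ["FOO", "BAR"],
    "paper3": ["BAR", "QUX"],
}

edges = cocitation_edges(papers, ["FOO", "BAR", "BAZ"], min_papers=2)
print(edges)
```

Note that a co-citation edge only says two genes are discussed together, not that one regulates the other, so the resulting network still needs the kind of curation discussed elsewhere in this thread.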


