Tutorial: Bulk RNA-seq: Protein-protein interaction (PPI) analysis with STRING-db
3 months ago
Julia Ma

STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they stem from computational prediction, from knowledge transfer between organisms, and from interactions aggregated from other (primary) databases.

This tutorial uses Python to construct a protein-protein interaction network.

Colab_Reproducibility: https://colab.research.google.com/drive/1ReLCFA5cNNcem_WaMXYN9da7W0GN4gzl?usp=sharing

import omicverse as ov

Prepare data

Here we use the STRING-db example data for the analysis.

FAA4 and its ten most confident interactors.

FAA4 in yeast is a long chain fatty acyl-CoA synthetase; see it connected to other synthetases as well as regulators.

Saccharomyces cerevisiae

NCBI taxonomy Id: 4932

Other names: ATCC 18824, Candida robusta, NRRL Y-12632, S. cerevisiae, Saccharomyces capensis, Saccharomyces italicus, Saccharomyces oviformis, Saccharomyces uvarum var. melibiosus, lager beer yeast, yeast


We also need to assign each gene a type and a color. Here we arbitrarily label the first 5 genes Type1 and the rest Type2.
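The labeling step can be sketched in plain Python as below. This is a minimal sketch, not the tutorial's original cell: the helper name label_genes and the two hex colors are hypothetical, and only FAA4, FAT1, FAS1, FAS2 and TGL5 are named in this tutorial, so the remaining gene symbols are assumed for illustration.

```python
# Illustrative gene list: FAA4 plus ten interactors.
# Only FAA4, FAT1, FAS1, FAS2 and TGL5 appear in this tutorial's text;
# the other symbols are assumptions made for the example.
gene_list = ["FAA4", "POX1", "FAT1", "FAS2", "FAS1", "FAA1",
             "OLE1", "YJU3", "TGL3", "INA1", "TGL5"]


def label_genes(genes, n_first=5,
                types=("Type1", "Type2"),
                colors=("#F7828A", "#9CCCA4")):
    """Return (gene -> type, gene -> color) dicts.

    The first n_first genes get the first type/color, the rest the second.
    The color values are hypothetical placeholders.
    """
    gene_type = {}
    gene_color = {}
    for i, gene in enumerate(genes):
        k = 0 if i < n_first else 1
        gene_type[gene] = types[k]
        gene_color[gene] = colors[k]
    return gene_type, gene_color


gene_type_dict, gene_color_dict = label_genes(gene_list)
print(gene_type_dict["FAA4"], gene_type_dict["TGL5"])  # Type1 Type2
```

The split is by position in the list only, so reordering gene_list changes which genes end up in Type1.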


STRING interaction analysis

The network API method lets you retrieve the STRING interaction network for one or multiple proteins in various text formats. It returns the combined score and all the channel-specific scores for the set of proteins. You can also extend the network neighborhood by setting "add_nodes", which adds new interaction partners to your network in order of their confidence.
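To make the request concrete, here is a minimal sketch that only builds the URL for the STRING network method, without sending it. The function name build_network_url and the caller_identity value are hypothetical; the parameter names (identifiers, species, add_nodes, required_score) follow the public STRING API as I understand it, so check them against the current STRING API documentation before relying on them.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def build_network_url(identifiers, species, add_nodes=0,
                      required_score=None, output_format="tsv",
                      caller_identity="ppi_tutorial_example"):
    """Build (but do not send) a STRING 'network' request URL.

    identifiers    : list of protein or gene names
    species        : NCBI taxonomy id, e.g. 4932 for S. cerevisiae
    add_nodes      : number of extra interaction partners to add
    required_score : optional minimum combined score on a 0-1000 scale
    """
    params = {
        # STRING expects multiple identifiers separated by a carriage return
        "identifiers": "\r".join(identifiers),
        "species": species,
        "add_nodes": add_nodes,
        "caller_identity": caller_identity,  # hypothetical identifier
    }
    if required_score is not None:
        params["required_score"] = required_score
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


# FAA4 plus its ten most confident interactors in yeast
url = build_network_url(["FAA4"], species=4932, add_nodes=10)
print(url)
```

With network access, the resulting tab-separated response could be loaded into a table, for example with pandas.read_csv(url, sep="\t").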

   stringId_A    stringId_B    preferredName_A  preferredName_B  ncbiTaxonId  score  nscore  fscore  pscore  ascore  escore  dscore  tscore
0  4932.YBR041W  4932.YKL182W  FAT1             FAS1             4932         0.69   0       0       0       0       0       0       0.69
2  4932.YBR041W  4932.YPL231W  FAT1             FAS2             4932         0.692  0       0       0       0       0       0       0.692
4  4932.YBR041W  4932.YOR081C  FAT1             TGL5             4932         0.7    0       0       0       0       0       0       0.7
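The sketch below shows one way to read such a table: which evidence channel drives each edge, and which edges pass a confidence cutoff. The rows are copied from the table above. The helper names are hypothetical, and the channel labels and the 0.7 "high confidence" cutoff reflect my understanding of STRING's conventions rather than anything stated in this tutorial.

```python
# Assumed meaning of the channel-specific score columns
CHANNELS = {
    "nscore": "neighborhood",
    "fscore": "gene fusion",
    "pscore": "phylogenetic profile",
    "ascore": "coexpression",
    "escore": "experimental",
    "dscore": "database",
    "tscore": "text mining",
}

# Rows copied from the table above; channels not listed are 0
rows = [
    {"preferredName_A": "FAT1", "preferredName_B": "FAS1", "score": 0.69, "tscore": 0.69},
    {"preferredName_A": "FAT1", "preferredName_B": "FAS2", "score": 0.692, "tscore": 0.692},
    {"preferredName_A": "FAT1", "preferredName_B": "TGL5", "score": 0.7, "tscore": 0.7},
]


def dominant_channel(row):
    """Name of the evidence channel with the highest score in this row."""
    best = max(CHANNELS, key=lambda col: row.get(col, 0.0))
    return CHANNELS[best]


def filter_edges(rows, min_score=0.7):
    """Keep (A, B) pairs whose combined score is at least min_score."""
    return [(r["preferredName_A"], r["preferredName_B"])
            for r in rows if r["score"] >= min_score]


high_conf = filter_edges(rows, min_score=0.7)
print(high_conf)
print(dominant_channel(rows[0]))
```

In these three rows only tscore is non-zero, which is why the combined score equals the text-mining score for each pair.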

STRING PPI network

We can also use ov.bulk.pyPPI to get the PPI network of gene_list. First, we initialize it.


Then we connect to STRING-db to retrieve the protein-protein interactions.


We provide a very simple function to plot the network; see ov.utils.plot_network for the available parameters.
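The three steps of this section (initialize, query STRING-db, plot) might be chained as in the sketch below. Treat it as an assumption-laden outline: only ov.bulk.pyPPI and ov.utils.plot_network are named in this tutorial, while the keyword arguments (gene, gene_type_dict, gene_color_dict, species) and the method names interaction_analysis and plot_network are recalled from the omicverse documentation and may differ between versions. The wrapper run_ppi and the single color are hypothetical.

```python
try:
    import omicverse as ov  # only needed when the workflow is actually run
except ImportError:
    ov = None

# Genes named in this tutorial; types and color are placeholders
genes = ["FAA4", "FAT1", "FAS1", "FAS2", "TGL5"]
ppi_kwargs = {
    "gene": genes,
    "gene_type_dict": {g: "Type1" for g in genes},
    "gene_color_dict": {g: "#F7828A" for g in genes},
    "species": 4932,  # NCBI taxonomy id of Saccharomyces cerevisiae
}


def run_ppi(kwargs):
    """Assumed workflow: init pyPPI, query STRING-db, plot the network.

    Requires omicverse and network access; the method names are assumptions.
    """
    if ov is None:
        raise RuntimeError("omicverse is not installed")
    ppi = ov.bulk.pyPPI(**kwargs)   # step 1: initialize
    ppi.interaction_analysis()      # step 2: fetch interactions (assumed name)
    return ppi.plot_network()       # step 3: draw the network (assumed name)


# Uncomment to run with omicverse installed and internet access:
# run_ppi(ppi_kwargs)
```

Keeping the arguments in a dict makes it easy to inspect or change the species and labels before any request is sent.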


