Tutorial:Bulk RNA-seq: Protein-Protein interaction (PPI) analysis by String-db
0
1
Entering edit mode
3 months ago
Julia Ma ▴ 120

Content taken verbatim from: https://omicverse.readthedocs.io/en/latest/Tutorials-bulk/t_network/

STRING is a database of known and predicted protein-protein interactions. The interactions include direct (physical) and indirect (functional) associations; they stem from computational prediction, from knowledge transfer between organisms, and from interactions aggregated from other (primary) databases.

Here we produce a tutorial that use python to construct protein-protein interaction network

Colab_Reproducibility: https://colab.research.google.com/drive/1ReLCFA5cNNcem_WaMXYN9da7W0GN4gzl?usp=sharing

import omicverse as ov
ov.utils.ov_plot_set()

Prepare data

Here we use the example data of string-db to perform the analysis

FAA4 and its ten most confident interactors.

FAA4 in yeast is a long chain fatty acyl-CoA synthetase; see it connected to other synthetases as well as regulators.

Saccharomyces cerevisiae

NCBI taxonomy Id: 4932

Other names: ATCC 18824, Candida robusta, NRRL Y-12632, S. cerevisiae, Saccharomyces capensis, Saccharomyces italicus, Saccharomyces oviformis, Saccharomyces uvarum var. melibiosus, lager beer yeast, yeast

gene_list=['FAA4','POX1','FAT1','FAS2','FAS1','FAA1','OLE1','YJU3','TGL3','INA1','TGL5']

Besides, we also need to set the gene's type and color. Here, we randomly set the top 5 genes named Type1, other named Type2

gene_type_dict=dict(zip(gene_list,['Type1']*5+['Type2']*6))
gene_color_dict=dict(zip(gene_list,['#F7828A']*5+['#9CCCA4']*6))

STRING interaction analysis

The network API method also allows you to retrieve your STRING interaction network for one or multiple proteins in various text formats. It will tell you the combined score and all the channel specific scores for the set of proteins. You can also extend the network neighborhood by setting "add_nodes", which will add, to your network, new interaction partners in order of their confidence.

G_res=ov.bulk.string_interaction(gene_list,4932)
G_res.head()
stringId_A stringId_B preferredName_A preferredName_B ncbiTaxonId score nscore fscore pscore ascore escore dscore tscore
0 4932.YBR041W 4932.YKL182W FAT1 FAS1 4932 0.69 0 0 0 0 0 0 0.69
1 4932.YBR041W 4932.YKL182W FAT1 FAS1 4932 0.69 0 0 0 0 0 0 0.69
2 4932.YBR041W 4932.YPL231W FAT1 FAS2 4932 0.692 0 0 0 0 0 0 0.692
3 4932.YBR041W 4932.YPL231W FAT1 FAS2 4932 0.692 0 0 0 0 0 0 0.692
4 4932.YBR041W 4932.YOR081C FAT1 TGL5 4932 0.7 0 0 0 0 0 0 0.7

STRING PPI network

We also can use ov.bulk.pyPPI to get the PPI network of gene_list, we init it at first

ppi=ov.bulk.pyPPI(gene=gene_list,
                      gene_type_dict=gene_type_dict,
                      gene_color_dict=gene_color_dict,
                      species=4932)

<networkx.classes.graph.Graph at 0x1056abb80>

Then we connect to string-db to calculate the protein-protein interaction

ppi.interaction_analysis()

<networkx.classes.graph.Graph at 0x17aa86fd0>

We provided a very simple function to plot the network, you can refer the ov.utils.plot_network to find out the parameter

ppi.plot_network()

(<Figure size 320x320 with 1 Axes>, <AxesSubplot: >)

enter image description here

STRING ppi • 311 views
ADD COMMENT
0
Entering edit mode
  1. Stop using bioinformatics as a tag.
  2. Are you the same person as this user: Starlitnightly ?
ADD REPLY

Login before adding your answer.

Traffic: 1861 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6