Classify, organize, and cluster a list of proteins / discovering relations between a set of proteins and linking them to disease
3.4 years ago
I have a list of proteins (UniProt IDs only) and I am interested in discovering relations between them.

Most importantly I want to focus on disease pathways and find out whether the proteins I am looking at are part of particular disease(s). Additionally, functional relations, pathway analysis, simple category/class of protein, etc. can be helpful too. But disease pathways are the most important analysis for me.

I have two groups of patients and the proteins of interest are absent in one and present in the other. As an example:

PROTEIN     GROUP1  GROUP2
Q64511      1       0
Q01320      1       1
Q8CIZ8      1       0
Q60865      0       1
Q8VDP4      0       1
P80318      1       0
Q3TXS7      0       1

0 = protein of interest is absent in the group
1 = protein is expressed
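Before querying any database, it may help to split the table into the proteins seen only in group 1, only in group 2, and in both. The following is a minimal Python sketch, assuming the table is available as whitespace-separated text exactly like the example above; the function name `split_by_group` and the variable names are hypothetical.

```python
def split_by_group(table_text):
    """Parse a whitespace-separated presence/absence table into three sets."""
    only_group1, only_group2, shared = set(), set(), set()
    lines = [ln for ln in table_text.strip().splitlines() if ln.strip()]
    for line in lines[1:]:  # skip the header row
        protein, g1, g2 = line.split()
        present = (g1 == "1", g2 == "1")
        if present == (True, False):
            only_group1.add(protein)
        elif present == (False, True):
            only_group2.add(protein)
        elif present == (True, True):
            shared.add(protein)
    return only_group1, only_group2, shared


# The example table from the question, copied verbatim.
TABLE = """
PROTEIN     GROUP1  GROUP2
Q64511      1       0
Q01320      1       1
Q8CIZ8      1       0
Q60865      0       1
Q8VDP4      0       1
P80318      1       0
Q3TXS7      0       1
"""

only_g1, only_g2, shared = split_by_group(TABLE)
print("Group 1 only:", sorted(only_g1))
print("Group 2 only:", sorted(only_g2))
print("Both groups: ", sorted(shared))
```

Each of the two group-specific sets can then be submitted separately to whichever resource is used for the pathway or disease lookup.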


Are there any R packages or databases that will let me do this comprehensively?

protein cluster
their names only

What kind of names? Since there is no standard nomenclature, this would become a Herculean task unless you have some specific constraints, e.g. standard identifiers of a certain kind (such as UniProt) or a specific organism.

Fixed now. It was a mistake.

Give us a few examples of UniProt IDs, please.

"P06909", "Q3UV17", "P07724", "Q61147", "Q6ZPJ3", "P20918", "Q7TT37", "Q62469"

The simplest approach would be to put the IDs into the STRING database, provided they all belong to the same species.

@OP, try Reactome as well. The proteins seem to be from Mus musculus.
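For a longer list, the STRING lookup could also be scripted instead of pasted into the web form. The sketch below only builds a request URL and does not send it. It assumes the STRING REST `network` endpoint with the `identifiers`, `species` and `caller_identity` parameters, and it assumes STRING can map UniProt accessions directly, so check both against the current STRING API documentation. The taxon ID 10090 is Mus musculus; the function name and the `caller_identity` value are hypothetical.

```python
from urllib.parse import urlencode

# Base address of the STRING REST API (assumed; verify against the STRING docs).
STRING_API = "https://string-db.org/api"


def string_network_url(uniprot_ids, species=10090, output_format="tsv"):
    """Build a STRING 'network' request URL for a list of protein identifiers."""
    params = {
        # STRING expects identifiers separated by a carriage return (%0D).
        "identifiers": "\r".join(uniprot_ids),
        "species": species,                   # NCBI taxon ID, 10090 = Mus musculus
        "caller_identity": "example_script",  # hypothetical label for this script
    }
    return f"{STRING_API}/{output_format}/network?{urlencode(params)}"


ids = ["P06909", "Q3UV17", "P07724", "Q61147",
       "Q6ZPJ3", "P20918", "Q7TT37", "Q62469"]
url = string_network_url(ids)
print(url)
```

The printed URL can be opened in a browser or fetched with any HTTP client; running it once per group gives two networks to compare.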

This is the critical requirement. So what you need is a differential analysis of the functional consequences between the two networks.

I have two lists for two groups, and I want to know if the expression of these proteins in one group and absence in the other indicates any biological phenomenon (mainly disease-related, but as mentioned any other 'classification' can be helpful too).

3.4 years ago

Use the UniProt SPARQL endpoint to look for relationships between your proteins. I played with it for a few minutes, but I could not find anything with the example provided (the result set is empty). https://sparql.uniprot.org/sparql/

PREFIX up: <http://purl.uniprot.org/core/>
PREFIX taxon: <http://purl.uniprot.org/taxonomy/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# List every pair of proteins from the example set that point to the same annotation node.
SELECT ?p1 ?p2 ?annotation ?text
WHERE
{
  ?p1 a up:Protein .
  ?p2 a up:Protein .
  # Both proteins must reference the very same ?annotation resource.
  # Annotation nodes may be specific to a single entry, which could explain the empty result.
  ?p1 up:annotation ?annotation .
  ?p2 up:annotation ?annotation .
  ?annotation rdfs:comment ?text .

  # Restrict both sides of the pair to the example proteins.
  FILTER (?p1 IN (<http://purl.uniprot.org/uniprot/P06909>, <http://purl.uniprot.org/uniprot/Q3UV17>, <http://purl.uniprot.org/uniprot/P07724>, <http://purl.uniprot.org/uniprot/Q61147>, <http://purl.uniprot.org/uniprot/Q6ZPJ3>, <http://purl.uniprot.org/uniprot/P20918>, <http://purl.uniprot.org/uniprot/Q7TT37>, <http://purl.uniprot.org/uniprot/Q62469>))
  FILTER (?p2 IN (<http://purl.uniprot.org/uniprot/P06909>, <http://purl.uniprot.org/uniprot/Q3UV17>, <http://purl.uniprot.org/uniprot/P07724>, <http://purl.uniprot.org/uniprot/Q61147>, <http://purl.uniprot.org/uniprot/Q6ZPJ3>, <http://purl.uniprot.org/uniprot/P20918>, <http://purl.uniprot.org/uniprot/Q7TT37>, <http://purl.uniprot.org/uniprot/Q62469>))
  # Report each unordered pair once and skip self-pairs.
  FILTER (STR(?p1) < STR(?p2))
}
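Since the two FILTER lists have to be kept in sync by hand, the query text could be generated from a plain list of accessions. This is a minimal Python sketch, assuming the same query shape as above; the function name `build_pair_query` is hypothetical. It only produces the query string and does not change the logic, so it would presumably return the same (empty) result for the example set.

```python
UNIPROT_PREFIX = "http://purl.uniprot.org/uniprot/"


def build_pair_query(accessions):
    """Return the pairwise shared-annotation query for the given accessions."""
    # Turn each accession into a full UniProt IRI, e.g. <http://purl.uniprot.org/uniprot/P06909>.
    iris = ", ".join(f"<{UNIPROT_PREFIX}{acc}>" for acc in accessions)
    lines = [
        "PREFIX up: <http://purl.uniprot.org/core/>",
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>",
        "SELECT ?p1 ?p2 ?annotation ?text",
        "WHERE",
        "{",
        "  ?p1 a up:Protein .",
        "  ?p2 a up:Protein .",
        "  ?p1 up:annotation ?annotation .",
        "  ?p2 up:annotation ?annotation .",
        "  ?annotation rdfs:comment ?text .",
        f"  FILTER (?p1 IN ({iris}))",
        f"  FILTER (?p2 IN ({iris}))",
        "  FILTER (STR(?p1) < STR(?p2))",
        "}",
    ]
    return "\n".join(lines)


query = build_pair_query(["P06909", "Q3UV17", "P07724", "Q61147",
                          "Q6ZPJ3", "P20918", "Q7TT37", "Q62469"])
print(query)
```

The printed text can be pasted into the query box at https://sparql.uniprot.org/sparql/.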

