What information can I obtain from a list of protein IDs?
2
0
5.3 years ago
Mo ▴ 920

Hello,

Let's say I have a large list of protein IDs, as follows:

sp|A3KN83|SBNO1_HUMAN
sp|A6NHQ2|FBLL1_HUMAN
sp|E9PRG8|CK098_HUMAN
sp|O00115|DNS2A_HUMAN
sp|O00151|PDLI1_HUMAN
sp|O00170|AIP_HUMAN
sp|O00178|GTPB1_HUMAN
sp|O00267|SPT5H_HUMAN
sp|O00303|EIF3F_HUMAN
sp|O00418|EF2K_HUMAN
sp|O00571|DDX3X_HUMAN;sp|O15523|DDX3Y_HUMAN;sp|Q9NQI0|DDX4_HUMAN
sp|O00629|IMA3_HUMAN
sp|O14737|PDCD5_HUMAN
sp|O14744|ANM5_HUMAN
sp|O14802|RPC1_HUMAN
sp|O14929|HAT1_HUMAN
sp|O14979|HNRDL_HUMAN
sp|O15020|SPTN2_HUMAN
sp|O15160|RPAC1_HUMAN
sp|O15213|WDR46_HUMAN
sp|O15235|RT12_HUMAN
sp|O15371|EIF3D_HUMAN
sp|O15372|EIF3H_HUMAN
sp|O43143|DHX15_HUMAN;sp|Q14562|DHX8_HUMAN
sp|O43172|PRP4_HUMAN
sp|O43390|HNRPR_HUMAN
sp|O43395|PRPF3_HUMAN
sp|O43491|E41L2_HUMAN;sp|Q9Y2J2|E41L3_HUMAN
sp|O43768|ENSA_HUMAN
sp|O43776|SYNC_HUMAN
sp|O43865|SAHH2_HUMAN;sp|Q96HN2|SAHH3_HUMAN

What information can I obtain from them, and how? Could you please give me your opinion?

Thanks

Network protein-protein interaction • 1.4k views
2

Your question is a bit vague. It really helps to tell us what you want to do, even if you don't know how to do it. What is the biological question you want to answer? Any background on the experiment or previous analysis is also helpful.

0

All of these proteins are human. I would like to know, for example, what they have in common, whether they are already known, or whether I can find any information about them, either separately or in combination (interactions). Is there a way to make a map of shared peptides, etc.?

5
5.3 years ago

A one-liner using http://string-db.org:

# xargs -I ID substitutes the text read from stdin for the literal string ID in the URL,
# so the placeholder is written as ID, not $ID (the $ would be sent to the server as-is).
tr ";" "\n" < your-list.txt |\
cut -d '|' -f 3 |\
sort | uniq | tr "\n" "#" |\
sed 's/#/%0D/g' |\
xargs -I ID curl 'http://string-db.org/api/tsv-no-header/resolveList?identifiers=ID' |\
cut -f 2 |\
tr "\n" "#" | sed 's/#/%0D/g' |\
xargs -I ID curl -o output.png 'http://string-db.org/api/image/networkList?identifiers=ID'
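
The first half of that pipeline only cleans up the ID list, and it can be useful to run that part on its own to check what will be sent. A minimal sketch, assuming every entry has the form sp|ACCESSION|ENTRY_NAME with ";" between grouped proteins, as in the question; the function name extract_uniprot_ids is made up for this example:

```shell
# extract_uniprot_ids FILE [FIELD]
# Prints one unique identifier per line from entries such as
#   sp|O43143|DHX15_HUMAN;sp|Q14562|DHX8_HUMAN
# FIELD selects the '|'-separated field: 2 = accession (O43143),
# 3 = entry name (DHX15_HUMAN, the default, as in the one-liner).
extract_uniprot_ids() {
  tr ';' '\n' < "$1" |
    grep -v '^[[:space:]]*$' |
    cut -d '|' -f "${2:-3}" |
    sort -u
}
```

For example, extract_uniprot_ids your-list.txt 2 | wc -l counts the distinct accessions once the grouped entries have been split.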


0

@Pierre Lindenbaum Nice, thanks !!!

0

@Pierre Lindenbaum I tried to run it in a Mac terminal. The image is generated, but I cannot open it (I tried different output formats, with the same result). Do you have any idea what the problem could be? I think your syntax is for Linux? I would appreciate any comment.

0

What is the output of:

file output.png

0

I removed this to reduce the spam and make the question clearer.

1

No, this is the 'file' command: http://linux.die.net/man/1/file

0

Oops! It gave me this:

file output.png
output.png: HTML document text

0

What is the content of the HTML?

0

I tried to open it with different text editors and image editors, but I cannot open it to see what is in there. You can download the output here: http://wikisend.com/download/952744/output.png

0

mv output.png output.html  and try to open in a browser.
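
The same check can be done without relying on the file extension. A minimal sketch, assuming a POSIX shell with head, od and tr available (they are on both macOS and Linux); the function name is_png is made up for this example. It compares the first 8 bytes of the file with the fixed PNG signature:

```shell
# is_png FILE
# Succeeds (exit status 0) only if FILE starts with the 8-byte PNG
# signature 89 50 4E 47 0D 0A 1A 0A; an HTML error page will fail.
is_png() {
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  [ "$sig" = "89504e470d0a1a0a" ]
}

# Possible use after a download:
#   is_png output.png || echo "output.png is not a PNG image" >&2
# Adding --fail to curl also makes it exit with an error on an HTTP
# error status instead of saving the server's error page.
```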

0

------- and with a text editor ------

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
</body></html>

0

My bad, a copy+paste error: that should be '/image' and not '/imag/'. I've fixed my code.

0

Now I get the output as an image. Thanks, it works now! One more question: was your image generated using the entire list I pasted above, or did you select a few of them? I ask because I don't get the same result as you showed above.

0

I don't remember, I think that was the whole list.

0

You might also consider using the web portal associated with String:

http://string-db.org/

0

@Pierre Lindenbaum How can I save the result as .SIF in order to load it into Cytoscape? I would appreciate your comment.
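
One possible route, not from the original answer: if the interactions can be obtained as a tab-separated table, turning it into SIF is a short awk step, since a SIF line is just "nodeA, interaction type, nodeB". A minimal sketch, assuming the table has a header line and that columns 3 and 4 hold the two protein names (check the header of your own file and adjust); the function name tsv_to_sif and the relation label "pp" are arbitrary choices for this example:

```shell
# tsv_to_sif FILE [COL_A] [COL_B] [RELATION]
# Reads a tab-separated interaction table with one header line and
# prints SIF lines: nodeA <TAB> relation <TAB> nodeB.
# COL_A and COL_B are the columns holding the two node names
# (defaults 3 and 4 are an assumption about the input layout).
tsv_to_sif() {
  awk -F '\t' -v a="${2:-3}" -v b="${3:-4}" -v rel="${4:-pp}" '
    NR > 1 && $a != "" && $b != "" { print $a "\t" rel "\t" $b }
  ' "$1"
}

# Possible use:
#   tsv_to_sif interactions.tsv > network.sif
```

The resulting network.sif can then be loaded through Cytoscape's network import.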

1
5.3 years ago

Well, there are many places to get information about proteins. To start, you might check:

https://en.wikipedia.org/wiki/List_of_biological_databases#Protein_sequence_databases

And for protein-protein interactions:

https://en.wikipedia.org/wiki/List_of_biological_databases#Protein-protein_and_other_molecular_interactions

The identifiers you have listed will generally be included in most of the databases on these lists. I guess it will be up to you to decide which is most useful to you based on what your needs are.

0

@Sean Davis Thanks, although it is a very general answer.

1

Kind of a general question, too! Pierre read your mind, though, so hopefully you have what you need.