11 months ago by
University of Oxford
Okay, I'll split my answer in two, as you are asking two things. (And I am assuming that by "all protein" you mean all the proteins you care about, not, say, the whole human proteome...)
To find homologues of a protein, BLAST is the best tool. It is usually run through a web interface, but there is also a command-line version that you can run locally, assuming you download all the protein sequences (a lot), and a RESTful API, which has many wrappers, including one in Biopython. Proteins diverge most on the surface, so similarity above 70% is still okay for your hits, but lower might be iffy. A hit with less than 20% similarity over a short match is likely junk. Proteins are divided into families; Pfam IDs are a great resource and appear in UniProt entries, UniProt being one of the best general-use databases for proteins.
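As a rough illustration of the rule of thumb above, here is a minimal sketch that triages hits from BLAST tabular output (the 12-column format produced by `-outfmt 6` in the command-line version). It assumes that percent identity (the `pident` column) is the similarity measure meant here. The 50-residue cutoff for a "short match", the function name `classify_hits` and the example hit names are all my own placeholders, not anything official.

```python
import csv
import io

# Standard column order of BLAST tabular output (-outfmt 6).
COLUMNS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
           "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

# Assumed cutoffs: 70 and 20 come from the rule of thumb in the text;
# MIN_LENGTH is a hypothetical value for what counts as a "short match".
GOOD_IDENTITY = 70.0
JUNK_IDENTITY = 20.0
MIN_LENGTH = 50


def classify_hits(tabular_text):
    """Label each hit as 'good', 'iffy' or 'junk' by % identity and length."""
    labelled = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        hit = dict(zip(COLUMNS, row))
        pident = float(hit["pident"])
        length = int(hit["length"])
        if pident < JUNK_IDENTITY and length < MIN_LENGTH:
            label = "junk"
        elif pident >= GOOD_IDENTITY:
            label = "good"
        else:
            label = "iffy"
        labelled.append((hit["sseqid"], pident, length, label))
    return labelled


# Made-up hits, for illustration only.
example = (
    "query1\thit_A\t85.2\t310\t46\t0\t1\t310\t1\t310\t1e-150\t520\n"
    "query1\thit_B\t45.0\t290\t150\t5\t10\t300\t12\t301\t2e-40\t160\n"
    "query1\thit_C\t18.5\t30\t24\t1\t5\t34\t100\t129\t4.2\t25\n"
)
for hit in classify_hits(example):
    print(hit)
```

In practice you would also look at the E-value and the query coverage before trusting a hit; the labels here only encode the identity and length heuristic.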
If neither ChEMBL nor PubChem is working for you, you might need to get creative.
When you say you did a virtual screen, did you dock a library and potentially cross-validate the scores? Or did you instead split up your molecules by functional group and properties (QSAR kind of thing) and use that data to train a model to predict the bioactivity outcome where available? In that case, for your target protein, where you don't have bioactivity data, you could do docking to see if the docking scores correlate with the model's predictions. To do that, you might need to make a structural model of your protein if a 3D structure is not available in the PDB. You might find a threaded model in Expasy SWISS-MODEL, but if not, a model generated with I-TASSER or Phyre (threaded/ab initio) will be no good for docking.
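To check whether docking and a trained model agree, a rank correlation is a reasonable first look, since docking scores and predicted activities are on different scales. Below is a minimal sketch of a Spearman correlation in plain Python; the function names and all the numbers are made up for illustration, and in real use you would probably reach for a statistics library instead.

```python
def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined when all values are tied")
    return cov / (var_x * var_y) ** 0.5


# Made-up numbers, for illustration only.
docking_scores = [-9.1, -8.4, -7.9, -6.5, -6.0]   # lower = better pose
predicted_pic50 = [7.8, 7.1, 7.3, 5.9, 5.2]       # higher = more active

rho = spearman(docking_scores, predicted_pic50)
print(f"Spearman rho = {rho:.2f}")
```

Because a lower docking score means a better pose while a higher predicted activity means a better compound, agreement between the two shows up as a strongly negative rho, not a positive one.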
Alternatively, and totally a long shot: if your protein is an enzyme and you just need a few targets, BRENDA is a good resource to find the parameters of the native metabolites and promiscuous activities, which can be used as positive controls for any virtual screen, although metabolites aren't very drug-like (sensu Lipinski's rule of five). That is, whereas UniProt, MetaCyc and other sites just give the physiological substrate, BRENDA will tell you about similar compounds that still bind, but badly.
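For reference, the rule of five is simple enough to check by hand once you have the four descriptors. Here is a minimal sketch that assumes the descriptors are already computed (by a cheminformatics toolkit or looked up in a database); the class and function names are hypothetical, and the two example compounds use placeholder values rather than measured ones.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties; how you obtain them is up to you."""
    mol_weight: float       # molecular weight in Da
    logp: float             # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are broken."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def is_drug_like(d, max_violations=1):
    """Common reading of the rule: at most one violation is tolerated."""
    return lipinski_violations(d) <= max_violations


# Placeholder values for illustration, not measured data.
polar_metabolite = Descriptors(mol_weight=507.0, logp=-5.0,
                               h_bond_donors=7, h_bond_acceptors=18)
small_ligand = Descriptors(mol_weight=320.0, logp=2.5,
                           h_bond_donors=2, h_bond_acceptors=5)

for name, d in [("polar metabolite", polar_metabolite),
                ("small ligand", small_ligand)]:
    print(name, lipinski_violations(d), is_drug_like(d))
```

A large, highly polar metabolite (a nucleotide, say) tends to fail on weight, donors and acceptors all at once, which is why such compounds make fine positive controls but poor leads.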