As a computer science PhD student, I have developed a fast protein structure search called RUPEE. It searches the 309,000 domain structures in CATH v4.1. It is notable for being extremely fast compared to existing protein structure searches while maintaining a good level of precision. In most fast structure searches I have seen, the quality of the results deteriorates fairly quickly further down the ranked list. For this reason, RUPEE may present a viable approach to handling large numbers of structures in the future, potentially into the millions. Another notable aspect of RUPEE is that it uses no sequence information or pre-calculated results at all. It is based on a few simple ideas and still manages to hold up well. For this reason, it may be suitable for incorporation into larger and more complex systems that leverage additional information.
RUPEE is available at:
Code is available at:
https://github.com/rayoub/rupee
The corresponding conference paper can be found at:
http://ieeexplore.ieee.org/document/8217627/
You can also contact me for further details.
I'm extremely interested in feedback. As a computer science student, I could use more insight into how practitioners use bioinformatics tools so that I can provide something truly useful. I have high regard for bioinformatics and the important work being done by people in this area, which is why I chose it as my domain of application. Still, there is much for me to learn.
Keep in mind, this project receives no external funding and it is running on a single server in the AWS cloud. So be kind when clicking.