PDB file pairwise alignment software choice
2.4 years ago
Anand Rao ▴ 630

I need to pairwise align protein structures.

Specifically, my task requires

  • alignment of structure predictions in PDB format, for EACH of ~1000 full length proteins

versus

  • X-ray crystallography-based structure PDB files covering only the Pfam/hmmsearch-delimited domain regions of EACH of 6 (shorter, domain-only) sequences.

This means ~1,000 × 6 ≈ 6,000 pairwise structure alignment runs.

Which software do you recommend, from Wiki or elsewhere, given the following requirements:

  1. I should be able to download and run on my local laptop or university HPC, and not on a webserver (unless batch submission and download are allowed).
  2. The runtime, disk space and RAM needed for each of my 6,000 pairwise alignments need to be reasonable for either my local laptop or my university's HPC (with ~2 GB RAM nodes and a 12-CPU limit on my user account).
  3. I need to be able to parse the alignment results programmatically (using Perl or Python scripting...)

FINAL GOAL = Classify and/or rank each of the full-length proteins based on the likelihood of the domain of interest being "present", not using pairwise sequence alignments, but from reported metrics of pairwise structure alignments (rather than my arbitrary yes/no/maybe classification).

If your suggested bioinformatics pipeline has been published, better yet!

Thank you all, in advance. Cheers!

structure pairwise parsing alignment classification • 2.3k views

I need to do something similar. What software did you end up choosing?

2.4 years ago
Mensur Dlakic ★ 27k

These links will hopefully be useful as references:

I have tried TM-align, Fr-TM-align, Matt, FATCAT, MAMMOTH, SSAP, FAST, THESEUS, and probably a couple of others that I have forgotten, all run locally. All of them used to be available as standalone binaries, though some may no longer be, owing to age or lack of support. All are reasonably fast, so speed should not be a limiting factor whichever you choose. Each has its own advantages and disadvantages, and their outputs differ in the information they report. I suggest you try several programs and see which functionality is most useful for your purpose.
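As a rough illustration of the "run locally and parse programmatically" requirement, here is a minimal Python sketch built around TM-align. It assumes a `TMalign` binary is on the PATH and that its stdout contains the usual "Aligned length= ..., RMSD= ..." and "TM-score= ... (if normalized by length of Chain_N" summary lines; the exact layout can differ between versions, so check it against your own output. The function names and the `SAMPLE` text are made up for illustration and are not real results.

```python
import itertools
import re
import subprocess

# Regexes for the summary lines TM-align prints to stdout
# (assumed layout; verify against the version you install).
_ALIGNED = re.compile(
    r"Aligned length=\s*(\d+),\s*RMSD=\s*([\d.]+),"
    r"\s*Seq_ID=n_identical/n_aligned=\s*([\d.]+)"
)
_TMSCORE = re.compile(
    r"TM-score=\s*([\d.]+)\s*\(if normalized by length of Chain_(\d)"
)


def parse_tmalign(stdout):
    """Extract aligned length, RMSD, sequence identity and TM-scores."""
    match = _ALIGNED.search(stdout)
    if match is None:
        raise ValueError("no alignment summary found in TM-align output")
    result = {
        "aligned_length": int(match.group(1)),
        "rmsd": float(match.group(2)),
        "seq_id": float(match.group(3)),
    }
    # One TM-score per chain: each is normalized by that chain's length.
    for score, chain in _TMSCORE.findall(stdout):
        result[f"tm_chain{chain}"] = float(score)
    return result


def run_tmalign(model_pdb, domain_pdb, binary="TMalign"):
    """Align one predicted model against one domain structure."""
    proc = subprocess.run(
        [binary, model_pdb, domain_pdb],
        capture_output=True, text=True, check=True,
    )
    return parse_tmalign(proc.stdout)


def all_pairs(model_pdbs, domain_pdbs):
    """Every (model, domain) combination, e.g. ~1000 x 6 jobs."""
    return list(itertools.product(model_pdbs, domain_pdbs))


# Illustrative text only, in the shape of a TM-align summary.
SAMPLE = """\
Aligned length=  118, RMSD=   2.41, Seq_ID=n_identical/n_aligned= 0.186
TM-score= 0.31522 (if normalized by length of Chain_1, i.e., LN=350, d0=6.63)
TM-score= 0.80214 (if normalized by length of Chain_2, i.e., LN=130, d0=4.30)
"""
```

With the domain structure passed as the second argument, `tm_chain2` is the score normalized by the domain length, which is probably the more relevant number when asking whether a short domain is present inside a longer protein. Each pair is an independent job, so the list from `all_pairs` can be split across HPC cores in whatever way the scheduler allows.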


Thanks for these links. I will look into running the local versions of some of the tools you've listed.

I vaguely remember reading a paper or blog saying RMSD is OK, but perhaps not the best metric any more... So, are there any thoughts from the structural biology community in terms of new types of metrics used to report structural similarity?

And finally, just because I can run a program to structurally align 2 PDBs does not mean that the run is meaningful. In other words, are there cases where even trying to align 2 PDBs would be considered scientifically absurd, let alone reporting the result or using it for ranking / classification? Your thoughts? TIA! :)


I think RMSD is fine when it is below 2 angstroms, as a value that low is a fairly reliable indicator of structural similarity. But there are certainly cases where RMSD is artificially large because of small subdomain movements, yet the proteins are related.

Below is an example of two enzymes that align over 84.6% of their residues, but with RMSD of 4.88 angstroms.

[image: structural alignment of the two enzymes]

If you go by RMSD, this would be a tough sell to convince others that they are related. However, when you actually isolate just the catalytic residues within the active site, it is more obvious that they are related.

[image: catalytic residues of the two active sites]

One has to look beyond the RMSD value or sequence identity in some cases. That means looking at fractions of proteins that are aligned, similarity of global folds, catalytic residues, etc.
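To make the point about inflated RMSD concrete, here is a toy calculation in Python. The distances are invented (they are not taken from the enzyme pair above): 85 aligned residues sit 1 angstrom apart and 15 residues in a shifted subdomain sit 12 angstroms apart. The sketch compares plain RMSD with a TM-score-style sum, using the standard length-dependent scale d0 = 1.24 * (L - 15)^(1/3) - 1.8. A real TM-score also searches for the superposition that maximizes the sum, which this sketch does not do, and the d0 formula is only sensible for chains well above 15 residues.

```python
import math


def rmsd(distances):
    """Root-mean-square deviation over aligned residue pairs."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score-style sum for a fixed superposition.

    Each aligned pair contributes 1 / (1 + (d / d0)^2), so distant
    pairs add almost nothing instead of dominating the result.
    """
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


# Hypothetical per-residue distances in angstroms.
core = [1.0] * 85      # well-superimposed core
moved = [12.0] * 15    # small subdomain that has shifted
distances = core + moved

if __name__ == "__main__":
    print(f"RMSD, core only : {rmsd(core):.2f}")
    print(f"RMSD, all pairs : {rmsd(distances):.2f}")
    print(f"TM-like score   : {tm_score(distances, len(distances)):.2f}")
```

In this toy case the 15 displaced residues push the overall RMSD well above the 2 angstrom comfort zone even though the core matches closely, while the TM-like score stays high. That is one reason to report aligned fraction or a length-normalized score alongside RMSD rather than RMSD alone.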

I don't think aligning any PDBs would be considered absurd. There aren't very many people in the world who can look at two protein structures and tell you instantly whether they are related or not. It never hurts to let the computer align them.


Understood, thank you very much!
