matching PDB and PFAM sequences for contact mapping
2.5 years ago
Evan • 0

I am trying to generate contact predictions from PFAM MSAs, but I need to reliably map a given protein family sequence (a specific sequence from the PFAM MSA) to its corresponding PDB sequence.

Take as an example PF00011:

The PFAM reference sequence is: ['D' 'W' 'K' 'E' 'T' 'P' 'E' 'A' 'H' 'V' 'F' 'K' 'A' 'D' 'L' 'P' 'G' 'V' 'K' 'K' 'E' 'E' 'V' 'K' 'V' 'E' 'V' 'E' 'D' 'G' 'N' 'v' 'L' 'V' 'V' 'S' 'G' 'E' 'R' 'T' 'k' 'e' 'K' 'E' 'D' 'K' 'N' 'D' 'K' 'W' 'H' 'R' 'V' 'E' 'R' 'S' 'S' 'G' 'K' 'F' 'V' 'R' 'R' 'F' 'R' 'L' 'L' 'E' 'D' 'A' 'K' 'V' 'E' 'E' 'V' 'K' 'A' 'G' 'L' 'E' 'N' 'G' 'V' 'L' 'T' 'V' 'T' 'V' 'P' 'K' 'A' 'E' 'V' 'K' 'K' 'P' 'E' 'V' 'K' 'A' 'I' 'Q' 'I' 'S']

... and loading the PDB sequence using the PFAM-provided PDB-id '2BYU' I get the following sequence: ['N', 'A', 'R', 'M', 'D', 'W', 'K', 'E', 'T', 'P', 'E', 'A', 'H', 'V', 'F', 'K', 'A', 'D', 'L', 'P', 'G', 'V', 'K', 'K', 'E', 'E', 'V', 'K', 'V', 'E', 'V', 'E', 'D', 'G', 'N', 'V', 'L', 'V', 'V', 'S', 'G', 'E', 'R', 'T', 'K', 'E', 'K', 'E', 'D', 'K', 'N', 'D', 'K', 'W', 'H', 'R', 'V', 'E', 'R', 'S', 'S', 'G', 'K', 'F', 'V', 'R', 'R', 'F', 'R', 'L', 'L', 'E', 'D', 'A', 'K', 'V', 'E', 'E', 'V', 'K', 'A', 'G', 'L', 'E', 'N', 'G', 'V', 'L', 'T', 'V', 'T', 'V', 'P', 'K', 'A', 'A', 'I', 'Q', 'I', 'S', 'G']

Both sequences are nearly identical, with the exception of the additional 'N', 'A', 'R', 'M' at the beginning of the PDB sequence. Is there some reference that allows us to extract the exactly matching sequence from the PDB database?
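
To make the mapping problem concrete, here is a minimal sketch of recovering the index correspondence by comparing the two sequences directly. It is an illustration under stated assumptions, not a definitive method: the function name `map_pfam_to_pdb` is hypothetical, the toy sequences are shortened stand-ins rather than the full PF00011/2BYU data, and Python's standard-library `difflib.SequenceMatcher` is used as a simple substitute for a proper pairwise alignment. It assumes lowercase letters in the PFAM sequence are residues that should be compared case-insensitively and that '-' and '.' are gap characters.

```python
from difflib import SequenceMatcher


def map_pfam_to_pdb(pfam_seq, pdb_seq):
    """Return {pfam_index: pdb_index} for residues inside matching blocks.

    Hypothetical helper: exact-match blocks only, no substitution scoring.
    """
    # Keep the original PFAM positions of non-gap residues, uppercased
    # (assumption: '-' and '.' mark gaps, lowercase marks real residues).
    kept = [(i, aa.upper()) for i, aa in enumerate(pfam_seq) if aa not in "-."]
    pfam_clean = "".join(aa for _, aa in kept)
    pdb_clean = "".join(pdb_seq).upper()

    # autojunk=False disables the popularity heuristic, which would
    # otherwise ignore frequent residues in sequences of 200+ characters.
    matcher = SequenceMatcher(None, pfam_clean, pdb_clean, autojunk=False)

    mapping = {}
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[kept[block.a + k][0]] = block.b + k
    return mapping


# Toy example: the PDB chain carries four extra N-terminal residues.
pfam_toy = list("DWKETvLk")
pdb_toy = list("NARMDWKETVLKG")
toy_mapping = map_pfam_to_pdb(pfam_toy, pdb_toy)
print(toy_mapping)
```

PFAM positions that do not appear in the returned dictionary have no exact counterpart in the PDB sequence, so any contacts involving them would have to be dropped or handled separately.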

Thanks in advance, Evan

PDB sequence prediction PFAM contact • 581 views

I don't know what MSA you plan to use - Pfam has several of them for each family - but they may not be diverse enough for reliable contact prediction.
