How to loop over multiple structures to get their polypeptide sequences using Biopython's Bio.PDB module
5.2 years ago
M.O.L.S ▴ 100
from Bio.PDB import PDBParser, PPBuilder

structureA = PDBParser().get_structure("2rheH", "2rheH")
PolypeptideBuilder = PPBuilder()
for pp in PolypeptideBuilder.build_peptides(structureA):
    print(pp.get_sequence())

>>> ESVLTQPPSASGTPGQRVTISCTGSATDIGSNSVIWYQQVPGKAPKLLIYYNDLLPSGVSDRFSASKSGTSASLAISGLESEDEADYYCAAWNDSLDEPGFGGGTKLTVLGQPK

structureB = PDBParser().get_structure("3ebxH", "3ebxH")
PolypeptideBuilder = PPBuilder()
for pp in PolypeptideBuilder.build_peptides(structureB):
    print(pp.get_sequence())
>>> RICFNHQSSQPQTTKTCSPGESSCYHKQWSDFRGTIIERGCGCPTVKPGIKLSCCESEVCNN

structureC = PDBParser().get_structure("1fxdH", "1fxdH")
PolypeptideBuilder = PPBuilder()
for pp in PolypeptideBuilder.build_peptides(structureC):
    print(pp.get_sequence())
>>> PIEVNDDCMACEACVEICPDVFEMNEEGDKAVVINPDSDLDCVEEAIDSCPAEAIVRS

I am using Biopython's Bio.PDB to get the protein sequences of PDB files.

This is the code that I have used in order to get three sequences.

Is there a way to loop this operation so that the sequences from all three files are printed in one go?

Bio Python Polypeptide Bio.PDB • 2.0k views

Do you actually need the peptide chain sequences, or do you want the full protein sequence?

The easy way to loop your IDs is just to throw the whole thing inside a loop of names:

for ID in ["2rheH", "3ebxH", ...]:  # "..." stands for the remaining file names
    structure = PDBParser().get_structure(ID, ID)
    PolypeptideBuilder = PPBuilder()
    for pp in PolypeptideBuilder.build_peptides(structure):
        print(pp.get_sequence())

You might want to check out this previous thread.
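
If you want to keep the sequences rather than just print them, the same loop can be wrapped in a small helper. This is only a sketch: the function name `sequences_by_file` and its optional `parser`/`builder` arguments are made up for illustration and are not part of Biopython. It assumes the files are ordinary PDB-format files that `PDBParser` can read, and it builds the parser and builder once instead of once per file.

```python
from pathlib import Path


def sequences_by_file(paths, parser=None, builder=None):
    """Return {file name: [sequence string, ...]} for each structure file.

    `parser` and `builder` are hypothetical hooks; by default they are
    Bio.PDB's PDBParser and PPBuilder.
    """
    if parser is None or builder is None:
        # Imported here so the defaults are only needed when actually used.
        from Bio.PDB import PDBParser, PPBuilder
        parser = parser or PDBParser(QUIET=True)  # QUIET hides parser warnings
        builder = builder or PPBuilder()

    result = {}
    for path in paths:
        name = Path(path).name
        structure = parser.get_structure(name, str(path))
        # One entry per polypeptide fragment found in the structure.
        result[name] = [
            str(pp.get_sequence()) for pp in builder.build_peptides(structure)
        ]
    return result
```

With the three files from the question in the working directory, `sequences_by_file(["2rheH", "3ebxH", "1fxdH"])` should give a dictionary you can loop over and print. A file can yield more than one fragment if the chain has gaps, which is why each value is a list.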
