Extract a FASTA sequence from a PDB file and output only the protein sequence
4.0 years ago

Is there any Python code to extract the FASTA sequence from a PDB entry for a given protein ID (e.g. 1mkp)?

alignment sequence assembly • 9.9k views
4.0 years ago
Mensur Dlakic ★ 27k
import sys
from Bio import SeqIO

PDBFile = sys.argv[1]
with open(PDBFile, 'r') as pdb_file:
    for record in SeqIO.parse(pdb_file, 'pdb-atom'):
        print('>' + record.id)
        print(record.seq)

Save as pdb-seq.py. Download PDB coordinates for 1mkp and type:

python pdb-seq.py 1mkp.pdb

>1MKP:A
ASFPVEILPFLYLGCAKDSTNLDVLEEFGIKYILNVTPNLPNLFENAGEFKYKQIPISDHWSQNLSQFFPEAISFIDEARGKN
CGVLVHSLAGISRSVTVTVAYLMQKLNLSMNDAYDIVKMKKSNISPNFNFMGQLLDFERTL
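
Since the question asks for a solution starting from a protein ID, the download step could also be scripted. The following is only a sketch: the helper names `rcsb_pdb_url` and `download_pdb` are made up for illustration, and the URL pattern assumes the RCSB file server at files.rcsb.org, which may change over time.

```python
import urllib.request


def rcsb_pdb_url(pdb_id):
    """Build the download URL for a PDB entry (assumed RCSB URL pattern)."""
    return f"https://files.rcsb.org/download/{pdb_id.upper()}.pdb"


def download_pdb(pdb_id, out_path=None):
    """Download the PDB file for pdb_id and return the local file name."""
    # Hypothetical default: save as <id>.pdb in the current directory.
    out_path = out_path or f"{pdb_id.lower()}.pdb"
    urllib.request.urlretrieve(rcsb_pdb_url(pdb_id), out_path)
    return out_path
```

Calling `download_pdb("1mkp")` should leave `1mkp.pdb` in the working directory, which can then be passed to pdb-seq.py as shown above.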

code should be for python 3

    for record in SeqIO.parse(pdb_file, 'pdb-atom'):
                                                    ^
SyntaxError: unexpected EOF while parsing

Please resolve this; it is a problem for me as well.


code should be for python 3

This code works fine with Python 3.6 on my computer. Also, I think you may be under a wrong impression that I should be troubleshooting this even after providing full code for you.
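
For anyone hitting the same SyntaxError: one likely cause (an assumption, since the failing file was not posted) is that the `for` line was copied without the indented lines below it, so Python reaches the end of the input while still expecting a loop body. The sketch below reproduces that situation with `compile()`; the helper `parses` and the variable names are illustrative only.

```python
# A "for" header with no body, as happens when only part of the script is pasted.
incomplete = "for record in records:\n"

# The same header followed by an indented body.
complete = "for record in records:\n    print(record)\n"


def parses(source):
    """Return True if the source compiles, False if Python reports a syntax error."""
    try:
        compile(source, "<pasted>", "exec")
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


print(parses(incomplete))
print(parses(complete))
```

If this is the cause, copying the whole script again, including the two indented `print` lines, should be enough.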


Good day!

Thanks for the script.

It works, but I have a question. How can I get the FASTA sequence for a specific selection of the PDB file? For example, if a chain has 100 residues but I only want the sequence of the first 10 residues, how can I do that?

Thank you so much.

Regards, Brandon U.


You can modify Mensur Dlakic's code as follows. This will get you the first 10 amino acids.

import sys
from Bio import SeqIO

PDBFile = sys.argv[1]
with open(PDBFile, 'r') as pdb_file:
    for record in SeqIO.parse(pdb_file, 'pdb-atom'):
        print('>' + record.id)
        print(record.seq[:10])

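Note the [:10] slice, which makes this possible. You can use any other interval, e.g. [4:24], to get a different section of the sequence. Slices are zero-based and exclude the end index, so [4:24] returns the 5th through 24th residues of the parsed sequence.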
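
Keep in mind that slicing counts positions in the parsed sequence, not the residue numbers written in the PDB file, and the two can differ when numbering does not start at 1. If you want to select by PDB residue number instead, here is a rough standard-library sketch that does not use Biopython. The function `chain_sequences` and its `start`/`end` parameters are hypothetical, it assumes the fixed-column ATOM record layout, reads only the first model, and maps non-standard residues to X.

```python
# Three-letter to one-letter codes for the 20 standard amino acids.
THREE_TO_ONE = {
    "ALA": "A", "ARG": "R", "ASN": "N", "ASP": "D", "CYS": "C",
    "GLN": "Q", "GLU": "E", "GLY": "G", "HIS": "H", "ILE": "I",
    "LEU": "L", "LYS": "K", "MET": "M", "PHE": "F", "PRO": "P",
    "SER": "S", "THR": "T", "TRP": "W", "TYR": "Y", "VAL": "V",
}


def chain_sequences(pdb_lines, start=None, end=None):
    """Return {chain_id: sequence} built from ATOM records.

    start and end are PDB residue numbers (inclusive); None means no limit.
    """
    seqs = {}
    seen = set()
    for line in pdb_lines:
        if line.startswith("ENDMDL"):
            break  # only read the first model
        if not line.startswith("ATOM"):
            continue
        chain = line[21]
        resnum = int(line[22:26])
        key = (chain, resnum, line[26:27])  # insertion code keeps 10 and 10A apart
        if key in seen:
            continue  # count each residue once, not once per atom
        seen.add(key)
        if start is not None and resnum < start:
            continue
        if end is not None and resnum > end:
            continue
        seqs[chain] = seqs.get(chain, "") + THREE_TO_ONE.get(line[17:20], "X")
    return seqs
```

With a file on disk this could be called as `chain_sequences(open("1mkp.pdb"), start=210, end=219)`, where the residue numbers are placeholders you would replace with the range you need.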
