Question: How To Make Multiple Fasta-->Pdb Conversions For Short Peptides (4 Or 5 Aminoacid Long)
7.2 years ago by
Onat0
Germany
Onat0 wrote:

Hi, I would like to perform a virtual screening of tetramers against a protein of interest. For this purpose, I randomly generate thousands of different tetramer peptide sequences in FASTA format, and I need to convert them into PDB format. With Swiss PDB Viewer, it is only possible to perform the FASTA-->PDB conversion one by one as the program does not allow to upload more than one sequence at once. I am looking for a script or Unix commands to perform multiple FASTA-->PDB conversions, but I have not found a solution yet. Is it possible to convert FASTA files into PDB format from a script, or can I execute the "load sequence from aminoacids" and then "save current layer" functions by using, for instance, the shell.exec () function? I appreciate your help. Thank you. Best regards

fasta pdb conversion short • 8.9k views
ADD COMMENTlink modified 3.3 years ago • written 7.2 years ago by Onat0

I know this conversation happened a long time ago, but this is the first time I have ever used Swiss PDB Viewer, and I would actually like to know how to take a FASTA sequence and convert it to a PDB file. I realize you guys are on a totally different level, lol, based on the fact that you made this comment: "With Swiss PDB Viewer, it is only possible to perform the FASTA-->PDB conversion one by one as the program does not allow to upload more than one sequence at once." That is exactly what I want to do. I just want to convert ONE FASTA sequence to a PDB file so I can run a simulation with it. I have no idea how, though. Any help would be appreciated.

ADD REPLYlink written 4.9 years ago by cpapamit0

Hi cpapamit, open a new question please, easier to answer and also for others to find it later. Also, use the search function in the forum to get older questions with information related to your problem. Try looking for 'protein structure prediction'.

ADD REPLYlink written 4.9 years ago by João Rodrigues2.5k
7.2 years ago by
João Rodrigues2.5k
Stanford University, U
João Rodrigues2.5k wrote:

You can use Pymol with the build_seq.py script.

EDIT: Actually, you can easily build a pipeline using Biopython to parse the FASTA files (here), then pass this to pymol and create the 3D structure in whatever secondary structure you want (coil for you I guess) using the build_seq.py script. This should be a 20 line script or so, we use it in the lab to generate peptide structures from sequence.

ADD COMMENTlink modified 7.2 years ago • written 7.2 years ago by João Rodrigues2.5k

Thanks a lot. But I am wondering what the output is. I need PDB files to do the virtual screening.

ADD REPLYlink written 7.2 years ago by Onat0

PDB files of course. Pymol will build you the PDB file.

ADD REPLYlink written 7.2 years ago by João Rodrigues2.5k

Also, I could not load the FASTA files into PyMol. How can I do this? Thank you.

ADD REPLYlink written 7.2 years ago by Onat0

By using the script I can generate secondary structures by typing "build_seq SLGQ, ss=helix", for example, but I could not load FASTA files into Pymol. I need to do this for thousands of different tetramers.

ADD REPLYlink written 7.2 years ago by Onat0

Use biopython to parse the fasta files, pass the seq info to pymol and then output the pdb file. It's not a one program solution, but it's very good and very simple.
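For anyone who wants to see what the parsing half of this pipeline produces, here is a minimal sketch written with the standard library only, so it runs without Biopython installed. The function name read_fasta and the example records are made up for illustration; Biopython's SeqIO.parse gives you the same kind of id/sequence information through record.id and record.seq.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (record_id, sequence) pairs from a FASTA-formatted text stream."""
    record_id = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one
            if record_id is not None:
                yield record_id, "".join(chunks)
            header = line[1:].split()
            # Keep only the first word of the header as the id
            record_id = header[0] if header else ""
            chunks = []
        elif record_id is not None:
            # Sequence lines may be wrapped, so collect and join them
            chunks.append(line.upper())
    if record_id is not None:
        yield record_id, "".join(chunks)


# Hypothetical two-record FASTA input, held in memory for the example
example = io.StringIO(">pep1 random tetramer\nSLGQ\n>pep2\nAC\nDE\n")
records = list(read_fasta(example))
print(records)
```

Each (id, sequence) pair is what you would hand to build_seq and then use to name the output PDB file.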

ADD REPLYlink written 7.2 years ago by João Rodrigues2.5k

Hello again, I have managed to parse the sequences line by line, and now I am wondering how to do the "save as PDB" step. Manually it is just "save .pdb, first_residue", but I could not do it inside my script. How can I access the objects created by the build_seq command so that I can save them as PDB files?

ADD REPLYlink written 7.2 years ago by Onat0

I would refer to the PymolWiki for these kinds of questions. For a simple introduction to the Pymol API, check this post I made a while ago. It should get you on the right path.

http://www.pymolwiki.org/index.php/Save#PYMOL_API
http://doeidoei.wordpress.com/2009/02/11/pymol-api-simple-example/

ADD REPLYlink written 7.2 years ago by João Rodrigues2.5k

The following script works for me to convert a multi-sequence FASTA file into separate PDB files; I hope it works for someone else too. The file.fasta contains all the FASTA sequences. The script is saved with a .pym extension. The build_seq.py and seq_convert.py files can be saved in the PyMol directory.

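# Run inside PyMOL; build_seq.py must be importable (e.g. in the PyMol directory)
from pymol import cmd
from Bio import SeqIO
import build_seq

document = "_document"
for seq_record in SeqIO.parse("file.fasta", "fasta"):
    # Build the peptide from the plain one-letter sequence string
    build_seq.build_seq(str(seq_record.seq))
    cmd.select(document, "all")
    # Write the structure to <record id>.pdb
    cmd.save(seq_record.id + ".pdb", document, -1, "pdb")
    # Clear everything before building the next peptide
    cmd.delete("all")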
ADD REPLYlink modified 3.1 years ago • written 3.1 years ago by jelt00240
7.2 years ago by
Onat30
Onat30 wrote:

But Pymol creates the structure only after it gets the command "build_seq " ",ss=helix", and I need to do this inside a loop for thousands of sequences. I also need to save them as separate PDB files.

ADD COMMENTlink written 7.2 years ago by Onat30

OK, now I am becoming familiar with Python. I hope I can write the necessary script soon. Thank you.

ADD REPLYlink written 7.2 years ago by Onat30

Pymol allows for python scripting. Therefore, as I said before, it is very easy to feed sequences in a for loop to pymol and have it output the corresponding structures. The first answer I gave you had all the links necessary to write the script you want.

ADD REPLYlink written 7.2 years ago by João Rodrigues2.5k
7.2 years ago by
Woa2.7k
United States
Woa2.7k wrote:

I'm just curious about one thing: do you need to generate all 20^4 (160,000) possible tetrapeptides and get their structures from Seq->PDB (possibly with an energy minimization), or do you wish to get all the tetrapeptide structures that can be found in the protein structures available in the (non-redundant, high-resolution) PDB?
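To make the size of the first option concrete, here is a small sketch that enumerates every tetrapeptide over the 20 standard amino acids and, alternatively, draws a random sample. The names all_peptides and random_peptides are hypothetical, and the sketch assumes only the 20 standard one-letter codes are wanted.

```python
import itertools
import random

# One-letter codes of the 20 standard amino acids (assumption: no non-standard residues)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def all_peptides(length=4):
    """Yield every possible peptide of the given length, in alphabetical order."""
    for combo in itertools.product(AMINO_ACIDS, repeat=length):
        yield "".join(combo)


def random_peptides(count, length=4, seed=None):
    """Return `count` random peptides; duplicates are possible."""
    rng = random.Random(seed)
    return [
        "".join(rng.choice(AMINO_ACIDS) for _ in range(length))
        for _ in range(count)
    ]


total = sum(1 for _ in all_peptides(4))
print(total)  # 20**4 = 160000

sample = random_peptides(5, seed=0)
print(sample)
```

Since the random version can repeat a sequence, it is worth de-duplicating (for example with set) before building structures if each file is named after its sequence.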

ADD COMMENTlink written 7.2 years ago by Woa2.7k

On topic, Molsoft ICM can do that along with energy minimization with its scripting engine, but that's commercial, and I think you've already figured it out with Pymol

ADD REPLYlink written 7.2 years ago by Woa2.7k

I want to have the structures, and thus PDB files, for each of the tetramers. Pymol is actually capable of doing this job. Thanks

ADD REPLYlink written 7.2 years ago by Onat30

Hi Onat, I'm facing the same situation you were in when you asked here about building 3D peptide structures from short peptide sequences. I'm a beginner in programming (Python) and bioinformatics. I have some questions about the details of the operations described here:

What type of output should the parsing produce? I think I must separate the sequences from the headers and save the sequences in a new file. Is this right?

After parsing, do I have to write the command before each sequence, as in "build_seq SLGQ,ss=helix"? Is this right?

Then must I copy all the text in the file and paste it into the Pymol command line? Is that it?

Thanks in advance

ADD REPLYlink written 3.4 years ago by jgribeiro30
3.3 years ago by
Onat0
Germany
Onat0 wrote:

Hi Jgribeiro3,

Sorry for the late reply, I have only just seen your question. First, I generated random short peptide sequences in R and wrote them to a .txt file. Then, using the "build_seq" script, I built the peptides for the sequences in the .txt file and saved a PDB structure for each one. So you can write a short Python script that reads the sequences from the .txt file and runs the "build_seq" script automatically for each peptide sequence.
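As a rough sketch of that idea (not my original script), the following reads one sequence per line from a text file and writes the matching PyMOL commands to a command script. It assumes that build_seq.py has already been run in PyMOL so that "build_seq SEQ, ss=helix" works as a command, as used earlier in this thread. The function names, the choice to name each file after its sequence, and the file names in the usage note are assumptions for illustration.

```python
from pathlib import Path

# One-letter codes of the 20 standard amino acids
VALID = set("ACDEFGHIKLMNPQRSTVWY")


def pymol_commands(sequences, ss="helix"):
    """Return PyMOL command lines that build and save each valid sequence."""
    lines = []
    for seq in sequences:
        seq = seq.strip().upper()
        if not seq or set(seq) - VALID:
            continue  # skip blank lines and non-standard residues
        lines.append(f"build_seq {seq}, ss={ss}")  # build the peptide
        lines.append(f"save {seq}.pdb, all")       # write it to its own file
        lines.append("delete all")                 # clear before the next one
    return lines


def write_script(txt_path, pml_path, ss="helix"):
    """Read one sequence per line and write a PyMOL command script."""
    sequences = Path(txt_path).read_text().splitlines()
    commands = pymol_commands(sequences, ss)
    Path(pml_path).write_text("\n".join(commands) + "\n")
    return len(commands) // 3  # number of peptides written


commands = pymol_commands(["SLGQ", "", "axyz", "ACDE"])
print("\n".join(commands))
```

Calling write_script("peptides.txt", "build_all.pml") would then give a file of commands that can be loaded into PyMOL as a command script instead of being pasted in by hand. Because each file is named after its sequence, a repeated sequence simply overwrites its earlier file.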

I hope this answer is helpful.

ADD COMMENTlink written 3.3 years ago by Onat0