Question: Protein Secondary Structure
yonatanam wrote (4 months ago):

Hi all, as part of my homework I need to implement software for predicting protein secondary structure. I am looking for a database of protein sequences and their secondary structures. I tried looking in PDB but could not find this there. Can anyone help, please? Thanks.

genomax (United States) wrote (4 months ago):

Every PDB entry includes information about the secondary structure (e.g. 2ZOQ; click on the sequence tab). If you download the associated PDB-format file, it will contain the secondary structure as plain-text records. A partial example is below.

HELIX   32  32 PRO B  356  THR B  368  1                                  13    
HELIX   33  33 ALA B  369  GLN B  372  5                                   4    
SHEET    1   A 5 TYR A  42  GLY A  49  0                                        
SHEET    2   A 5 MET A  55  ASP A  61 -1  O  VAL A  56   N  ILE A  48           
SHEET    3   A 5 THR A  66  ILE A  73 -1  O  LYS A  72   N  MET A  55           
SHEET    4   A 5 VAL A 118  ASP A 123 -1  O  GLN A 122   N  ALA A  69           
SHEET    5   A 5 ASP A 105  LEU A 107 -1  N  ASP A 105   O  VAL A 121           
SHEET    1   B 3 THR A 127  ASP A 128  0                                        
SHEET    2   B 3 LEU A 172  ILE A 174 -1  O  ILE A 174   N  THR A 127           
SHEET    3   B 3 LEU A 180  ILE A 182 -1  O  LYS A 181   N  LEU A 173
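
The HELIX and SHEET records above can be read programmatically. The following is a minimal sketch, assuming the file follows the fixed-column layout of the wwPDB file format (chain ID and residue numbers at fixed character positions); the function name `parse_secondary_structure` and the one-letter labels `H` (helix) and `E` (strand) are illustrative choices, not something from this thread.

```python
def parse_secondary_structure(pdb_text):
    """Return (kind, chain, start, end) tuples from HELIX/SHEET records.

    Assumes the fixed-column PDB layout; kind is "H" for HELIX and
    "E" for SHEET strands (labels chosen for illustration).
    """
    segments = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "HELIX":
            # Columns 20, 22-25 and 34-37 (1-based) hold chain, start, end.
            chain = line[19]
            start, end = int(line[21:25]), int(line[33:37])
            segments.append(("H", chain, start, end))
        elif record == "SHEET":
            # Columns 22, 23-26 and 34-37 (1-based) hold chain, start, end.
            chain = line[21]
            start, end = int(line[22:26]), int(line[33:37])
            segments.append(("E", chain, start, end))
    return segments


# Two records copied from the example above.
example = "\n".join([
    "HELIX   32  32 PRO B  356  THR B  368  1",
    "SHEET    1   A 5 TYR A  42  GLY A  49  0",
])
segments = parse_secondary_structure(example)
```

Residues that fall in neither a HELIX nor a SHEET range would then typically be treated as coil when building per-residue labels for training.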

Is it possible to download multiple sequences in one shot? I am not familiar with the proteins' names. Thanks for your answer.

Reply written 4 months ago by yonatanam

Yes, it is possible to download multiple PDB IDs. Use the download tool here or their FTP site.
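
For a scripted alternative to the download tool, a rough sketch is shown here. It assumes the RCSB per-entry download URL pattern `https://files.rcsb.org/download/<ID>.pdb`, which is not given in this thread; the helper names `pdb_url` and `download_entries` and the `pdb_files` directory are hypothetical.

```python
import urllib.request
from pathlib import Path

# Assumed URL pattern for per-entry PDB-format downloads (not from the thread).
RCSB_URL = "https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_url(pdb_id):
    """Build the download URL for one PDB ID such as '2ZOQ'."""
    return RCSB_URL.format(pdb_id=pdb_id.strip().upper())


def download_entries(pdb_ids, dest_dir="pdb_files"):
    """Fetch each entry into dest_dir and return the local paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    paths = []
    for pdb_id in pdb_ids:
        target = dest / f"{pdb_id.strip().upper()}.pdb"
        if not target.exists():  # skip files fetched on an earlier run
            urllib.request.urlretrieve(pdb_url(pdb_id), target)
        paths.append(target)
    return paths
```

Calling `download_entries(["2ZOQ"])` would fetch the entry mentioned above; for thousands of entries, the bulk download tool or FTP site is likely the better choice.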

Reply written 4 months ago by genomax

Should I use "Download: Sequences"? Also, I am not familiar with protein names; I meant to ask whether there are groups of proteins that I can download. Thanks again for the quick response.

Reply written 4 months ago by yonatanam

What exactly does your assignment ask you to do? Do you have a method you are going to use for the secondary structure predictions, or are you just going to parse the information already there for the 140K+ proteins in PDB? Or are you going to learn (ML?) from known PDB structures and then predict for any other sequence?

Reply written 4 months ago by genomax

I need to build a protein secondary structure predictor using an SVM. Basically, I have a paper (http://airccse.org/journal/ijsc/papers/2112ijsc06.pdf) that describes how they did it, and I need to implement their solution to practice the SVM algorithm.
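
The paper's exact feature encoding is not quoted in this thread, so the following is only a sketch of one common setup for SVM-based secondary structure prediction: each residue is represented by a one-hot encoding of a sliding window of neighbouring residues. The window size of 13, the 20-letter alphabet, and the function name `encode_windows` are assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)


def encode_windows(sequence, window=13):
    """One-hot encode every residue together with its neighbours.

    Returns one feature vector of length window * 20 per residue.
    Window slots that fall outside the sequence, or that hold an
    unknown letter, are left as all zeros.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so it has a central residue")
    half = window // 2
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    features = []
    for center in range(len(sequence)):
        vector = [0] * (window * len(AMINO_ACIDS))
        for offset in range(-half, half + 1):
            pos = center + offset
            if 0 <= pos < len(sequence) and sequence[pos] in index:
                slot = (offset + half) * len(AMINO_ACIDS)
                vector[slot + index[sequence[pos]]] = 1
        features.append(vector)
    return features


features = encode_windows("ACD", window=3)
```

Each vector, paired with the residue's known label (for example helix, strand, or coil), could then be passed to an SVM implementation such as scikit-learn's `SVC`.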

Reply written 4 months ago by yonatanam

This is a pretty poor description of how the sequences were selected:

3.1.1. Content of database. More than 300 files were produced, one for each PDB protein from the fall 1989 release of PDB with release 12 of EMBL/Swissprot (12305 sequences). This corresponds to derived structures for 3512 proteins or protein fragments; 1854 of these are homologous over a length of at least 80 residues. Some of these proteins are very similar to their PDB cousin, differing by as little as one residue out of several hundred.

You could go to the FTP site and download randomly from folders there and stop after you have enough candidates?

Reply written 4 months ago by genomax

They used the RS126 data set; I didn't see this line before. I will search for this specific data set and try to figure out how the prediction works. Thank you very much for your time!

Reply written 4 months ago by yonatanam