Question: Protein Secondary Structure
yonatanam0 wrote:

Hi all, as part of my homework I need to implement software for predicting protein secondary structure. I am looking for a database of protein sequences and their secondary structures. I looked in PDB but didn't find it there. Can anyone help, please? Thanks.

modified 16 months ago by genomax83k • written 16 months ago by yonatanam0
genomax83k wrote:

Every PDB entry includes information about the secondary structure (e.g. 2ZOQ; click on the sequence tab). If you download the associated PDB format file, it will contain the secondary structure information in text format. A partial example is below.

HELIX   32  32 PRO B  356  THR B  368  1                                  13    
HELIX   33  33 ALA B  369  GLN B  372  5                                   4    
SHEET    1   A 5 TYR A  42  GLY A  49  0                                        
SHEET    2   A 5 MET A  55  ASP A  61 -1  O  VAL A  56   N  ILE A  48           
SHEET    3   A 5 THR A  66  ILE A  73 -1  O  LYS A  72   N  MET A  55           
SHEET    4   A 5 VAL A 118  ASP A 123 -1  O  GLN A 122   N  ALA A  69           
SHEET    5   A 5 ASP A 105  LEU A 107 -1  N  ASP A 105   O  VAL A 121           
SHEET    1   B 3 THR A 127  ASP A 128  0                                        
SHEET    2   B 3 LEU A 172  ILE A 174 -1  O  ILE A 174   N  THR A 127           
SHEET    3   B 3 LEU A 180  ILE A 182 -1  O  LYS A 181   N  LEU A 173
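
For anyone who wants to read these records programmatically, here is a minimal sketch. It assumes the fixed-column layout of the PDB format (chain ID and residue numbers at fixed positions in HELIX and SHEET lines). The function names and the three-state mapping (helix = H, strand = E, everything else = C) are illustrative choices, not something from the PDB itself, and the labelling ignores insertion codes and gaps in residue numbering.

```python
def parse_secondary_structure(pdb_lines):
    """Return (kind, chain, start, end) tuples from HELIX and SHEET records."""
    segments = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "HELIX":
            # chain ID in column 20, start/end residue numbers in columns 22-25 and 34-37
            segments.append(("H", line[19], int(line[21:25]), int(line[33:37])))
        elif record == "SHEET":
            # chain ID in column 22, start/end residue numbers in columns 23-26 and 34-37
            segments.append(("E", line[21], int(line[22:26]), int(line[33:37])))
    return segments


def three_state_labels(segments, chain, first, last):
    """Label residues first..last of one chain as H, E, or C (assumed mapping)."""
    labels = []
    for resnum in range(first, last + 1):
        state = "C"  # anything not covered by a HELIX or SHEET record
        for kind, seg_chain, start, end in segments:
            if seg_chain == chain and start <= resnum <= end:
                state = kind
                break
        labels.append(state)
    return "".join(labels)


example = [
    "HELIX   32  32 PRO B  356  THR B  368  1                                  13",
    "SHEET    1   A 5 TYR A  42  GLY A  49  0",
]
segments = parse_secondary_structure(example)
labels = three_state_labels(segments, "A", 40, 51)
```

With a real file you would pass the open file object instead of the `example` list.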
modified 16 months ago • written 16 months ago by genomax83k

Is it possible to download multiple sequences in one shot? I am not familiar with the proteins' names. Thanks for your answer.

written 16 months ago by yonatanam0

Yes, it is possible to download multiple PDB IDs. Use the download tool here or their FTP site.
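
If you would rather script it, a rough sketch follows. It assumes the RCSB file server serves PDB-format files at `https://files.rcsb.org/download/<ID>.pdb`; that URL pattern, the function names, and the output folder are my assumptions and not part of the download tool mentioned above, so check them against the current RCSB documentation before relying on them.

```python
import os
import urllib.request


def pdb_download_url(pdb_id):
    """Build the assumed RCSB download URL for one PDB ID."""
    return "https://files.rcsb.org/download/{}.pdb".format(pdb_id.strip().upper())


def download_pdb_files(pdb_ids, out_dir="pdb_files"):
    """Fetch each ID into out_dir and return the local file paths."""
    os.makedirs(out_dir, exist_ok=True)
    paths = []
    for pdb_id in pdb_ids:
        path = os.path.join(out_dir, pdb_id.strip().upper() + ".pdb")
        urllib.request.urlretrieve(pdb_download_url(pdb_id), path)
        paths.append(path)
    return paths


# Usage (needs network access), e.g.:
#   download_pdb_files(["2ZOQ"])
```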

modified 16 months ago • written 16 months ago by genomax83k

Should I use "Download: Sequences"? Also, I am not familiar with protein names. What I meant is: are there any groups of proteins that I can download? Thanks again for the quick response.

written 16 months ago by yonatanam0

What exactly does your assignment ask you to do? Do you have a method you are going to use for the secondary structure predictions, or are you just going to parse the information already there for the 140K+ proteins in PDB? Or are you going to learn (ML?) from known PDB structures and then predict for any other sequence?

modified 16 months ago • written 16 months ago by genomax83k

I need to build a protein secondary structure predictor using an SVM. Basically, I have a paper (http://airccse.org/journal/ijsc/papers/2112ijsc06.pdf) that describes how they did it, and I need to implement their solution to practice the SVM algorithm.
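
For context, a common way to feed a sequence to an SVM is a sliding window with one-hot (orthogonal) encoding: each residue becomes one training example built from itself and its neighbours. The sketch below shows only that encoding step. It is not taken from the linked paper; the window size of 13, the extra padding slot, and the function name are assumptions for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def encode_windows(sequence, window=13):
    """One-hot encode a sliding window centred on each residue.

    Returns one feature vector per residue, each of length
    window * 21 (20 amino acids plus one padding/unknown slot).
    """
    half = window // 2
    width = len(AMINO_ACIDS) + 1  # last slot marks padding or unknown residues
    features = []
    for center in range(len(sequence)):
        vector = []
        for pos in range(center - half, center + half + 1):
            one_hot = [0] * width
            if 0 <= pos < len(sequence) and sequence[pos] in AMINO_ACIDS:
                one_hot[AMINO_ACIDS.index(sequence[pos])] = 1
            else:
                one_hot[-1] = 1  # window runs off the end, or residue not recognised
            vector.extend(one_hot)
        features.append(vector)
    return features


X = encode_windows("ACDE", window=3)
```

Each row of `X` would be paired with the H/E/C label of its central residue and passed to whatever SVM implementation you use.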

written 16 months ago by yonatanam0

This is a pretty poor description of how the sequences were selected:

3.1.1. Content of database More than 300 files were produced, one for each PDB protein from the fall 1989 release of PDB with release 12 of EMBL/Swissprot (12305 sequences). This corresponds to derived structures for 3512 proteins or protein fragments; 1854 of these are homologous over a length of at least 80 residues. Some of these proteins are very similar to their PDB cousin, differing by as little as one residue out of several hundred.

You could go to the FTP site, download randomly from the folders there, and stop once you have enough candidates?

modified 16 months ago • written 16 months ago by genomax83k

They used the RS126 data set; I hadn't seen that line before. I will search for this specific data set and try to figure out how the prediction works. Thank you very much for your time!

written 16 months ago by yonatanam0