Protein Secondary Structure
5.3 years ago
yonatanam • 0

Hi all, as part of my homework I need to implement software that predicts protein secondary structure. I am looking for a database of protein sequences together with their secondary structures. I looked in PDB but did not find this there. Can anyone help, please? Thanks.

protein • 1.7k views
5.3 years ago
GenoMax 142k

Every PDB entry includes information about the secondary structure (e.g. 2ZOQ, click on the sequence tab). If you download the associated PDB format file, it lists the secondary structure in text format as HELIX and SHEET records. A partial example is below.

HELIX   32  32 PRO B  356  THR B  368  1                                  13    
HELIX   33  33 ALA B  369  GLN B  372  5                                   4    
SHEET    1   A 5 TYR A  42  GLY A  49  0                                        
SHEET    2   A 5 MET A  55  ASP A  61 -1  O  VAL A  56   N  ILE A  48           
SHEET    3   A 5 THR A  66  ILE A  73 -1  O  LYS A  72   N  MET A  55           
SHEET    4   A 5 VAL A 118  ASP A 123 -1  O  GLN A 122   N  ALA A  69           
SHEET    5   A 5 ASP A 105  LEU A 107 -1  N  ASP A 105   O  VAL A 121           
SHEET    1   B 3 THR A 127  ASP A 128  0                                        
SHEET    2   B 3 LEU A 172  ILE A 174 -1  O  ILE A 174   N  THR A 127           
SHEET    3   B 3 LEU A 180  ILE A 182 -1  O  LYS A 181   N  LEU A 173
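
If it helps, here is a minimal sketch of how records like the ones above could be turned into per-residue labels for training. It assumes the fixed-column layout of the PDB format (chain and residue numbers sliced by column position), lumps all helix classes into a single "H", and ignores insertion codes and gaps in residue numbering. The names `Segment`, `parse_segments` and `label_residues` are made up for this example.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    kind: str   # "H" for a HELIX record, "E" for a SHEET strand
    chain: str  # chain identifier, e.g. "A"
    start: int  # first residue number of the segment
    end: int    # last residue number of the segment


def parse_segments(lines: List[str]) -> List[Segment]:
    """Collect helix and strand ranges from HELIX/SHEET records."""
    segments = []
    for line in lines:
        if line.startswith("HELIX "):
            # HELIX: chain in column 20, start in columns 22-25, end in 34-37
            segments.append(
                Segment("H", line[19], int(line[21:25]), int(line[33:37]))
            )
        elif line.startswith("SHEET "):
            # SHEET: chain in column 22, start in columns 23-26, end in 34-37
            segments.append(
                Segment("E", line[21], int(line[22:26]), int(line[33:37]))
            )
    return segments


def label_residues(segments: List[Segment], chain: str, first: int, last: int) -> str:
    """Return one letter per residue number in [first, last]: H, E or C (other)."""
    labels = ["C"] * (last - first + 1)
    for seg in segments:
        if seg.chain != chain:
            continue
        for pos in range(max(seg.start, first), min(seg.end, last) + 1):
            labels[pos - first] = seg.kind
    return "".join(labels)


# Three of the records from the example above.
records = [
    "HELIX   33  33 ALA B  369  GLN B  372  5                                   4",
    "SHEET    1   A 5 TYR A  42  GLY A  49  0",
    "SHEET    2   A 5 MET A  55  ASP A  61 -1  O  VAL A  56   N  ILE A  48",
]
segments = parse_segments(records)
print(label_residues(segments, "A", 40, 62))
```

For a real entry you would read the lines from the downloaded file and pick `first` and `last` from the residue numbers actually present in the chain.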

Is it possible to download multiple sequences in one shot? I am not familiar with the proteins' names. Thanks for your answer.


Yes, it is possible to download the files for multiple PDB IDs at once. Use the download tool here or their FTP site.
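
If you would rather script it, a rough sketch could look like the following. The URL pattern `https://files.rcsb.org/download/<ID>.pdb` is an assumption on my part and is not one of the links above, and `pdb_url` and `download_entries` are hypothetical helper names.

```python
import urllib.request
from pathlib import Path

# Assumed URL pattern for RCSB's file download service; not taken from this thread.
BASE_URL = "https://files.rcsb.org/download"


def pdb_url(pdb_id: str) -> str:
    """Build the download URL for one four-character PDB ID."""
    return f"{BASE_URL}/{pdb_id.strip().upper()}.pdb"


def download_entries(pdb_ids, out_dir="pdb_files"):
    """Fetch each entry and save it as <ID>.pdb; returns the written paths."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    paths = []
    for pdb_id in pdb_ids:
        path = target / f"{pdb_id.strip().upper()}.pdb"
        # One HTTP request per entry; needs network access.
        urllib.request.urlretrieve(pdb_url(pdb_id), str(path))
        paths.append(path)
    return paths
```

Calling `download_entries(["2ZOQ"])` would then save that entry under `pdb_files/`. For thousands of entries the FTP site is probably the kinder option.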


Should I use "Download: Sequences"? Also, I am not familiar with protein names. I meant to ask whether there are any groups of proteins that I can download. Thanks again for the quick response.


What exactly does your assignment ask you to do? Do you have a method you are going to use for the secondary structure predictions, or are you just going to parse the information already there for the 140K+ proteins in PDB? Or are you going to learn (ML?) from known PDB structures and then predict for any other sequence?


I need to implement protein secondary structure prediction with an SVM. Basically, I have a paper (http://airccse.org/journal/ijsc/papers/2112ijsc06.pdf) that describes how they did it, and I need to implement their solution to practice the SVM algorithm.
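
For context, one common way to feed a sequence to an SVM is a sliding window of one-hot encoded residues, one feature vector per position. The sketch below shows that general idea only; whether the paper uses exactly this encoding is not stated in this thread, the default window size of 13 is an arbitrary choice, and the sequence is a made-up example.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue: str):
    """20-element vector; all zeros for padding or unknown residues."""
    return [1 if residue == aa else 0 for aa in AMINO_ACIDS]


def window_features(sequence: str, window: int = 13):
    """One feature vector per residue, built from a centered window."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it has a central residue")
    half = window // 2
    # Pad both ends so the first and last residues also get a full window.
    padded = "-" * half + sequence + "-" * half
    features = []
    for i in range(len(sequence)):
        vec = []
        for residue in padded[i:i + window]:
            vec.extend(one_hot(residue))
        features.append(vec)
    return features


# Made-up 10-residue sequence, window of 5 -> 10 vectors of 5 * 20 values.
X = window_features("MKTAYIAKQR", window=5)
print(len(X), len(X[0]))
```

Each vector, paired with the H/E/C label of its central residue, could then be passed to an SVM implementation such as scikit-learn's `sklearn.svm.SVC`.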


This is a pretty poor description of how the sequences were selected:

3.1.1. Content of database: More than 300 files were produced, one for each PDB protein from the fall 1989 release of PDB with release 12 of EMBL/Swissprot (12305 sequences). This corresponds to derived structures for 3512 proteins or protein fragments; 1854 of these are homologous over a length of at least 80 residues. Some of these proteins are very similar to their PDB cousin, differing by as little as one residue out of several hundred.

You could go to the FTP site and download randomly from folders there and stop after you have enough candidates?


They used the RS126 data set; I didn't see this line before. I will search for this specific data set and try to figure out how the prediction works. Thank you very much for your time!
