Hello, how can I get features such as strand, domain, helix, turn, chain, mass, glycosylation, active site, binding site, etc. from a protein sequence? I want to do a classification based on these features, but I have no idea what they mean and I could not find a dataset. All I have is a protein sequence file. Is there a Python library, or are there any articles, that could help me? Cordially
If you only have protein sequences, you'll first need to run a BLAST search against the Swiss-Prot database. Then take the best hit and fetch its UniProt ID (see the ID mapping tool from UniProt). Features such as strand, helix, domain, or active site are annotated on that UniProt entry, with positions given in the entry's own sequence coordinates. To match the coordinates of your protein with those of the UniProt entry, the best approach is a global pairwise alignment (see needle).
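As a rough illustration of the coordinate-matching step, here is a minimal Python sketch. It assumes you already have the two gapped sequences from a global alignment (for example copied out of needle's output) and that gaps are written as "-". The function names map_positions and transfer_feature and the toy sequences are made up for this example; they do not come from any library.

```python
def map_positions(aligned_query, aligned_subject, gap="-"):
    """Return {subject_pos: query_pos} (1-based) for columns where both are residues."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    mapping = {}
    q_pos = s_pos = 0
    for q_char, s_char in zip(aligned_query, aligned_subject):
        # Advance each counter only when that sequence has a residue in this column.
        if q_char != gap:
            q_pos += 1
        if s_char != gap:
            s_pos += 1
        if q_char != gap and s_char != gap:
            mapping[s_pos] = q_pos
    return mapping


def transfer_feature(start, end, mapping):
    """Map a 1-based inclusive subject range onto the query.

    Returns (start, end) in query coordinates, or None if no residue
    of the feature is aligned to the query.
    """
    hits = [mapping[p] for p in range(start, end + 1) if p in mapping]
    if not hits:
        return None
    return min(hits), max(hits)


# Toy alignment (invented): query is your protein, subject is the UniProt hit.
aligned_query = "MK-TAYIAKQR"
aligned_subject = "MKLTAY--KQR"
mapping = map_positions(aligned_query, aligned_subject)

# A feature annotated at positions 4-7 of the UniProt entry:
print(transfer_feature(4, 7, mapping))  # (3, 8)
```

Note that a transferred range can span an insertion in your protein (as in the toy example), so it is worth checking how well the region is aligned before trusting the feature.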
It would be smart to restrict your BLAST database to a single species, ideally the one with the best feature annotation.
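If rebuilding the database is inconvenient, a possible alternative is to filter the hits afterwards. The sketch below assumes BLAST was run with tabular output (-outfmt 6, default 12 columns) and that subject IDs look like sp|ACCESSION|NAME_SPECIES, as in the UniProt FASTA headers. Whether your IDs look like that depends on how the database was built, so treat it as an assumption. The function name best_hit_accession and the example rows (including their scores) are invented for illustration.

```python
def best_hit_accession(blast_tab_text, species=None):
    """Return the accession of the highest-bitscore hit, optionally for one species.

    species is the mnemonic after the underscore in the Swiss-Prot entry
    name (e.g. "HUMAN"). Returns None if no hit passes the filter.
    """
    best_id, best_score = None, None
    for line in blast_tab_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        sseqid = fields[1]            # column 2: subject sequence ID
        bitscore = float(fields[11])  # column 12: bit score
        if species is not None and not sseqid.endswith("_" + species):
            continue
        if best_score is None or bitscore > best_score:
            best_id, best_score = sseqid, bitscore
    if best_id is None:
        return None
    parts = best_id.split("|")
    # Assumed format "sp|ACCESSION|ENTRY_NAME"; fall back to the raw ID otherwise.
    return parts[1] if len(parts) >= 3 else best_id


# Invented example rows in the 12-column tabular layout.
example = "\n".join([
    "q1\tsp|P02340|P53_MOUSE\t77.0\t390\t80\t5\t1\t393\t1\t387\t0.0\t600",
    "q1\tsp|P04637|P53_HUMAN\t100.0\t393\t0\t0\t1\t393\t1\t393\t0.0\t800",
])

print(best_hit_accession(example))           # P04637
print(best_hit_accession(example, "MOUSE"))  # P02340
```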