Categorising proteins into families when you have amino acid sequences
8.8 years ago

How can I categorise my proteins into families when I only have amino acid sequences? I have looked all over the internet for tools or anything else I could use. Most of them ask for IDs, but these are newly sequenced genes.

Any ideas?

Well, I have just sequenced and assembled this bacterial strain, and I used the BASys annotation pipeline to annotate the scaffold. I now have a set of 6000 proteins, but I need to know how many of them are involved in, for instance, DNA metabolism, carbohydrate metabolism, etc. I have used blastp to identify some interesting genes. I came across another pipeline, MG-RAST, but I have already used some of the genes annotated by the previous pipeline in a publication, so it is now very hard to change to a different pipeline. Otherwise, please advise.

protein • 1.9k views
8.8 years ago
h.mon 35k

If you have only the amino acid sequences of the proteins, there is not much you can do. You may try to categorize your proteins with Pfam, PROSITE, InterPro, SUPERFAMILY, or CATH. blastp may help as well.

Edit: based on your updated question, may I suggest the RAST server for annotation? It will generate the summaries you want. I have never used BASys, but from its description it also outputs the information you want; you probably just have to parse the annotation. For example, the GO ID for "DNA metabolic process" is GO:0006259, so you just have to find and count how many genes have this tag in the annotation.
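The counting step could be sketched in Python roughly as below. This is only an illustration under assumptions: it supposes the annotation has first been exported to a two-column, tab-separated table (gene ID, then GO IDs separated by `;`), which is a hypothetical layout and not the actual BASys output format. The function name `count_genes_per_go` and the gene IDs in the example are made up.

```python
import csv
import io


def count_genes_per_go(handle):
    """Count distinct genes per GO ID.

    Assumed (hypothetical) input: tab-separated lines of
    gene_id <TAB> GO IDs separated by ';'.
    """
    genes_by_go = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        gene_id, go_field = row[0], row[1]
        for go_id in go_field.split(";"):
            go_id = go_id.strip()
            if go_id:
                # a set avoids counting the same gene twice for one term
                genes_by_go.setdefault(go_id, set()).add(gene_id)
    return {go_id: len(genes) for go_id, genes in genes_by_go.items()}


# Made-up example data standing in for a parsed annotation file.
example = io.StringIO(
    "BASYS00001\tGO:0006259;GO:0005524\n"
    "BASYS00002\tGO:0005975\n"
    "BASYS00003\tGO:0006259\n"
)
counts = count_genes_per_go(example)
print(counts.get("GO:0006259", 0))  # genes tagged "DNA metabolic process"
```

For a real file, `open("annotation.tsv")` (a placeholder name) would replace the `io.StringIO` object. Note that this only counts genes tagged with exactly that GO ID; genes annotated only with a more specific child term would need the GO hierarchy to be taken into account.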

Thanks

Hello, the pipeline I used generates a file with GO terms per gene, as in "https://www.basys.ca/server1/basys/gallery/example_1/html/BASYS00002.html", but it does not tell how many genes fall in a certain group such as DNA metabolism. I consulted the people behind the BASys pipeline and they said I would have to use a script to do that. I am a beginner at scripting; would you have any idea how to go about this approach?

8.8 years ago
Christian ★ 3.0k
Start here: "How To Cluster Sequences Based On Blast Results?", and look into CLANS; that might be what you are looking for. Blast2GO is possibly an alternative.
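To give a rough idea of what clustering on BLAST results can look like, here is a minimal Python sketch of single-linkage clustering over all-vs-all blastp hits in the default tabular format (`-outfmt 6`, where column 1 is the query, column 2 the subject and column 11 the E-value). This is not how CLANS or Blast2GO work; it is a much simpler stand-in technique. The function name, the E-value cutoff of 1e-10 and the example hits are all assumptions chosen for illustration.

```python
def cluster_blast_hits(lines, max_evalue=1e-10):
    """Single-linkage clustering of sequences from BLAST tabular hits.

    Two sequences land in the same cluster if they are connected by a
    chain of hits with E-value <= max_evalue (cutoff is an assumption).
    """
    parent = {}

    def find(x):
        # union-find lookup with path halving
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # not a standard 12-column outfmt 6 line
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        root_q, root_s = find(query), find(subject)  # registers both IDs
        if evalue <= max_evalue and root_q != root_s:
            parent[root_s] = root_q  # merge the two clusters

    clusters = {}
    for seq in list(parent):
        clusters.setdefault(find(seq), set()).add(seq)
    # largest clusters first, then alphabetical
    return sorted((sorted(c) for c in clusters.values()),
                  key=lambda c: (-len(c), c))


# Made-up hits: p1-p2 and p2-p3 are strong, p4-p5 is too weak to link.
hits = [
    "p1\tp2\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "p2\tp3\t70.0\t180\t54\t0\t1\t180\t5\t184\t1e-30\t200",
    "p4\tp5\t40.0\t100\t60\t0\t1\t100\t1\t100\t0.5\t30",
]
families = cluster_blast_hits(hits)
print(families)
```

Single linkage tends to chain unrelated proteins together through shared domains, so a stricter cutoff or a dedicated clustering tool is usually preferable for real data.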