How can I categorise my proteins into families when I only have amino acid sequences? I have looked all over the internet for tools or anything I could use. Mostly they advise using IDs, but these are newly sequenced genes.
Any ideas?
Well, I have just sequenced and assembled this bacterial strain, and I used the BASys annotation pipeline to annotate the scaffold. I now have a set of 6000 proteins, but I need to know how many of these are involved in, for instance, DNA metabolism, carbohydrate metabolism, etc. I have used blastp to identify some interesting genes. I also came across another pipeline, MG-RAST, but I have already used some of the genes annotated by the previous pipeline in a publication, so it is now very hard to change to a different pipeline. Otherwise, please advise.
Thanks
Hello, the pipeline I used generates a file with GO terms per gene, as in "https://www.basys.ca/server1/basys/gallery/example_1/html/BASYS00002.html", but it does not tell how many genes fall into a certain group such as DNA metabolism. I consulted the people behind the BASys pipeline and they said I would have to use a script to do that. I am a beginner at scripting; would you have any idea how to go about this approach?
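To make the question concrete, here is a rough Python sketch of the kind of counting script I have in mind. It is only an illustration under assumptions: it supposes the per-gene GO terms have first been exported to tab-separated lines of the form `gene_id<TAB>term; term; ...` (this is not the actual BASys output format), and the keyword lists in `CATEGORIES` are hypothetical and would need adjusting to the real term names. The function name `count_genes_per_category` is likewise made up.

```python
from collections import defaultdict

# Hypothetical keyword lists per broad category; these are assumptions,
# not an official GO grouping. Adjust them to the terms in your own file.
CATEGORIES = {
    "DNA metabolism": ["dna repair", "dna replication", "dna recombination", "dna metabolic"],
    "Carbohydrate metabolism": ["carbohydrate", "glycolysis", "gluconeogenesis"],
}


def count_genes_per_category(lines, categories=CATEGORIES):
    """Count distinct genes whose GO term names match each category.

    Each line is assumed to look like: gene_id<TAB>term; term; ...
    """
    genes = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or "\t" not in line:
            continue  # skip blank or malformed lines
        gene_id, terms = line.split("\t", 1)
        for term in terms.split(";"):
            term_lc = term.strip().lower()
            for category, keywords in categories.items():
                if any(keyword in term_lc for keyword in keywords):
                    # a set ensures each gene is counted once per category
                    genes[category].add(gene_id)
    return {category: len(genes[category]) for category in categories}


# Made-up example records, for illustration only.
example = [
    "BASYS00001\tDNA replication; ATP binding",
    "BASYS00002\tcarbohydrate metabolic process",
    "BASYS00003\tDNA repair; DNA replication",
    "BASYS00004\ttransport",
]

print(count_genes_per_category(example))
```

With a real file, the same function could be called as `count_genes_per_category(open("go_terms.tsv"))`, where `go_terms.tsv` is a placeholder file name. Keyword matching on term names is crude; mapping the GO IDs onto a GO slim would be a more rigorous way to get the broad categories.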