KAAS KEGG annotation
9.5 years ago
h.botond ▴ 50

Hello everybody!

I annotated my protein set with KAAS (KEGG Automatic Annotation Server). When the job finished I got two result files: an HTML file and a text file.

Can somebody tell me how to extract the annotations from the HTML file? In the end I want to make two flat files: a KEGG Orthology file and a BRITE hierarchy file. Is there an easy way to do this?

Thanks for all your help!

annotation kegg kaas • 4.5k views

I want to annotate my protein set, so I want to download not only the KO numbers but also the annotations. As far as I can see, it is possible to open all the submenus and copy all the annotations into a text file, but then I have to reformat the whole document, which is a little awkward. Is there an easier way to do this, so that I get a table with my genes, their annotations and the K numbers?


You are supposed to comment in this box instead of the answer box. Check my edit to see whether it answers your question.

9.5 years ago
Prakki Rama ★ 2.7k

Click the HTML link and then click 'exec'. This should generate all the pathways that your input proteins hit, together with the corresponding genes in each pathway (collapse all). Which input protein hit which gene can be found in the text file that you have.

EDIT:

$ cat query.ko
Transg4.t1      K04539
Transg4.t2      K04539
Transg5.t1      K04982
Transg5.t2      K04982
Transg6.t1      K09596

$ cat collapsed.txt

Pathway Search Result

Sort by the number of hits
Hide all objects
ko01100 Metabolic pathways (14)

ko:K00129 E1.2.1.5; aldehyde dehydrogenase (NAD(P)+) [EC:1.2.1.5]
ko:K00411 UQCRFS1; ubiquinol-cytochrome c reductase iron-sulfur subunit [EC:1.10.2.2]
ko:K00710 GALNT; polypeptide N-acetylgalactosaminyltransferase [EC:2.4.1.41]
ko:K01106 E3.1.3.56; inositol-1,4,5-trisphosphate 5-phosphatase [EC:3.1.3.56]
ko:K01132 GALNS; N-acetylgalactosamine-6-sulfatase [EC:3.1.6.4]
ko:K01597 MVD; diphosphomevalonate decarboxylase [EC:4.1.1.33]
ko:K01711 gmd; GDPmannose 4,6-dehydratase [EC:4.2.1.47]
ko:K01772 hemH; ferrochelatase [EC:4.99.1.1]
ko:K02263 COX4; cytochrome c oxidase subunit 4
ko:K04710 CERS; ceramide synthetase [EC:2.3.1.24]
ko:K07419 CYP2R1; vitamin D 25-hydroxylase [EC:1.14.13.159]
ko:K07820 B3GALT2; beta-1,3-galactosyltransferase 2 [EC:2.4.1.-]
ko:K08074 ADPGK; ADP-dependent glucokinase [EC:2.7.1.147]
ko:K13499 CHSY; chondroitin sulfate synthase [EC:2.4.1.175 2.4.1.226]

Using Perl:

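use strict;
use warnings;

# Map each KO number to its description, taken from the collapsed pathway listing.
my %KHash;
open(my $collapsed, '<', 'collapsed.txt') or die "Cannot open collapsed.txt: $!";
while (my $line = <$collapsed>)
{
    # e.g. "ko:K00129 E1.2.1.5; aldehyde dehydrogenase (NAD(P)+) [EC:1.2.1.5]"
    # [^;]+ (rather than \w+) so that names containing dots or hyphens also match.
    if ($line =~ /ko:(K\d+)\s+[^;]+;\s*(.*\S)/)
    {
        $KHash{$1} = $2;
    }
}
close($collapsed);

# Print transcript, KO number and description for every KO found above.
open(my $query, '<', 'query.ko') or die "Cannot open query.ko: $!";
while (my $line = <$query>)
{
    if ($line =~ /^(\S+)\s+(\S+)/ && exists $KHash{$2})
    {
        print "$1\t$2\t$KHash{$2}\n";
    }
}
close($query);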

Result

$ perl annotatating_Transcripts_UsingKEGG_KAAS.pl
Transg4.t1         K04539    guanine nucleotide-binding protein subunit beta-5
Transg4.t2         K04539    guanine nucleotide-binding protein subunit beta-5
Transg5.t1         K04982    transient receptor potential cation channel subfamily M member 7 [EC:2.7.11.1]
Transg5.t2         K04982    transient receptor potential cation channel subfamily M member 7 [EC:2.7.11.1]
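
If the pathway membership is wanted in the flat file as well, the same collapsed listing can be parsed with the pathway headers kept. What follows is only a minimal Python sketch, assuming the layout shown in collapsed.txt above (a `koNNNNN name (count)` header followed by its `ko:K...` lines); the function name `parse_collapsed` is made up for illustration.

```python
import re

# Pathway header, e.g. "ko01100 Metabolic pathways (14)"
PATHWAY_RE = re.compile(r"^(ko\d{5})\s+(.+?)\s+\(\d+\)\s*$")
# KO entry, e.g. "ko:K02263 COX4; cytochrome c oxidase subunit 4"
KO_RE = re.compile(r"^ko:(K\d{5})\s+([^;]+);\s*(.*\S)")


def parse_collapsed(lines):
    """Yield (pathway_id, pathway_name, ko, gene_name, description) tuples."""
    pathway_id = pathway_name = None
    for line in lines:
        line = line.strip()
        header = PATHWAY_RE.match(line)
        if header:
            pathway_id, pathway_name = header.groups()
            continue
        entry = KO_RE.match(line)
        # Ignore KO lines that appear before any pathway header.
        if entry and pathway_id is not None:
            ko, gene, description = entry.groups()
            yield pathway_id, pathway_name, ko, gene.strip(), description


# Two entries copied from the listing above, used as inline sample input.
sample = """Pathway Search Result

ko01100 Metabolic pathways (14)

ko:K00129 E1.2.1.5; aldehyde dehydrogenase (NAD(P)+) [EC:1.2.1.5]
ko:K02263 COX4; cytochrome c oxidase subunit 4
"""

for row in parse_collapsed(sample.splitlines()):
    print("\t".join(row))
```

For the real file, pass an open file handle instead of the sample lines, for example `parse_collapsed(open("collapsed.txt"))`. A KO that belongs to several pathways will then appear once per pathway.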

Note: Some transcripts that have a KO number are nevertheless not found in the collapsed file.
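
To see exactly which transcripts are affected, the two files can be compared directly. This is a rough Python sketch, assuming query.ko holds one transcript per line, optionally followed by a KO number, and that collapsed.txt looks like the listing above. The helper names are hypothetical, and the single collapsed line in the demo is a toy input.

```python
import re

KO_IN_LISTING = re.compile(r"^ko:(K\d{5})\b")


def kos_in_collapsed(lines):
    """Collect every KO number that appears as a 'ko:K...' line in the listing."""
    found = set()
    for line in lines:
        match = KO_IN_LISTING.match(line.strip())
        if match:
            found.add(match.group(1))
    return found


def missing_from_collapsed(query_lines, collapsed_lines):
    """Return (transcript, ko) pairs whose KO is absent from the listing."""
    known = kos_in_collapsed(collapsed_lines)
    missing = []
    for line in query_lines:
        fields = line.split()
        # Transcripts without a KO assignment have only one field; skip them.
        if len(fields) >= 2 and fields[1] not in known:
            missing.append((fields[0], fields[1]))
    return missing


# Toy input: two lines in the style of query.ko and one listing line.
query = ["Transg4.t1      K04539", "Transg6.t1      K09596"]
collapsed = ["ko:K04539 GNB5; guanine nucleotide-binding protein subunit beta-5"]
print(missing_from_collapsed(query, collapsed))
```

With the real files, read both with `open(...)` and pass the handles in place of the toy lists.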


This is a really big help! Thanks a lot! I will try it for sure.


The script is working like a charm, but what can I do about the missing KO numbers? Do you have any ideas? A lot of proteins that got a KO number are missing from the collapsed list. How is this possible?


A similar issue was raised and addressed in this post. It seems that, because they are poorly characterised, these KOs are not assigned to any pathway, so you will not find them in the collapsed list. Check this comment.
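
One possible way to still attach a description to such KOs is to take it from the full KO list rather than from the pathway listing. The Python sketch below is an assumption, not something taken from the linked post: it presumes that the KEGG REST endpoint `https://rest.kegg.jp/list/ko` returns one tab-separated `KO<TAB>names; definition` line per entry, and all function names are hypothetical.

```python
import urllib.request

# Assumed endpoint; check the KEGG API documentation before relying on it.
KO_LIST_URL = "https://rest.kegg.jp/list/ko"


def fetch_ko_list(url=KO_LIST_URL):
    """Download the KO list as text (needs network access)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def parse_ko_list(text):
    """Map KO number -> definition from 'K00001<TAB>names; definition' lines."""
    descriptions = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue
        ko, rest = line.split("\t", 1)
        if ko.startswith("ko:"):  # tolerate a prefixed identifier
            ko = ko[3:]
        # Keep only the definition after the 'names;' part, if there is one.
        _, sep, definition = rest.partition("; ")
        descriptions[ko] = definition if sep else rest
    return descriptions


def annotate(query_lines, descriptions):
    """Yield 'transcript<TAB>KO<TAB>description' for every line with a KO."""
    for line in query_lines:
        fields = line.split()
        if len(fields) >= 2:
            description = descriptions.get(fields[1], "NA")
            yield "\t".join((fields[0], fields[1], description))
```

A typical call would be `annotate(open("query.ko"), parse_ko_list(fetch_ko_list()))`. KOs that are not in the downloaded list are written with `NA` so that no transcript with a KO number is silently dropped.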
