Hi all,
I'm annotating whole-genome sequencing (WGS) data from a strain using Prokka, and I got the .gff and .faa (protein FASTA file of the translated CDS sequences) files from it. I'm not sure whether what I'm doing is right...
So, many of the entries Prokka annotates as "hypothetical protein" basically mean "well, I predict this is a protein, but I can't tell what it is exactly from my databases", right? Then, if I map the amino acid sequences to KEGG using BlastKOALA and get specific pathway and protein annotations, is that because those hypothetical proteins actually do have identified functions and names in the database KEGG uses?
I'd like to be able to answer the question: "If you are mapping hypothetical proteins, how do you know they are involved in different KEGG pathways, and are they all actually annotated?"
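To make the question concrete, here is a rough sketch of the cross-check I have in mind: list the entries Prokka called "hypothetical protein" that still received a KO assignment. It assumes the .faa headers look like `>LOCUS_TAG product` and that the BlastKOALA result is a tab-separated list of query ID and K number (empty when nothing was assigned). The locus tags, K numbers, and function names below are made-up placeholders, not real results.

```python
# Sketch only: the input formats are assumptions (see the note above).

def parse_faa_products(faa_text):
    """Return {locus_tag: product} from the FASTA header lines."""
    products = {}
    for line in faa_text.splitlines():
        if line.startswith(">"):
            # Header assumed to be ">LOCUS_TAG product description"
            locus_tag, _, product = line[1:].strip().partition(" ")
            products[locus_tag] = product
    return products


def parse_ko_assignments(ko_text):
    """Return {locus_tag: K number}, skipping queries with no assignment."""
    assignments = {}
    for line in ko_text.splitlines():
        fields = line.split("\t")
        if len(fields) >= 2 and fields[1].strip():
            assignments[fields[0].strip()] = fields[1].strip()
    return assignments


def hypothetical_with_ko(products, assignments):
    """Entries labelled 'hypothetical protein' that still got a KO."""
    return {
        tag: assignments[tag]
        for tag, product in products.items()
        if product == "hypothetical protein" and tag in assignments
    }


# Placeholder data standing in for the real .faa and KO list
example_faa = """>DEMO_00001 hypothetical protein
MKTAYIAKQR
>DEMO_00002 DNA gyrase subunit A
MSDLAREITP
>DEMO_00003 hypothetical protein
MVLSPADKTN
"""
example_ko = "DEMO_00001\tK00001\nDEMO_00002\tK02469\nDEMO_00003\n"

products = parse_faa_products(example_faa)
assignments = parse_ko_assignments(example_ko)
rescued = hypothetical_with_ko(products, assignments)
print(rescued)
```

In this toy case, DEMO_00001 is "hypothetical" according to Prokka but still has a K number, while DEMO_00003 has none in either. My question is about how to interpret the first kind of entry.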
Thank you in advance :)