I have assembled a transcriptome using Trinity and clustered it with CD-HIT at a 0.95 identity threshold to reduce redundancy.
For the clustered transcripts, I ran blastx against the nr database (restricted to the Basidiomycota taxon because of limited computational power). I then imported the BLAST results into Blast2GO to retrieve GO terms.
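For reference, the clustering and blastx steps look roughly like the sketch below. It only assembles and prints the command lines. The file names are placeholders, `cd-hit-est` is assumed because the input is nucleotide, and restricting by `-taxids 5204` (Basidiomycota) is just one way to limit the search: it needs a recent BLAST+ with a v5 nr database, and older releases may need an expanded species-level taxid list instead.

```python
import shlex

# Placeholder file names (assumptions for illustration only).
assembly = "Trinity.fasta"
clustered = "Trinity.cdhit95.fasta"

# Redundancy reduction at 95% identity; cd-hit-est is the nucleotide
# variant of CD-HIT (assumed here, since the input is transcripts).
cdhit_cmd = ["cd-hit-est", "-i", assembly, "-o", clustered, "-c", "0.95"]

# blastx of the clustered transcripts against nr, limited to
# Basidiomycota (NCBI taxid 5204). XML output (-outfmt 5) is used
# because that is a format Blast2GO can import.
blastx_cmd = [
    "blastx",
    "-query", clustered,
    "-db", "nr",
    "-taxids", "5204",
    "-outfmt", "5",
    "-out", "clustered_vs_nr.xml",
]

# Print the commands instead of running them.
for cmd in (cdhit_cmd, blastx_cmd):
    print(shlex.join(cmd))
```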
Using the same clustered transcripts, I ran TransDecoder (default settings) to predict protein sequences. The predicted proteins (the TransDecoder output) were then used as queries at http://weizhong-lab.ucsd.edu/metagenomic-analysis/server/cog/ and http://www.kegg.jp/blastkoala/ for COG and KEGG functional annotation, respectively.
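The TransDecoder step with default settings is the usual two-command run, sketched below as printed command lines. The input file name is a placeholder, and the `.transdecoder.pep` file is the predicted-protein FASTA that gets uploaded to the COG and BlastKOALA servers.

```python
import shlex

# Placeholder name for the CD-HIT-clustered transcripts (assumption).
transcripts = "Trinity.cdhit95.fasta"

# Step 1: extract long open reading frames from the transcripts.
longorfs_cmd = ["TransDecoder.LongOrfs", "-t", transcripts]

# Step 2: predict the likely coding regions among those ORFs.
predict_cmd = ["TransDecoder.Predict", "-t", transcripts]

# TransDecoder names its peptide output after the input file.
peptides = transcripts + ".transdecoder.pep"

# Print the commands instead of running them.
for cmd in (longorfs_cmd, predict_cmd):
    print(shlex.join(cmd))
print("protein queries for COG / BlastKOALA:", peptides)
```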
I have seen others run blastx against both the COG and KEGG databases, so I am not sure whether my different approach is valid.
Am I heading in the right direction?