I would like to know the best strategy for obtaining the largest number of GO terms for the bacterial proteome I'm working on. Since it's a non-model organism, I will build the GO database from scratch.
By BLASTing against the bacterial nr protein database (retaining the first 20 hits), I obtained GO annotations for 60% of the proteome, but some of the terms are very general. I obtained the same results with the BLAST2GO InterPro mapping.
I've been thinking of BLASTing against both the UniProt and nr databases and merging the results. Also, I would like to know how many hits I should retain from the BLAST searches.
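
To make the merging idea concrete, here is a minimal sketch of what I have in mind. It assumes each search has already been reduced to a mapping from protein ID to a set of GO terms; the function name `merge_go_annotations`, the protein IDs, and the example term assignments are all hypothetical and only for illustration.

```python
from collections import defaultdict


def merge_go_annotations(*sources):
    """Union the GO term sets per protein across any number of sources.

    Each source is assumed to be a dict: protein ID -> set of GO term IDs.
    """
    merged = defaultdict(set)
    for source in sources:
        for protein_id, go_terms in source.items():
            merged[protein_id].update(go_terms)
    return dict(merged)


# Hypothetical example data standing in for parsed nr and UniProt results.
nr_hits = {
    "prot_001": {"GO:0005524"},
    "prot_002": {"GO:0003677"},
}
uniprot_hits = {
    "prot_001": {"GO:0005524", "GO:0016887"},
    "prot_003": {"GO:0016020"},
}

merged = merge_go_annotations(nr_hits, uniprot_hits)
for protein_id in sorted(merged):
    print(protein_id, sorted(merged[protein_id]))
```

A plain union like this maximizes the number of terms per protein, but it would not by itself remove the very general terms mentioned above.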