I was wondering whether Bowtie, BWA, etc. can map nucleotide reads to a protein reference database, or are they simply DNA aligners? I found a tool called PAUDA that might be useful, but has anyone here used it before?
Bowtie2, BWA, etc. only do DNA-to-DNA alignment. I don't know PAUDA, but judging from its documentation, it sounds reasonable.
Update: I thought about reverse translation a bit more and would like to revise my original statement: it is probably not a good idea ;)
(My idea would be to convert the proteins to pseudo-transcripts by reverse-translating them to DNA and then try a standard mapper. But of course, there are ambiguity issues regarding the genetic code: most amino acids are encoded by several codons, so a protein does not correspond to a single DNA sequence. Still, a sensitive mapper, for example bwa mem, could work.)
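To get a feel for how large the ambiguity problem is, here is a minimal Python sketch that counts how many distinct DNA sequences encode a given peptide. It assumes the standard genetic code; the function name `count_encodings` and the example peptides are made up for illustration.

```python
from collections import Counter
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

# Number of codons that encode each amino acid (e.g. 6 for L, 1 for M).
CODONS_PER_AA = Counter(AMINO)


def count_encodings(peptide: str) -> int:
    """Return how many DNA sequences translate to `peptide`.

    Hypothetical helper: multiplies the codon counts of each residue.
    Residues that are not in the standard code contribute a factor of 0.
    """
    return prod(CODONS_PER_AA[aa] for aa in peptide.upper())


if __name__ == "__main__":
    print(count_encodings("MW"))    # 1: Met and Trp each have one codon
    print(count_encodings("LSR"))   # 216: Leu, Ser and Arg have six codons each
```

Even a three-residue peptide can have hundreds of possible encodings, so any single reverse-translated pseudo-transcript will mismatch real reads at many third-codon positions.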
The RTG metagenomics tools include a command called mapx, which we developed for use on the HMP project. It is analogous to (but orders of magnitude faster than) blastx: it internally translates the DNA reads into amino acids in each possible reading frame and performs protein alignment against your protein database. This includes support for protein scoring matrices such as BLOSUM, which the reverse-translation approach would not permit.
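For readers unfamiliar with the idea, translated search can be sketched in a few lines of Python. This is only an illustration of six-frame translation under the standard genetic code, not how mapx or blastx are implemented; the helper names (`reverse_complement`, `translate`, `six_frames`) and the example read are invented for this sketch.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons become X."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))


def six_frames(read: str) -> dict:
    """Translate a read in three forward and three reverse frames."""
    fwd = read.upper()
    rev = reverse_complement(fwd)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(fwd[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames


if __name__ == "__main__":
    for name, protein in six_frames("ATGGCCAAA").items():
        print(name, protein)
```

Each of the six translated peptides would then be aligned against the protein database with a protein scoring matrix, which is what makes this approach tolerant of synonymous codon differences.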