Hello, I need to do a taxonomic classification of viral metagenomic sequences, so I installed Kaiju: http://kaiju.binf.ku.dk/
Kaiju provides several standard databases, one of them corresponding to viruses from the NCBI RefSeq database (0.49M sequences). The issue is that with that database I get a really low classification rate (only 2.95% of the paired-end reads were classified). As I'm dealing with viral genomic data, I think it may be necessary to let Kaiju accept more mismatches, given the relatively high mutation rates of viruses (10^-8 to 10^-6 for DNA viruses and 10^-6 to 10^-4 for RNA viruses). By default Kaiju allows 3 mismatches, so should I increase the number of allowed mismatches? If so, to what value?
The other possible reason for the low classification rate is the database I used. Looking at the problem from that angle, would you recommend a different viral database?
Thanks for reading :)