Hey everyone,
Only Greengenes does species-level taxonomic classification. The other custom 16S databases for Kraken2 (SILVA and RDP) don't, because they were prepared for genus-level classification. I want to work with SILVA. Greengenes does the job, but with a low percentage assigned to each taxon. I have tried to fix that. I even tried to build the taxonomy from the SILVA 138.2 release files on their website: https://www.arb-silva.de/no_cache/download/archive/release_138_2/ . Hint: check tax_slv_ssu_138.2.txt; it only goes down to genus level. But you can see from the ">" headers that the library FASTA file, SILVA_138.2_SSURef_NR99_tax_silva.fasta, does contain species-level names.
I'm new to this area, so I don't have enough knowledge to fix it. Has anyone else had this problem? Did you manage to solve it? How can I solve it?
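For anyone who wants to check those headers themselves, here is a minimal Python sketch. It assumes each header looks like ">ACCESSION.START.STOP rank1;rank2;...;Genus;Organism name", with the last semicolon-separated field being the organism name. The function names and the sample header in the comment are made up for illustration and are not part of SILVA or Kraken2.

```python
from typing import Iterator, List, NamedTuple


class SilvaHeader(NamedTuple):
    accession: str      # e.g. "XX000000.1.1500" (accession.start.stop)
    lineage: List[str]  # every field except the last one
    organism: str       # last field, often a species-level name


def parse_silva_header(line: str) -> SilvaHeader:
    """Split one FASTA header line into accession, lineage and organism.

    Assumed layout (not verified against every release):
    >XX000000.1.1500 Bacteria;...;Streptococcus;Streptococcus mutans
    """
    if not line.startswith(">"):
        raise ValueError("not a FASTA header line")
    # The accession ends at the first space; the rest is the taxonomy path.
    accession, _, path = line[1:].strip().partition(" ")
    fields = [f.strip() for f in path.split(";") if f.strip()]
    if not fields:
        raise ValueError("header has no taxonomy path")
    return SilvaHeader(accession, fields[:-1], fields[-1])


def iter_headers(fasta_path: str) -> Iterator[SilvaHeader]:
    """Yield a parsed header for every record in a FASTA file."""
    with open(fasta_path) as handle:
        for line in handle:
            if line.startswith(">"):
                yield parse_silva_header(line)
```

Looping over iter_headers("SILVA_138.2_SSURef_NR99_tax_silva.fasta") and counting the distinct organism values gives a quick idea of how much species-level information is in the file. Keep in mind that many entries have names like "uncultured bacterium", which are not real species.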
Most of them are for shotgun data. For 16S there are three databases, as I said. They don't do species-level classification as well as the shotgun ones. This is why I tried to create a custom database...
Can you just use nt? Or grab the species-level info from nt for the sequences you get hits for in the 16S databases, and remap?
I would expect the 16S databases to be just a subset of nt, but maybe that thought is naive.
I have tried. The problem is in how SILVA names and codes the sequences. One of their files, which comes from NCBI, has the codes and the species-level names. But what they chose to do is genus level. I also tried changing the genus-level naming file by adding species-level sequences with new codes like CODE.1, CODE.2, CODE.3 for "Genus;xxxx species". It didn't work.
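One thing that might be worth trying instead of CODE.1-style codes is giving each species its own fresh integer ID with the genus as its parent. This is only a guess at the cause: as far as I know, Kraken2 reads an NCBI-style taxonomy with integer IDs, but I have not verified that this is why the edit above failed. A minimal Python sketch of the ID assignment follows. The function name, the input shapes, and the idea of keying genera by their lineage tuple are all my own assumptions; writing the result into the files Kraken2 actually reads is not covered here.

```python
from typing import Dict, Iterable, List, Tuple

Lineage = Tuple[str, ...]


def add_species_nodes(
    genus_ids: Dict[Lineage, int],
    records: Iterable[Tuple[str, Lineage, str]],
) -> Tuple[List[Tuple[int, int, str]], Dict[str, int]]:
    """Give every (genus lineage, organism) pair a new integer ID.

    genus_ids: lineage down to genus -> existing ID (hypothetical input,
               e.g. derived from tax_slv_ssu_138.2.txt).
    records:   (accession, lineage, organism) per sequence, e.g. taken
               from the FASTA headers.
    Returns the new nodes as (new_id, parent_id, name) and a mapping
    from sequence accession to its new species-level ID.
    """
    # Start above the largest existing ID so nothing collides.
    next_id = max(genus_ids.values(), default=0) + 1
    species_ids: Dict[Tuple[Lineage, str], int] = {}
    nodes: List[Tuple[int, int, str]] = []
    seq_to_id: Dict[str, int] = {}

    for accession, lineage, organism in records:
        parent = genus_ids.get(tuple(lineage))
        if parent is None:
            # Lineage does not end at a known genus; leave it out here.
            continue
        key = (tuple(lineage), organism)
        if key not in species_ids:
            species_ids[key] = next_id
            nodes.append((next_id, parent, organism))
            next_id += 1
        seq_to_id[accession] = species_ids[key]

    return nodes, seq_to_id
```

Sequences whose organism field is something like "uncultured bacterium" would still get a node under their genus with this sketch, so you would probably want to filter those out first and leave them at genus level.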