Does anyone know how to work with all the taxonomy ranks of the SILVA SSU Parc database?
I randomly grabbed some headers from the FASTA file; there is probably a better example, but these will do:
>GAJX01000014.610.3074 Eukaryota;Opisthokonta;Holozoa;Metazoa (Animalia);Eumetazoa;Bilateria;Arthropoda;Hexapoda;Insecta;Pterygota;Clavigralla tomentosicollis
>GEGR01000582.223.3660 Eukaryota;Opisthokonta;Nucletmycea;Fungi;Dikarya;Ascomycota;Pezizomycotina;Sordariomycetes;Xylariales;Arthrinium malaysianum
>GBAD01033107.4974.7983 Bacteria;Cyanobacteria;Oxyphotobacteria;Chloroplast;Boechera gunnisoniana
>GBAD01033114.4974.7822 Bacteria;Cyanobacteria;Oxyphotobacteria;Chloroplast;Boechera gunnisoniana
>GBAD01033121.14.3154 Bacteria;Proteobacteria;Alphaproteobacteria;Rickettsiales;Mitochondria;Boechera gunnisoniana
I am only interested in 16S sequences from bacteria, but a solution that also works for Eukaryota would be a plus. The problem is that I cannot map the entries in a header to the ranks kingdom, phylum, class, order, family, genus, species, in that order. The last header in my example contains 6 entries, but the other bacterial ones contain only 5. The order is missing for GBAD01033114.4974.7822, and I can't fill it in automatically because the 4th entry is not always the order.
I already tried using the NCBI taxonomy instead of SILVA's, but often a sequence with the species "uncultured bacteria" has a genus, family, etc. in SILVA while the NCBI taxonomy has no lineage for it at all. So that solution did not work.
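To make the problem concrete, here is a minimal Python sketch that splits headers like the ones above into their parts and shows how the lineage depth varies. The function name `parse_silva_header` is made up for illustration. It assumes the ID has the form accession.start.stop and that the last semicolon-separated entry is the organism name, which holds for these examples but has not been checked against the whole file.

```python
def parse_silva_header(header):
    """Split one FASTA header into accession, coordinates, lineage and organism.

    Assumes the ID looks like <accession>.<start>.<stop> and that the last
    entry of the semicolon-separated path is the organism name.
    """
    # The ID is everything up to the first space; the rest is the taxonomy path.
    seq_id, _, path = header.lstrip(">").strip().partition(" ")
    # Split from the right so the accession itself is kept whole.
    accession, start, stop = seq_id.rsplit(".", 2)
    names = [name.strip() for name in path.split(";") if name.strip()]
    return {
        "accession": accession,
        "start": int(start),
        "stop": int(stop),
        "lineage": names[:-1],                      # levels above the organism
        "organism": names[-1] if names else None,   # assumed to be the last entry
    }


headers = [
    ">GBAD01033107.4974.7983 Bacteria;Cyanobacteria;Oxyphotobacteria;Chloroplast;Boechera gunnisoniana",
    ">GBAD01033121.14.3154 Bacteria;Proteobacteria;Alphaproteobacteria;Rickettsiales;Mitochondria;Boechera gunnisoniana",
]

for header in headers:
    record = parse_silva_header(header)
    # Print the accession, the number of lineage levels and the organism name.
    print(record["accession"], len(record["lineage"]), record["organism"])
```

The chloroplast entry comes out with four levels above the organism name and the mitochondrial entry with five, so the position in the list alone cannot tell you the rank. Assigning ranks would need a separate lookup from taxon name (or full path) to rank, which this sketch does not attempt.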
How do others use this database, only looking at species level?
Here you can find the file for MEGAN, but it is for the ref_nr99 release: http://ab.inf.uni-tuebingen.de/data/software/megan6/download/welcome.html. I will go check out the file you mentioned.
Cool, thanks too.
Looks like the file that I need! Thanks!