How Does One Decide Which Bacterial Taxonomic Classification To Use: RDP, Silva Or Greengenes?
9.5 years ago
ARich ▴ 100

Hello Biostar,

Can anyone tell me what the major differences are between the Silva, RDP and Greengenes taxonomy files except their implementation? And which one classifies better, and why?

Thanks ARich

What do you mean by "taxonomy files except their implementation"? Are you talking about the structure of the files themselves or how the databases define their taxonomy?

Hello, thanks for the reply. The question is simple: why prefer one over the others? Does the structure vary in a way that makes one classify better than the others? If yes, which one, and why? I have just started working on the microbiome and I'm confused about which one to use for classification.

I have edited the title, you may also want to edit the body - your sentence "except their implementation" is very confusing. Just keep it simple.

I think this question requires a little more care from you, ARich, given that an answer might take a considerable amount of work.

9.5 years ago
satanicodr ▴ 160

RDP uses the Bergey's taxonomy, which is more conservative and standard. Silva and Greengenes use their own taxonomies, which are developed by their own teams and collaborators. For some groups, collaborations have improved the taxonomy and may help you if you are working with a specific group of organisms. E.g. Silva has a nice eukaryotic taxonomy. Another key difference is that with RDP the lowest taxonomy level is genus, whereas with the other two you can go to species and strain. Having said that, I would not have a lot of confidence working at strain level with a 100 bp read. If you are working with the human microbiome, all three of them will do a good job, since human-related microorganisms are better studied than some rare phyla.

It is possible to map the silva entries to the standard taxonomic classification/ids used by NCBI and EBI. In the archive section, for each release there is a directory called "taxonomy". In here you can find a file that contains the name "taxmap_embl". This file tells you for each silva id the corresponding taxid according to NCBI/EBI.
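As a rough illustration of how such a mapping file could be used, here is a minimal Python sketch. It assumes the taxmap file is tab-separated with a header row, and that the columns are named `primaryAccession`, `start`, `stop` and `taxid`; those column names, the function name `load_taxmap` and the sample rows are assumptions for illustration, so check them against the file in the release you download.

```python
import csv
import io

def load_taxmap(handle, acc_col="primaryAccession", start_col="start",
                stop_col="stop", taxid_col="taxid"):
    """Build a dict from Silva-style id (accession.start.stop) to taxid.

    The column names are assumptions about the file layout; pass
    different names if the header of your file differs.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    mapping = {}
    for row in reader:
        silva_id = f"{row[acc_col]}.{row[start_col]}.{row[stop_col]}"
        mapping[silva_id] = int(row[taxid_col])
    return mapping

# Made-up rows in the assumed layout, only to show the call.
sample = io.StringIO(
    "primaryAccession\tstart\tstop\tpath\torganism_name\ttaxid\n"
    "AB000001\t1\t1500\tBacteria;Examplia;\tExample organism\t12345\n"
    "AB000002\t10\t1450\tBacteria;Examplia;\tAnother organism\t67890\n"
)
taxmap = load_taxmap(sample)
print(taxmap.get("AB000001.1.1500"))
```

For a real file you would pass an open file handle instead of the `io.StringIO` sample (for a gzipped file, `gzip.open(path, "rt")`).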

Hi crespialba, would you mind clarifying where that taxmap_embl file is? I couldn't find it in either the Silva or the NCBI databases.

9.5 years ago
cts ★ 1.7k

Just to add to satanicodr's answer, one of the main differences between Greengenes and Silva is the way in which the tree is built. Greengenes construct a de novo tree; Silva use a seed tree and add extra sequences into it parsimoniously. There is a tradeoff between these two methods: a de novo tree should give the most accurate topology for the sequence data, but it is more sensitive to poor-quality sequences and chimeras.

Differences in taxonomy arise from differences in the topology of the tree, which cause sequences to be grouped differently into monophyletic groups. This is compounded by regions of the tree that are known only from environmental sequences, where the names are up to whoever is curating the tree and naming these groups. I know that the people behind both Greengenes and Silva are working on standardising the naming in these instances.

Either Greengenes or Silva will serve you well, but I use Greengenes because I prefer the de novo tree construction (and because one of my PhD advisors is the curator).

7.4 years ago
-_- ★ 1.0k

I have found many erroneous records with the wrong taxonomy in the SILVA database, and I don't feel it is of very high quality. The author also told me there is no way to systematically remove those entries yet. The library location is http://www.arb-silva.de/fileadmin/silva_databases/release_119_1/Exports/SILVA_119.1_SSURef_Nr99_tax_silva.fasta.gz. Two example headers, where the genus in the taxonomy path disagrees with the organism name:

>EU661378.1.1449 Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;Klebsiella pneumoniae
>HM461153.1.1464 Bacteria;Proteobacteria;Gammaproteobacteria;Pseudomonadales;Pseudomonadaceae;Pseudomonas;Enterobacter sp. enrichment culture clone HSL30
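One crude way to look for records like these is to compare the last rank in the taxonomy path with the first word of the organism name. The sketch below is only a heuristic, not anything SILVA provides: it assumes the header layout shown above (accession, a space, then a semicolon-separated path ending in the organism name), assumes the last rank before the name is the genus, and skips names starting with words such as "uncultured". The function names and the skip list are my own choices.

```python
def parse_silva_header(header):
    """Split a header into (accession, list of ranks, organism name)."""
    accession, _, lineage = header.lstrip(">").strip().partition(" ")
    parts = [p.strip() for p in lineage.split(";") if p.strip()]
    organism = parts[-1] if parts else ""
    return accession, parts[:-1], organism

# Name prefixes that carry no genus information (an assumed, incomplete list).
SKIP_PREFIXES = ("uncultured", "unidentified", "metagenome")

def genus_mismatch(header):
    """True if the first word of the organism name differs from the last rank."""
    _, ranks, organism = parse_silva_header(header)
    if not ranks or not organism:
        return False
    if organism.lower().startswith(SKIP_PREFIXES):
        return False
    return ranks[-1].lower() != organism.split()[0].lower()

headers = [
    ">EU661378.1.1449 Bacteria;Proteobacteria;Gammaproteobacteria;"
    "Pseudomonadales;Pseudomonadaceae;Pseudomonas;Klebsiella pneumoniae",
    ">HM461153.1.1464 Bacteria;Proteobacteria;Gammaproteobacteria;"
    "Pseudomonadales;Pseudomonadaceae;Pseudomonas;"
    "Enterobacter sp. enrichment culture clone HSL30",
]
flagged = [parse_silva_header(h)[0] for h in headers if genus_mismatch(h)]
print(flagged)  # ['EU661378.1.1449', 'HM461153.1.1464']
```

This will produce false positives wherever the path does not end at genus level or the organism name uses an older synonym, so treat the output as a list of records to inspect by hand rather than a list of errors.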