Question: Diamond results non-specific compared to NCBI Web
17 months ago by
gwrathe0 wrote:


I recently downloaded the nr database from NCBI and set it up for Diamond. I then ran my sequences through it with the taxonomy options, using the following commands:

diamond makedb --in nr.gz --taxonmap prot.accession2taxid.gz --taxonnodes nodes.dmp -d nr
diamond blastp -d /srv/scratch/nrDatabase/nr.dmnd -q COG0202.faa --more-sensitive -o matchesCOG0202 -f 102 --id 50 --query-cover 80 -b 25

A significant portion of my sequences were returned with the NCBI Taxonomy ID '2', for Bacteria. When I run those same sequences through NCBI Web Blastp, they are returned with very specific hits, such as 'Deltaproteobacteria bacterium HGW-Deltaproteobacteria-15'. Why would Diamond give me useless results when NCBI Web gives me specific and useful results, especially when they use the same database?

Thank you in advance for any help!

blastp diamond blast ncbi • 507 views
17 months ago by
buchfink140 wrote:

Diamond uses the LCA (lowest common ancestor) algorithm for taxonomic classification, which means that the assignment is based not only on the top hit but on all hits whose score is within 10% of the best score. This can often lead to unspecific assignments. To get a more specific assignment, use the --top parameter with a lower number; e.g. --top 0 would use only the best hit for the taxonomy assignment.
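
To illustrate why this happens, here is a minimal Python sketch of LCA over a score window. It is not Diamond's actual implementation: the toy taxonomy, the taxon IDs other than 1 (root) and 2 (Bacteria), the function names, and the exact cutoff formula are all assumptions made for the example.

```python
# Toy taxonomy: child taxon ID -> parent taxon ID.
# Only 1 (root) and 2 (Bacteria) are real NCBI IDs; the rest are hypothetical.
PARENT = {
    2: 1,      # Bacteria -> root
    100: 2,    # hypothetical phylum A -> Bacteria
    200: 2,    # hypothetical phylum B -> Bacteria
    101: 100,  # hypothetical species in phylum A
    201: 200,  # hypothetical species in phylum B
}


def lineage(taxid):
    """Return the path from taxid up to the root, inclusive."""
    path = [taxid]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def lca(taxids):
    """Return the lowest common ancestor of a non-empty list of taxon IDs."""
    common = set(lineage(taxids[0]))
    for taxid in taxids[1:]:
        common &= set(lineage(taxid))
    # Walk up from the first taxon; the first shared node is the deepest one.
    for node in lineage(taxids[0]):
        if node in common:
            return node


def classify(hits, top_percent):
    """Assign a taxon from (taxid, score) hits within top_percent of the best.

    The cutoff formula below is an assumed approximation of a
    'within N% of the best score' window.
    """
    best = max(score for _, score in hits)
    cutoff = best * (1 - top_percent / 100)
    kept = [taxid for taxid, score in hits if score >= cutoff]
    return lca(kept)


# Two near-equal hits from different phyla, plus one weaker hit.
hits = [(101, 500.0), (201, 470.0), (101, 300.0)]
print(classify(hits, 10))  # both strong hits kept, so the LCA is 2 (Bacteria)
print(classify(hits, 0))   # only the best hit kept, so the result is 101
```

With a 10% window, the second hit from a different phylum pulls the assignment all the way up to Bacteria, which matches the behaviour described in the question. Note that in this sketch, hits exactly tied for the best score would still be combined even with a window of 0.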


Thank you buchfink! Much appreciated.

written 16 months ago by gwrathe0