Question: Diamond results non-specific compared to NCBI Web
gwrathe wrote (14 days ago):

Hello,

I recently downloaded the nr database from NCBI and set it up for Diamond with taxonomic information. I then ran my sequences against it using the following commands:

diamond makedb --in nr.gz --taxonmap prot.accession2taxid.gz --taxonnodes nodes.dmp -d nr
diamond blastp -d /srv/scratch/nrDatabase/nr.dmnd -q COG0202.faa --more-sensitive -o matchesCOG0202 -f 102 --id 50 --query-cover 80 -b 25

A significant portion of my sequences were returned with the NCBI Taxonomy ID '2', for Bacteria. When I run those same sequences through NCBI Web Blastp, they are returned with very specific hits, such as 'Deltaproteobacteria bacterium HGW-Deltaproteobacteria-15'. Why would Diamond give me useless results when NCBI Web gives me specific and useful results, especially when both use the same database?

Thank you in advance for any help!

buchfink wrote (14 days ago):

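Diamond uses the LCA (lowest common ancestor) algorithm for taxonomic classification, which means that the assignment is based not only on the top hit but on all hits scoring within a 10% range of the best score. This can often lead to unspecific assignments: if those hits come from distantly related taxa, their common ancestor can be as high up as Bacteria (taxon ID 2). To get more specific assignments, use the --top parameter with a lower number, e.g. --top 0 would use only the best hit for the taxonomy assignment.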
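
To make the effect of the score range concrete, here is a small Python sketch of an LCA assignment over hits within a percentage of the best score. It is a toy illustration, not Diamond's actual implementation: the function names, the tiny taxonomy tree, the example scores, and the exact cutoff formula are all assumptions made for this example. Only taxon IDs 1 (root) and 2 (Bacteria) correspond to real NCBI IDs.

```python
# Toy illustration only; this is NOT Diamond's implementation.
# PARENT maps each taxon ID to its parent. IDs 1 (root) and 2 (Bacteria)
# are real NCBI IDs; 100, 101, 200, 201 are made up for the example.
PARENT = {1: 1, 2: 1, 100: 2, 200: 2, 101: 100, 201: 200}


def lineage(taxid, parent):
    """Return the path from taxid up to the root, inclusive."""
    path = [taxid]
    while parent[taxid] != taxid:
        taxid = parent[taxid]
        path.append(taxid)
    return path


def lca(taxids, parent):
    """Lowest common ancestor of a non-empty list of taxon IDs."""
    common = set(lineage(taxids[0], parent))
    for t in taxids[1:]:
        common &= set(lineage(t, parent))
    # Walking up from the first taxon, the first shared node is the deepest.
    for node in lineage(taxids[0], parent):
        if node in common:
            return node


def classify(hits, parent, top_percent):
    """hits: list of (taxid, bit_score) pairs.

    Keep hits whose score is at most top_percent lower than the best
    score (assumed cutoff formula), then return the LCA of their taxa.
    """
    best = max(score for _, score in hits)
    cutoff = best * (1 - top_percent / 100)
    kept = [taxid for taxid, score in hits if score >= cutoff]
    return lca(kept, parent)


# Two hits from different branches below Bacteria, with similar scores.
hits = [(101, 500.0), (201, 470.0)]
print(classify(hits, PARENT, 10))  # both hits kept -> 2 (Bacteria)
print(classify(hits, PARENT, 0))   # only the best hit kept -> 101
```

With a 10% range both hits are kept and the only ancestor they share below the root is Bacteria, which mirrors the taxon ID 2 results in the question. With a range of 0 only the best-scoring hit contributes, so the assignment is that hit's own taxon.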


Thank you buchfink! Much appreciated.

(reply by gwrathe, 13 days ago)