USEARCH and Vmatch producing different numbers of hits
9.4 years ago
bioinfo ▴ 840

Hi, I ran Vmatch and USEARCH on the same NGS dataset (Illumina 101 bp paired-end reads) against a local database of protein sequences and noticed a large difference in the number of hits they report.

I used the following commands:

VMATCH

vmatch -showdesc 0 -dnavsprot 11 -l 20 -identity 90 -h 2 -q input.fasta dbs/database > vmatch.out

I then kept only the best match for each read. The database index was built with Vmatch's mkvtree.

USEARCH

usearch -usearch_global input.fasta -db db.udb -id 0.90 -threads 16 -blast6out usearch.out -maxaccepts 1
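For anyone reproducing the raw hit counts, here is a minimal sketch of how they could be tallied from the usearch.out file written by -blast6out. It assumes the usual 12-column tab-separated layout with the target label in the second column; the function name and the example rows are invented for illustration and are not taken from my data.

```python
from collections import Counter


def count_hits_per_target(lines):
    """Count alignment rows per target label (column 2) in blast6out-style lines."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            # Skip blank or malformed lines instead of guessing at their layout.
            continue
        counts[fields[1]] += 1
    return counts


# Hypothetical rows: query, target, %id, aln length, mismatches, gap opens,
# qstart, qend, tstart, tend, evalue, bit score.
example_rows = [
    "read1\tISCR2\t96.9\t33\t1\t0\t1\t99\t10\t42\t1e-10\t70.1",
    "read2\tISCR2\t93.9\t33\t2\t0\t1\t99\t50\t82\t1e-09\t66.2",
    "read3\tintI9\t100.0\t33\t0\t0\t1\t99\t5\t37\t1e-12\t75.0",
    "",
]

hit_counts = count_hits_per_target(example_rows)
for target, n in sorted(hit_counts.items()):
    print(f"{target}\t{n}")
```

With -maxaccepts 1 each read should contribute at most one row, so the per-target tally is also a per-read tally. If the database labels carry more than the gene name, they would need to be mapped to genes before counting.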

As the commands show, in Vmatch I allowed at most 2 mismatches (-h 2), whereas in USEARCH I accepted hits down to 90% identity (-id 0.90). Even so, why is there such a large difference in the results?

OUTPUTS (all hits). Columns: gene name [TAB] normalised count [TAB] raw hit count

VMATCH OUTPUT

ISCR1   7.99383167412687        4054
ISCR14  1.74291497975711        861
ISCR2   50.7803527891896        23294
ISCR3   0.0991902834008098      49
ISCR4   2.81212121212114        1392
ISCR5   0.222003929273085       113
ISCR6   2.48478701825562        1225
ISCR7   0.0078740157480315      3
ISCR8   0.0854368932038835      44
OXA-10  0.00778210116731518     2
SHV     0.00699300699300699     2
aac(3)-IVa      0.00387596899224806     1
aph(3')-Ia      0.0260012252632179      7
catB2/catB3/catB10      0.109523809523809       23
catB9   0.0909090909090909      19
dfrA1/dfrA15    0.203821656050955       32
dfrA12/dfrA13/dfrA21/dfrA22/dfrA33      0.00606060606060606     1
dfrA16  0.00636942675159236     1
dfrA3   0.296296296296297       48
dfrA6/dfrA31    0.00636942675159236     1
dfrC    0.00628930817610063     1
intI1   0.00891090857708068     3
intI10  1.41562500000001        453
intI2   0.624615384615386       203
intI3   0.401734104046244       139
intI6   0.137704918032787       42
intI7   0.161716171617162       49
intI8   0.0379746835443038      12
intI9   0.912757686598001       330
sul2    0.00738007380073801     2
tet(36) 0.0015625       1
tet(39) 0.00253164556962025     1
tet(C)  0.0277777777777778      11
tet(M)  0.00312989045383412     2
tetB(P) 0.380368098159508       248
vanRN   0.00434782608695652     1
vat(C)  0.00471698113207547     1
vat(F)  0.298642533936652       66
vat(G)  0.0833333333333333      18

USEARCH OUTPUT

ISCR1   1.24366471734895        638
ISCR2   30.2932796346512        15002
ISCR3   0.0587044534412955      29
ISCR4   0.0565656565656565      28
ISCR6   0.00202839756592292     1
ISCR7   0.0026246719160105      1
ISCR8   0.0737864077669903      38
aph(3')-Ia      0.029520295202952       8
dfrA1/dfrA15    0.152866242038217       24
dfrA3   0.0123456790123457      2
dfrA6/dfrA31    0.00636942675159236     1
intI1   0.00593471810089021     2
intI10  0.209375        67
intI2   0.116923076923077       38
intI3   0.0173410404624277      6
intI6   0.00327868852459016     1
intI8   0.00949367088607595     3
intI9   0.602232901241181       218
tet(39) 0.00253164556962025     1
tet(C)  0.0227272727272727      9
tet(M)  0.00312989045383412     2
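
To make the difference easier to see, the two tables can be lined up gene by gene. The sketch below is only one possible way to do that: it uses a handful of rows copied from the outputs above, assumes each row is gene name, normalised count and raw hit count separated by whitespace, and the helper name parse_counts is made up for this example.

```python
def parse_counts(text):
    """Parse 'gene  normalised  raw' rows into {gene: (normalised, raw)}."""
    table = {}
    for line in text.strip().splitlines():
        # Split from the right so the gene name is kept whole.
        gene, norm, raw = line.rsplit(None, 2)
        table[gene] = (float(norm), int(raw))
    return table


# A few rows copied from the two outputs in this post.
vmatch_text = """
ISCR1   7.99383167412687        4054
ISCR14  1.74291497975711        861
ISCR2   50.7803527891896        23294
ISCR4   2.81212121212114        1392
tet(M)  0.00312989045383412     2
"""
usearch_text = """
ISCR1   1.24366471734895        638
ISCR2   30.2932796346512        15002
ISCR4   0.0565656565656565      28
tet(M)  0.00312989045383412     2
"""

vmatch = parse_counts(vmatch_text)
usearch = parse_counts(usearch_text)

shared = sorted(set(vmatch) & set(usearch))
only_vmatch = sorted(set(vmatch) - set(usearch))

# Raw hit counts side by side, with the Vmatch/USEARCH ratio.
for gene in shared:
    v_raw, u_raw = vmatch[gene][1], usearch[gene][1]
    print(f"{gene}\t{v_raw}\t{u_raw}\t{v_raw / u_raw:.1f}")
print("only in Vmatch:", ", ".join(only_vmatch))
```

Reading the full output files instead of the pasted strings would give the complete comparison, including the genes that only Vmatch reports.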
vmatch usearch blast NGS