Question: How is percentage of identity calculated?
13 days ago, doinelpierrot wrote:

Hello all,

I'm using Plastx (a faster equivalent of Blastx) to compare my 500k contigs to an NCBI database in order to check for suspected contamination from an unknown source. I would like to select only the results with a percentage of identity > 95%. However, there is no "percentage identity" column in the output file.

I do have some other information, such as HSP_identity, intensity and alignment, so I was wondering whether it is possible to calculate this percentage myself.

Thanks!

13 days ago, Mensur Dlakic (USA) wrote:

I have never used the program you refer to, so my answer is based only on the information you provided.

HSP_identity is likely what you want. HSP usually stands for high-scoring segment pair, which implies that Plastx is a local aligner like BLAST. If so, HSP identity may refer only to fragments of the whole alignment, in which case it may not be equivalent to global identity. Think of it this way: if there is a single HSP per sequence, it is probably safe to assume that HSP identity is the same as overall (global) identity. If there are multiple HSPs per sequence, that would require a second look.
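To see which hits would need that second look, one option is to count HSPs per query/subject pair. The following is only a minimal sketch: the field names `query` and `subject` and the example records are hypothetical, not taken from the Plastx output format, so they would need to be mapped onto whatever columns the real file contains.

```python
from collections import Counter


def pairs_with_multiple_hsps(hsps):
    """Return the (query, subject) pairs that are covered by more than one HSP.

    `hsps` is an iterable of dicts with hypothetical keys "query" and "subject".
    """
    counts = Counter((h["query"], h["subject"]) for h in hsps)
    return sorted(pair for pair, n in counts.items() if n > 1)


# Made-up example records, for illustration only.
example_hsps = [
    {"query": "contig_1", "subject": "protA"},
    {"query": "contig_1", "subject": "protA"},
    {"query": "contig_2", "subject": "protB"},
]

multi = pairs_with_multiple_hsps(example_hsps)
print(multi)
```

For pairs that do not appear in this list, the single HSP identity is the only identity value available for that hit.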

To make your life easier (although the search will likely be slower), you may want to consider using a global aligner. That way the identity obtained will reflect the whole alignment.


Thanks for your help! I'm a bit confused, since I specified -maxhsps 1 in my command and yet I sometimes get values > 100 in the HSP_identity column, especially when the HSP e-value is low.

EDIT: I found the answer to my question. In PLAST, percentage identity = HSP_identity / HSP_align_length (× 100 to express it as a percentage). The result is sometimes slightly different from BLAST because the computed alignment length can differ by up to 2 bp.
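As a rough illustration of that formula, the sketch below computes the percentage for each hit and keeps only those above 95%. The keys `HSP_identity` and `HSP_align_length` follow the names used in this thread, but the `query` key, the example values, and the way the rows are loaded are assumptions; the real Plastx output would have to be parsed into this shape first.

```python
def percent_identity(hsp_identity, hsp_align_length):
    """Identical positions divided by alignment length, expressed as a percentage."""
    if hsp_align_length <= 0:
        raise ValueError("alignment length must be positive")
    return 100.0 * hsp_identity / hsp_align_length


def filter_hits(rows, threshold=95.0):
    """Keep rows whose percent identity is strictly above `threshold`."""
    kept = []
    for row in rows:
        pid = percent_identity(int(row["HSP_identity"]), int(row["HSP_align_length"]))
        if pid > threshold:
            # Copy the row and attach the computed value.
            kept.append({**row, "percent_identity": pid})
    return kept


# Made-up example rows; values are strings, as they would be when read from a text file.
rows = [
    {"query": "contig_1", "HSP_identity": "120", "HSP_align_length": "122"},
    {"query": "contig_2", "HSP_identity": "90", "HSP_align_length": "110"},
]

hits = filter_hits(rows)
for hit in hits:
    print(hit["query"], round(hit["percent_identity"], 2))
```

Because the alignment length may differ slightly from the one BLAST reports, hits very close to the 95% cutoff could fall on either side of it depending on the tool.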

Reply by doinelpierrot, written 13 days ago, modified 10 days ago