blast culling_limit option behavior
4.9 years ago
erwan.scaon ▴ 830

I am running blastn with -culling_limit 1.

From the manual: "Delete a hit that is enveloped by at least this many higher-scoring hits."

My understanding: the culling limit can be used to remove redundant hits. In practice, it sets the number of hits returned per subject sequence.

The command line:

$ blastn -query reads.fa -subject locus.fa -strand plus -culling_limit 1 -dust no -out result.csv -outfmt 6

One unexpected result:

qseqid  sseqid  pident  length  mismatch    gapopen qstart  qend    sstart  send    evalue  bitscore
QJLFG:08700:06611   gi|372099098:113208001-113426000    98.131  107 1   1   1   106 51978   52084   5.11E-49    185
QJLFG:08700:06611   gi|372099098:113208001-113426000    79.167  120 14  8   103 215 217412  217527  4.15E-15    73.1
QJLFG:08700:06611   gi|372099098:113208001-113426000    97.561  41  1   0   103 143 217437  217477  1.49E-14    71.3

As far as I understand, the 3rd hit is redundant with the 2nd: it covers the same subject region and the same part of the read, but it is a shorter alignment with a higher e-value.

Why is it not discarded with -culling_limit 1?

blastn alignment culling_limit • 3.5k views

Hmm, the table above is pretty hard to read, so here are the relevant columns:

Format: qstart<->qend --- sstart<->send --- evalue

2nd hit: 103<->215 --- 217412<->217527 --- 4.15E-15

3rd hit: 103<->143 --- 217437<->217477 --- 1.49E-14
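
To make the expectation concrete, here is a minimal Python sketch of the literal reading of the manual sentence: a hit is dropped when its query interval lies inside the query interval of at least `limit` higher-scoring hits. The names `Hit`, `is_enveloped` and `cull` are made up for this illustration, and this is only my interpretation of the help text, not BLAST's actual culling code, which may use different criteria. The coordinates and bit scores are the three rows from the table above.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    # Query coordinates and bit score taken from the outfmt 6 rows.
    qstart: int
    qend: int
    bitscore: float


def is_enveloped(inner: Hit, outer: Hit) -> bool:
    """True if inner's query interval lies within outer's query interval."""
    return outer.qstart <= inner.qstart and inner.qend <= outer.qend


def cull(hits, limit=1):
    """Literal reading of the manual (an assumption, not BLAST's code):
    drop a hit enveloped by at least `limit` higher-scoring hits."""
    kept = []
    for h in hits:
        n_envelopers = sum(
            1 for o in hits if o.bitscore > h.bitscore and is_enveloped(h, o)
        )
        if n_envelopers < limit:
            kept.append(h)
    return kept


hits = [
    Hit(1, 106, 185.0),    # 1st hit
    Hit(103, 215, 73.1),   # 2nd hit
    Hit(103, 143, 71.3),   # 3rd hit
]
kept = cull(hits, limit=1)
for h in kept:
    print(h)
```

Under this reading only the 3rd hit is removed, since its query range 103-143 sits inside the 2nd hit's 103-215 and its bit score is lower. The 2nd hit is kept because it extends past the end of the 1st hit on the query. That is why the observed output surprised me.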


Hello,

I appreciate this is an old post, but I am having the same issue. Did you manage to find a solution? I am searching a large number of similar queries against about 2k genomes, and for some of these target sequences I am getting >50 hits with a culling limit of 1. Does the culling limit not do what I think it does?

Thanks in advance.


That is an uncommon parameter that I have not personally used, but the help text for it says:

-culling_limit: Delete a hit that is enveloped by at least this many higher-scoring hits.

If you are getting >50 hits, then perhaps they are all higher-scoring hits. Have you tried setting the parameter to a larger number? Are you looking to keep only one hit?


Hi GenoMax, sorry for the late response. Yes, I am looking to keep only one hit.

I am not sure I follow the idea that all the reported hits are higher-scoring hits. My interpretation of that definition is that the culling limit should remove any hit for which there is a higher-scoring hit enveloping it. If I set it to 1, it should "delete a hit that is enveloped by at least 1 higher-scoring hit". Surely this means that only the highest-scoring hit would remain?
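
If the goal is simply one hit per query/subject pair, one option that does not depend on how culling behaves is to post-filter the tabular output. Below is a rough Python sketch, assuming default `-outfmt 6` columns, tab-separated, with no header row. The function name `best_hit_per_pair` and the choice of "best" as highest bit score are my own assumptions for illustration; change the key to `qseqid` alone if you want one hit per query across all subjects.

```python
import csv
import io

# Default -outfmt 6 column order.
COLUMNS = [
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
]


def best_hit_per_pair(handle):
    """Keep the highest-bitscore row for each (qseqid, sseqid) pair.

    Hypothetical helper: assumes tab-separated rows without a header.
    """
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        rec = dict(zip(COLUMNS, row))
        key = (rec["qseqid"], rec["sseqid"])
        if key not in best or float(rec["bitscore"]) > float(best[key]["bitscore"]):
            best[key] = rec
    return list(best.values())


# The three rows from the original post, rebuilt as tab-separated text.
example = "\n".join([
    "\t".join(["QJLFG:08700:06611", "gi|372099098:113208001-113426000",
               "98.131", "107", "1", "1", "1", "106", "51978", "52084",
               "5.11E-49", "185"]),
    "\t".join(["QJLFG:08700:06611", "gi|372099098:113208001-113426000",
               "79.167", "120", "14", "8", "103", "215", "217412", "217527",
               "4.15E-15", "73.1"]),
    "\t".join(["QJLFG:08700:06611", "gi|372099098:113208001-113426000",
               "97.561", "41", "1", "0", "103", "143", "217437", "217477",
               "1.49E-14", "71.3"]),
])

top = best_hit_per_pair(io.StringIO(example))
for rec in top:
    print(rec["qseqid"], rec["sseqid"], rec["qstart"], rec["qend"], rec["bitscore"])
```

For a real run you would pass an open file instead of the `io.StringIO` wrapper, for example `open("result.csv")`. Note that this discards the 2nd hit as well, which covers a different part of the read, so it is only appropriate if you really want a single alignment per pair.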
