How to obtain only one target sequence for each query read using blast+?
6.7 years ago
vitor.eca • 0

Hi, I'm using BLAST+ (blastn) to run a local search against a database I downloaded. I would like only one hit for each of my reads, so I am using the option "max_target_seqs 1". However, I sometimes still get more than one hit for a query sequence; as far as I understand, the query read is matching the same target sequence in different ways. Does anyone know an option that would give me only one hit? I tried the option "max_hsps 1", but this allows each target sequence to be hit only once, which I don't want, since several of my reads will probably correspond to the same target sequence (in my case, the same species).

assembly • 2.0k views

Ha, tricky question nowadays ;) (search for other posts here on Biostars with max_target_seqs as the keyword)

Long story short: you're better off running BLAST with default values and then keeping only a single hit per query in post-processing. Have a look at 'trick 6' here: https://www.cheatography.com/melissamlwong/cheat-sheets/awk-one-liners-for-blast-results-manipulation/ (I haven't tested that one myself, though).
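As a rough illustration of that post-processing step, here is a minimal Python sketch. It is not the awk one-liner from the linked cheat sheet. It assumes the results were written in the default 12-column tabular format (-outfmt 6), and it assumes that "best" means the highest bit score; the function name and the sample rows are made up for the example.

```python
import csv
import io

# Column positions in the default 12-column tabular output (-outfmt 6):
# qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
QSEQID, SSEQID, BITSCORE = 0, 1, 11


def best_hit_per_query(lines):
    """Return {query_id: row}, keeping the row with the highest bit score per query.

    Assumption: "best" means highest bit score; on ties the first row seen wins.
    """
    best = {}
    for row in csv.reader(lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, score = row[QSEQID], float(row[BITSCORE])
        if query not in best or score > float(best[query][BITSCORE]):
            best[query] = row
    return best


# Made-up rows: read1 has two HSPs on spA and one hit on spB; read2 also hits spA.
example = io.StringIO(
    "read1\tspA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180\n"
    "read1\tspA\t90.0\t50\t5\t0\t1\t50\t200\t249\t1e-10\t60\n"
    "read1\tspB\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-45\t170\n"
    "read2\tspA\t99.0\t100\t1\t0\t1\t100\t1\t100\t1e-52\t185\n"
)
hits = best_hit_per_query(example)
for query, row in hits.items():
    print(query, row[SSEQID], row[BITSCORE])
```

Because the filter is keyed on the query ID only, several reads can still end up assigned to the same target sequence, which is the behaviour asked for in the question. For a real file, pass an open file handle instead of the io.StringIO example.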

You mention 'read': is that a read as in NGS, or something different? If the former, BLAST is probably not the best approach (unless you only have a very limited number of reads to process).


Thanks lieven.sterck :)

6.7 years ago
h.mon 35k

Don't -max_target_seqs 1 -max_hsps 1 together do what you want?

But you should be aware that -max_target_seqs 1 is not a filter applied after all alignments have been found; rather, it changes the heuristics of the algorithm. See "What BLAST's max-target-sequences doesn't do".
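For reference, the two options discussed in this answer could be combined as in the sketch below, which only builds and prints the command line and does not run BLAST. The helper name and the file and database names (reads.fasta, my_db, hits.tsv) are placeholders. Note that -max_hsps limits the alignments kept per query-target pair, so different reads can still hit the same target sequence.

```python
import shlex


def build_blastn_command(query, db, out, one_hsp=True, one_target=False):
    """Build a blastn argument list (hypothetical helper, for illustration only)."""
    cmd = ["blastn", "-query", query, "-db", db, "-outfmt", "6", "-out", out]
    if one_hsp:
        # Keep at most one HSP per query-target pair.
        cmd += ["-max_hsps", "1"]
    if one_target:
        # Caveat from the answer above: this changes the search heuristics,
        # it is not a filter applied after all alignments have been found.
        cmd += ["-max_target_seqs", "1"]
    return cmd


# Placeholder file and database names.
cmd = build_blastn_command("reads.fasta", "my_db", "hits.tsv", one_target=True)
print(shlex.join(cmd))
```

Leaving one_target at False keeps the default number of target sequences, which suits the approach of filtering to one hit per query afterwards.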
