How to get 1 top hit only for each query in blastx
4.2 years ago
2822462298 ▴ 120

I am running blastx using the following command:

blastx -query CDS.fa -db local_path_nr -num_threads 36 -evalue 1e-5 -outfmt 6 -max_target_seqs 1 -out CDS_blast.txt

I want only one hit per query, but occasionally there are still 2-3 hits per query. How do I keep only the top hit? I know this can be done post-BLAST, but can I achieve it while running BLAST? Thanks!

blast RNA-Seq blastx • 3.4k views

It should be easy for you to parse the output file such that only the top hit is retained.
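
For what it's worth, here is a minimal sketch of that kind of post-BLAST parsing. It assumes the tabular output from -outfmt 6 and that BLAST writes the hits for each query best-first; the function name top_hit_per_query is made up for illustration.

```shell
# Keep only the first line seen for each query ID (column 1) of a
# tab-separated BLAST table. This relies on the best hit being listed
# first for each query, which is an assumption about the input.
top_hit_per_query() {
    awk -F'\t' '!seen[$1]++' "$1"
}

# Example call (file names are placeholders):
#   top_hit_per_query CDS_blast.txt > CDS_blast.top.txt
```

Because awk remembers every query ID it has already printed, this also works if the lines for one query are not contiguous.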


Duplicate question. You could try the suggested answer from a previous post


I guess this command does not apply to blastx

4.2 years ago
Asaf 10k

The max_hsps option never really worked properly; if I recall correctly, they have some explanation for this. You can use sort instead:

sort -k1,1 -k11,11g CDS_blast.txt | sort --merge -u  -k1,1

This sorts by query ID and E-value (column 11, assuming default format 6 output), then keeps the first line (lowest E-value) for each query.
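
If you would rather rank by bit score than by column 11 (the E-values of very strong hits are often all reported as 0.0, so they tie), a variation on the same idea could look like the sketch below. It assumes the default format 6 column order, with the E-value in column 11 and the bit score in column 12, and a sort that supports -g (GNU sort does). The name best_hit_by_bitscore is hypothetical.

```shell
# For each query ID (column 1), keep the line with the highest bit score
# (column 12); ties are broken by the lowest value in column 11.
best_hit_by_bitscore() {
    tab=$(printf '\t')
    # Sort by query ID, then bit score descending, then column 11 ascending.
    LC_ALL=C sort -t "$tab" -k1,1 -k12,12gr -k11,11g "$1" |
        awk -F'\t' '!seen[$1]++'   # keep the first (best) line per query
}

# Example call (file names are placeholders):
#   best_hit_by_bitscore CDS_blast.txt > CDS_blast.best.txt
```

Setting LC_ALL=C keeps the ordering of query IDs byte-wise and independent of the locale.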


Thanks Asaf. Yes, I agree the built-in option does not work properly.

4.2 years ago
Fatima ▴ 1000

Have you tried this option:

-num_alignments 1

You can also check other options using:

blastx -help

Hi Fatima, I tried it and I got the same result...
