blastn command line won't find genes in large database
2.2 years ago

I want to check for a gene in a bacterial species. I downloaded almost 1500 assemblies and made a BLAST database from them. When I BLAST a "housekeeping"-type gene against it, BLAST does not find the gene in all strains. When I make a database from just one of the genomes in which the gene was missed, BLAST finds it as expected. The similarity is ~99%. Is the large size of the database the problem, and are there parameters that would improve the search?

My command line is:

blastn -db strains.fasta -query gene.fasta -out gene.out -outfmt 6

blast • 612 views
2.2 years ago
Mensur Dlakic ★ 27k

By default, most BLAST programs report at most 250 or 500 matches, depending on the output format. With almost 1500 assemblies in the database you probably have more hits than that, so the output is truncated and the gene appears to be missing from some strains. This can be fixed by setting -max_target_seqs to an arbitrarily large number:

blastn -db strains.fasta -query gene.fasta -out gene.out -outfmt 6 -max_target_seqs 10000

If you are using an output format <= 4 (-outfmt 0 to 4), this should work instead:

blastn -db strains.fasta -query gene.fasta -out gene.out -outfmt 0 -num_descriptions 10000 -num_alignments 10000
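For the tabular output (-outfmt 6), one quick way to check whether the hit list was capped is to count the distinct subject sequences in the results. The sketch below assumes the default -outfmt 6 column order, where the subject ID is column 2; count_subjects is a hypothetical helper name, not part of BLAST+. Note that in a database built from assemblies the subjects are contigs, so the count matches the number of strains only if each assembly is a single sequence.

```shell
# Count distinct subject sequences in BLAST tabular output (-outfmt 6).
# Assumes the default column order, where column 2 is the subject ID.
# count_subjects is a hypothetical helper, not a BLAST+ command.
count_subjects() {
    cut -f2 "$1" | sort -u | wc -l | tr -d ' '
}

# Report the count for gene.out (the output file from the commands above), if present.
if [ -f gene.out ]; then
    echo "distinct subjects in gene.out: $(count_subjects gene.out)"
fi
```

If the count stays at the same round number before and after raising -max_target_seqs, the limit was probably not the cause; if it grows, the earlier output was truncated.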

Thank you very much. That solved the problem. I used to do this in the online version of BLAST, but it totally slipped my mind when working with the command-line version.
