Doing a remote blastn against the refseq_rna database for mammals
6.7 years ago
age00 • 0

Hi biostars!

I am running blastn on the command line to search the refseq_rna database remotely, but I would like to restrict the search to mammals. I have 150,000 sequences, and limiting the search to mammals should help me avoid exceeding the CPU limit. Is there a way to do this? I know one can download the GIs for a specific taxon, but I want to do this remotely and for all mammals, not just a single species.

This is what I have been working with:

blastn -db refseq_rna -query sequences.fsa -out blastn_sequences.out -remote -word_size 11 -gapopen 5 -gapextend 2 -penalty -3 -reward 2 -evalue 0.00001 -num_descriptions 3 -num_alignments 3

Any help is greatly appreciated!

blast SNP alignment • 2.1k views
6.7 years ago
h.mon 35k

Untested, but I believe something like -entrez_query="mammalia[ORGN]" or -entrez_query="mammals[ORGN]" should work, as should -entrez_query="txid40674[Organism]".

But you shouldn't be using -remote with 150 thousand sequences.
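
For reference, the suggestion above could be combined with the original command roughly as follows. This is a minimal sketch, not a tested recipe: the file names and search parameters are copied from the question, the taxonomy filter is the txid40674 (Mammalia) variant suggested above, and the filter is passed as a separate argument after -entrez_query rather than with "=". The variable names and the guard around the call are illustrative additions, included so nothing is sent to NCBI unless blastn and the query file are actually present.

```shell
#!/bin/sh
# Sketch: remote blastn against refseq_rna, restricted to mammals.
# File names and search parameters are copied from the question;
# the Entrez filter is one of the (untested) suggestions in the answer.

query="sequences.fsa"
out="blastn_sequences.out"
entrez_filter="txid40674[Organism]"   # NCBI Taxonomy ID 40674 = Mammalia

# Only run when blastn is installed and the query file exists and is non-empty.
if command -v blastn >/dev/null 2>&1 && [ -s "$query" ]; then
    blastn -db refseq_rna -query "$query" -out "$out" -remote \
        -entrez_query "$entrez_filter" \
        -word_size 11 -gapopen 5 -gapextend 2 -penalty -3 -reward 2 \
        -evalue 0.00001 -num_descriptions 3 -num_alignments 3
else
    echo "blastn or $query not found; nothing was run" >&2
fi
```

Note that -entrez_query only applies together with -remote, so this does not address the concern about submitting 150 thousand sequences remotely.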

Thanks h.mon! Any chance you can tell me how to download the mammalian databases so I don't have to use -remote?

It says I can download databases from ftp://ftp.ncbi.nlm.nih.gov/blast/db/ or retrieve them automatically with update_blastdb.pl, but I don't know how to specify only the mammalian ones. Again, thanks for the help!
