Question: Problem with web Blast SRA database
4.1 years ago, QVH10 (Upenn) wrote:

Hello,

I want to BLAST some sequences against dozens of huge SRA datasets. To develop a pipeline, I tried sending requests from my computer with Python. Everything works fine with the standard parameters (database = nr, for example), but I can't get it to work when I use an SRA dataset as the database (DRX029854, for example, with the BLAST_SPEC:SRA parameter).

import requests

fasta_file = 'BlastList.fasta'
database_num = 'DRX029854'
url = 'http://blast.ncbi.nlm.nih.gov/Blast.cgi'

# Submit ("Put") a tblastn search against an SRA experiment
args = {'CMD': 'Put', 'DATABASE': database_num, 'PROGRAM': 'tblastn',
        'BLAST_SPEC': 'SRA', 'FORMAT_TYPE': 'XML', 'MAX_NUM_SEQ': '20000',
        'WORD_SIZE': '6', 'FILTER': 'F'}

# Upload the FASTA file as the QUERY field; close the file once it is sent
with open(fasta_file, 'rb') as query:
    req = requests.post(url, params=args, files={'QUERY': query})

Does anyone know which API or specific parameter I have to use to make it work with SRA databases?

Thanks
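
For anyone debugging a request like the one above: a CMD=Put submission does not return hits directly. As far as NCBI's URL API documentation describes it, the response page carries a QBlastInfo comment block with a request ID (RID) and an estimated wait time (RTOE), which are then used to poll for results. Below is a minimal sketch of pulling those fields out of the response text, assuming that block format. `parse_qblast_info` is a hypothetical helper name, not part of any library.

```python
import re


def parse_qblast_info(page):
    """Return the KEY = VALUE pairs found in the first QBlastInfo block.

    Assumes the block looks like:
        QBlastInfoBegin
            RID = XXXXXXXX
            RTOE = 18
        QBlastInfoEnd
    Returns an empty dict when no such block is present.
    """
    match = re.search(r"QBlastInfoBegin(.*?)QBlastInfoEnd", page, re.DOTALL)
    if match is None:
        return {}
    info = {}
    for line in match.group(1).splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip blank lines and anything without an '='
            info[key.strip()] = value.strip()
    return info
```

With the request above, `parse_qblast_info(req.text)` would give the RID if the submission was accepted. An empty result, or one without an RID value, suggests the server did not queue the search.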

sra blast api python • 1.5k views
modified 4.0 years ago by Biostar ♦♦ 20 • written 4.1 years ago by QVH10

Maybe take a look at this code: https://github.com/Kingsford-Group/sbtappendix/blob/master/srablast/srablast.py

written 4.1 years ago by frcamacho190

It's the same kind of code I'm using and, unfortunately, it's not working. I get this error:

Informational Message: No alias or index file found for nucleotide database [DRX029854] in search path [/export/home/splitd/blastdb/blast0:/blast/db/disk.blast/blast1105::]

I think the problem could be due to the way the database is called.
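
In a pipeline it can help to detect this message automatically rather than by reading the returned page. Below is a minimal sketch that looks for the wording of the error quoted above and extracts the database type and name. The pattern is based only on that one message, so other wordings would not be caught, and `find_missing_database` is a hypothetical helper name.

```python
import re

# Pattern taken from the single error message quoted in this thread.
_MISSING_DB = re.compile(
    r"No alias or index file found for (\w+) database \[([^\]]+)\]"
)


def find_missing_database(text):
    """Return (db_type, db_name) if the 'no alias or index file' message
    appears in text, otherwise None."""
    match = _MISSING_DB.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

Calling it on the response body (for example `find_missing_database(req.text)`) would flag a submission that the server could not map to a database.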

written 4.1 years ago by QVH10

It appears that the script is looking at local file paths rather than the web for the database.

Are you able to get this to work for some other acc #? Perhaps your local firewall settings are preventing you from going out to NCBI.

Also be aware that, at the time I write this, NCBI is testing HTTPS-only access to their site. So you may want to replace that http with https (https://blast.ncbi.nlm.nih.gov/Blast.cgi), just in case.
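
If the URL is built or configured elsewhere in a script, the switch can be made in code instead of by hand. Here is a small sketch using only the standard library; `force_https` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def force_https(url):
    """Return url with an 'http' scheme upgraded to 'https'.

    URLs with any other scheme are returned unchanged.
    """
    parts = urlsplit(url)
    if parts.scheme == "http":
        parts = parts._replace(scheme="https")
    return urlunsplit(parts)
```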

modified 4.1 years ago • written 4.1 years ago by genomax91k

It works perfectly when I BLAST against a standard database (e.g. 'nr'), for example:

import requests

fasta_file = 'test-tblastn.fa'
url = 'https://blast.ncbi.nlm.nih.gov/Blast.cgi'

# Same kind of request, but against the standard 'nr' database, without BLAST_SPEC
args = {'CMD': 'Put', 'DATABASE': 'nr', 'PROGRAM': 'tblastn',
        'FORMAT_TYPE': 'XML', 'WORD_SIZE': '6', 'FILTER': 'F'}

with open(fasta_file, 'rb') as query:
    req = requests.post(url, params=args, files={'QUERY': query})

So, I really think the problem is due to the way I call the SRA database.
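
To make the comparison explicit, the two requests can be built from one function so that only the database-related fields vary. This is just a sketch of that idea: `build_put_args` is a hypothetical helper, and it leaves out the MAX_NUM_SEQ field that only the SRA request used.

```python
def build_put_args(database, blast_spec=None):
    """Build the Put parameters shared by both requests in this thread.

    blast_spec is only added when given (e.g. 'SRA' for an SRA experiment).
    """
    args = {'CMD': 'Put', 'DATABASE': database, 'PROGRAM': 'tblastn',
            'FORMAT_TYPE': 'XML', 'WORD_SIZE': '6', 'FILTER': 'F'}
    if blast_spec is not None:
        args['BLAST_SPEC'] = blast_spec
    return args


nr_args = build_put_args('nr')
sra_args = build_put_args('DRX029854', blast_spec='SRA')

# Keys whose values differ between the working and the failing request
differing = sorted(
    key for key in nr_args.keys() | sra_args.keys()
    if nr_args.get(key) != sra_args.get(key)
)
print(differing)  # ['BLAST_SPEC', 'DATABASE']
```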

modified 4.1 years ago • written 4.1 years ago by QVH10

Do you know for sure that NCBI allows remote access to the SRA database for BLAST?

written 4.1 years ago by genomax91k

I contacted NCBI to find out whether this function has been removed.

written 4.1 years ago by QVH10

It looks like this is not "officially supported" by NCBI; they recommend a cloud implementation with Amazon instead.

written 4.1 years ago by QVH10