Question: Faster alternative to NCBI eutils
0
2.4 years ago by
Arup Ghosh2.7k
India
Arup Ghosh2.7k wrote:

I'm trying to convert some BioSample IDs to SRA IDs using the following Python script, but the response time is very high. Is there a faster way to do this?

#!/usr/bin/python3
import sys
import urllib.request

from bs4 import BeautifulSoup

# example url: https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=biosample&cmd=neighbor_score&linkname=bioproject_sra_all&db=sra&id=235777
idconv = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=biosample&cmd=neighbor_score&linkname=bioproject_sra_all&db=sra&id="
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.47 Safari/537.36'}

# The input file (first command-line argument) holds one BioSample UID per line.
with open(sys.argv[1], "r") as uids:
    for line in uids:
        uid = line.strip()  # drop the trailing newline so the URL stays valid
        if not uid:
            continue  # skip blank lines
        req = urllib.request.Request(idconv + uid, data=None, headers=headers)
        print(uid)
        # One HTTP request per UID; this is the slow part.
        with urllib.request.urlopen(req) as page:
            soup = BeautifulSoup(page, "lxml")
        print(soup.prettify())
api python ncbi • 665 views
Modified 2.4 years ago by genomax (91k) • written 2.4 years ago by Arup Ghosh (2.7k)

Not a programmatic approach, but you can use Batch Entrez (https://www.ncbi.nlm.nih.gov/sites/batchentrez): Batch Entrez -> select the BioSample database -> upload the BioSample ID list -> Retrieve Records -> select Summary (Text) -> download the file -> grep "^Identifiers"

Comment modified 2.4 years ago • written 2.4 years ago by microfuge (1.8k)

It is not accepting more than 20 entries at a time.

Comment written 2.4 years ago by Arup Ghosh (2.7k)

Sorry, my bad. You need to choose the Send to -> File option and then select Summary (Text). By default, the web page displays only 20 entries.

Comment written 2.4 years ago by microfuge (1.8k)
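
For anyone who prefers to stay in Python rather than grep, the last step of the Batch Entrez route could be sketched roughly as below. This assumes the downloaded Summary (Text) file contains lines of the form "Identifiers: BioSample: ...; SRA: ..."; that layout, the function names, and the accessions in the usage note are assumptions for illustration, so check them against your own download.

```python
def extract_identifier_lines(summary_text):
    """Return the lines that start with 'Identifiers' (same idea as grep "^Identifiers")."""
    return [line.strip() for line in summary_text.splitlines()
            if line.startswith("Identifiers")]


def parse_identifiers(line):
    """Split one 'Identifiers: key: value; key: value' line into a dict.

    Assumes fields are separated by ';' and each field is 'key: value'.
    """
    body = line.split(":", 1)[1]  # drop the leading 'Identifiers' label
    pairs = {}
    for field in body.split(";"):
        if ":" in field:
            key, value = field.split(":", 1)
            pairs[key.strip()] = value.strip()
    return pairs
```

Usage would be something like `[parse_identifiers(l) for l in extract_identifier_lines(open("biosample_result.txt").read())]`, where `biosample_result.txt` is a placeholder for whatever name the downloaded file has.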
3
2.4 years ago by
genomax91k
United States
genomax91k wrote:

This file probably has the information you need. If not, you should look around on the FTP site linked.

Modified 2.4 years ago • written 2.4 years ago by genomax (91k)
1
2.4 years ago by
France/Nantes/Institut du Thorax - INSERM UMR1087
Pierre Lindenbaum131k wrote:

First idea: group your IDs, separated by commas, e.g. 20 IDs per HTTP request:

...all&db=sra&id=1,2,3,4,5,6
Written 2.4 years ago by Pierre Lindenbaum (131k)
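
The batching idea above might look roughly like the following sketch. The query parameters are copied unchanged from the script in the question; the helper names (`read_uids`, `chunked`, `build_elink_url`, `fetch_batches`), the batch size of 20, and the pause between requests are assumptions for illustration. Note that with comma-separated IDs elink may return one combined link set for the whole batch rather than one result per input ID, so check the response if you need a one-to-one mapping.

```python
import time
import urllib.request
from urllib.parse import urlencode

EUTILS_ELINK = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"


def read_uids(path):
    """Read one UID per line, skipping blank lines."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_elink_url(uids):
    """Build one elink URL for a batch of UIDs joined by commas."""
    params = {
        "dbfrom": "biosample",
        "db": "sra",
        "cmd": "neighbor_score",
        "linkname": "bioproject_sra_all",  # as in the question's script
        "id": ",".join(uids),
    }
    # safe="," keeps the commas readable instead of percent-encoding them
    return EUTILS_ELINK + "?" + urlencode(params, safe=",")


def fetch_batches(uids, batch_size=20, pause=0.34):
    """Yield (batch, raw XML bytes) for each batch, pausing between requests.

    The pause assumes NCBI's limit of about 3 requests per second
    for clients without an API key.
    """
    for batch in chunked(uids, batch_size):
        with urllib.request.urlopen(build_elink_url(batch)) as page:
            yield batch, page.read()
        time.sleep(pause)
```

With 20 IDs per request, a list of 1,000 UIDs needs 50 HTTP round trips instead of 1,000, which is where most of the speed-up would come from.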