Question: download sequences from NCBI using Python
22 months ago, HZZ0036 wrote:

Hi, I am having trouble downloading and saving sequences from NCBI in one go. I get the accession numbers using this script:

from Bio import Entrez

Entrez.email = "you@example.org"  # placeholder: NCBI asks for a contact address

def singleEntry(singleID):
    """Save one record, identified by its accession number, to <singleID>.fasta."""
    handle = Entrez.efetch(db='nucleotide', id=singleID, rettype='fasta', retmode='text')
    with open('%s.fasta' % singleID, 'w') as f:
        f.write(handle.read())
    handle.close()


# Get an ID list: run a search and collect the matching IDs
# (esearch returns 20 IDs by default; pass retmax to get more)
handle = Entrez.esearch(db='nucleotide', term="Poaceae[Orgn] AND als[Gene]")
record = Entrez.read(handle)
handle.close()
print(record["IdList"])

I got IdList:

['1124779319', '1058275694', '160346987', '160346985', '313662298', '313662296', '313662294', '313662292', '148536620', '148536618', '944203885', '937553934', '698322664', '698322662', '698322660', '698322658', '683428019', '677285963', '677285961', '677285959']

How can I then download these FASTA sequences into one file? Thanks. I tried this:

from Bio import Entrez, SeqIO
def get_sequences(IdList):
    ids = record["IdList"]
    for seq_id in ids:
    handle = Entrez.efetch(db="nucleotide", id="seq_id", rettype="fasta", retmode="text")
    record = handle.read()
    record = open('als.fasta', 'w')
    record.write(record.rstrip('\n'))

but it showed: IndentationError: expected an indented block


Hi, did you solve this? I need to do the same.

(comment written 4 weeks ago by brunofede220)
22 months ago, Istvan Albert (University Park, USA) wrote:

The body of your for loop needs to be indented.

This is not really a bioinformatics question but a Python programming question, and as such it is better suited to https://stackoverflow.com/

record = open('als.fasta', 'w')
for seq_id in ids:
    handle = Entrez.efetch(db="nucleotide", id="seq_id", rettype="fasta", retmode="text")
    record = handle.read()
    record.write(record.rstrip('\n'))

Ehm, I also think the OP and you are reusing (and so overwriting) the variable name record.

(comment written 22 months ago by WouterDeCoster)
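
To make the name-reuse problem concrete, here is a small network-free sketch. It assumes nothing about Entrez: an in-memory io.StringIO stands in for the opened als.fasta file, and a literal string stands in for the text returned by handle.read(). Both stand-ins are illustrative, not part of the original code.

```python
import io

# Stand-in for: record = open('als.fasta', 'w')
record = io.StringIO()

# Stand-in for: record = handle.read()
# After this line, record no longer refers to the file object; it is a str.
record = ">seq1\nACGT\n"

try:
    # A str has no write() method, so this call fails.
    record.write(record.rstrip("\n"))
    error = None
except AttributeError as exc:
    error = str(exc)

print(error)
```

Using two different names, for example one for the output file and one for the fetched text, avoids the clash.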

Thanks. The error is solved, but no als.fasta file is created, even when I add the path.

(comment written 22 months ago by HZZ0036)

I guess that's because you overwrite the record variable: you use it both for the opened als.fasta file and for the text read from the efetch() handle.

(comment written 22 months ago by WouterDeCoster)
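
Putting the comments in this thread together, one possible way to write all sequences to a single file is sketched below. It is a sketch under stated assumptions, not the original posters' code. The function names fetch_fasta_batch and save_sequences, the batch_size parameter, and the email address are hypothetical. It uses separate names for the output file and the fetched text, and it passes the actual IDs to efetch rather than the quoted literal "seq_id" used in the loops above. Entrez.efetch accepts a comma-separated string of IDs, so several records can be requested in one call.

```python
def fetch_fasta_batch(ids):
    """Fetch FASTA text for several IDs in one efetch call.

    Needs Biopython and network access. The email is a placeholder.
    """
    from Bio import Entrez

    Entrez.email = "you@example.org"  # placeholder: use your own address
    handle = Entrez.efetch(
        db="nucleotide",
        id=",".join(ids),  # comma-separated IDs, one request per batch
        rettype="fasta",
        retmode="text",
    )
    try:
        return handle.read()
    finally:
        handle.close()


def save_sequences(ids, out_path, fetch=fetch_fasta_batch, batch_size=100):
    """Write the FASTA records for all ids into one file at out_path.

    fetch is any callable that takes a list of IDs and returns FASTA text;
    it is a parameter so the function can be tried without network access.
    """
    with open(out_path, "w") as out_file:
        for start in range(0, len(ids), batch_size):
            batch = ids[start:start + batch_size]
            fasta_text = fetch(batch)
            # Drop trailing blank lines, then end the batch with one newline.
            out_file.write(fasta_text.rstrip("\n") + "\n")
```

With the search result from the question, the call would be save_sequences(record["IdList"], "als.fasta"). Note that a function only runs when it is called; defining get_sequences without calling it would also leave no output file.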