How do I download more than 200 nucleotide EST sequences from an NCBI search into a FASTA file?
I would like to download all of the sequences from that search into one FASTA file. I know that the Entrez utilities exist, but they are not installed on the server I am working on.
Also, how does Entrez write its output to a file? I know I would want something like the commands below, but I don't want to flood the terminal if I do install them.
esearch -db est -query "txid6200[Organism:exp] " | \
efetch -format fasta
Problem solved with BioPython! Thanks for the help.
Edit, here's my code:
# command line usage: python entrez.py database searchterm output.fasta
import sys
from Bio import Entrez, SeqIO

# sys.argv[0] is the script name, so the arguments start at index 1
dataBase = sys.argv[1]
searchTerm = sys.argv[2]
outFile = sys.argv[3]
Entrez.email = "firstname.lastname@example.org"
# search for matching record IDs (at most retmax are returned)
handle = Entrez.esearch(db = dataBase, retmax = 100000, term = searchTerm)
record = Entrez.read(handle)
handle.close()
with open(outFile, 'w') as w:
    for id in record["IdList"]:
        # fetch one record per request
        fetch_handle = Entrez.efetch(db = dataBase, id = id, rettype = "fasta", retmode="text")
        fetch_record = SeqIO.read(fetch_handle, "fasta")
        fetch_handle.close()
        SeqIO.write(fetch_record, "current_seq.fasta", "fasta")
        # append the temporary single-record file to the combined output
        for line in open('current_seq.fasta'):
            w.write(line)
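For anyone with thousands of hits: the script above sends one efetch request per ID, which gets slow. A possible alternative, sketched here and not tested against the live service, is to keep the search results on NCBI's history server (usehistory="y") and fetch them in batches. The names batch_starts and download_fasta and the batch size of 500 are my own choices for illustration, not part of Biopython.

```python
def batch_starts(total, batch_size):
    """Return the retstart offsets needed to cover `total` records."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return list(range(0, total, batch_size))


def download_fasta(database, search_term, out_path, email, batch_size=500):
    """Write every hit for search_term to out_path as FASTA; return the hit count.

    batch_size=500 is an assumed, conservative value, not an NCBI requirement.
    """
    # Imported here so batch_starts() can be used without Biopython installed.
    from Bio import Entrez

    Entrez.email = email
    # usehistory="y" keeps the result set on NCBI's history server,
    # so the IDs never have to be passed back one by one.
    search_handle = Entrez.esearch(db=database, term=search_term, usehistory="y")
    search = Entrez.read(search_handle)
    search_handle.close()
    total = int(search["Count"])

    with open(out_path, "w") as out:
        for start in batch_starts(total, batch_size):
            # Each request returns up to batch_size records starting at `start`.
            fetch_handle = Entrez.efetch(
                db=database,
                rettype="fasta",
                retmode="text",
                retstart=start,
                retmax=batch_size,
                webenv=search["WebEnv"],
                query_key=search["QueryKey"],
            )
            out.write(fetch_handle.read())
            fetch_handle.close()
    return total


# Example call (the output file name and email are placeholders):
# download_fasta("est", "txid6200[Organism:exp]", "est.fasta", "you@example.org")
```

Everything goes straight into the output file, so nothing is printed to the terminal and no temporary file is needed.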