How To Retrieve Multiple Sequences In One Python Script
10.3 years ago
ahmedakhokhar ▴ 150

I have a list of Entrez gene IDs, and I want to retrieve the flanking regions of a mutation in each gene (one mutation per gene). Previously I was using the following code to retrieve a single entry:

from Bio import Entrez, SeqIO
Entrez.email = 'v.v@biw.kuleuven.be'
out_handle = open("example.txt", "w")
handle = Entrez.efetch(db="nucleotide", id="186972394", rettype="fasta", strand=1, seq_start=4000100, seq_stop=4000200, retmode='text')
record = SeqIO.parse(handle, "fasta")
SeqIO.write(record, out_handle, "fasta")
handle.close()
out_handle.close()

I would appreciate any help with this, as I am totally new to Python. Thanks.

python • 4.3k views
10.3 years ago

Assuming you have the begin and end coordinates of each sequence, format the input file with one tab-separated entry per line (Entrez_GeneID\tBegin\tEnd) and try this:


from Bio import Entrez, SeqIO

# open the file with your Entrez gene IDs
input_file = open("path/to/to/the/genelist")
out_handle = open("example.txt", "w")
Entrez.email = 'v.v@biw.kuleuven.be'

line = input_file.readline()

# this loop goes through every single line of your file
while line != "":
    # assuming each line has the format Entrez gene ID\tBegin\tEnd
    fields = line.strip().split('\t')
    handle = Entrez.efetch(db="nucleotide", id=fields[0], rettype="fasta", strand=1, seq_start=fields[1], seq_stop=fields[2], retmode='text')
    record = SeqIO.parse(handle, "fasta")
    SeqIO.write(record, out_handle, "fasta")
    handle.close()
    line = input_file.readline()

out_handle.close()
input_file.close()
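
If you only have the position of each mutation rather than a begin and end, the same loop can be sketched with a small helper that derives the window. This is a minimal sketch, not tested against NCBI here: the function names (`flank_window`, `parse_regions`, `fetch_regions`) and the default flank of 100 bases are hypothetical choices for illustration, and it assumes the first column holds an identifier that the nucleotide database accepts (like the 186972394 in the question).

```python
def flank_window(position, flank=100):
    """Return 1-based (start, stop) covering `flank` bases on each side of `position`."""
    start = max(1, position - flank)  # never go below the first base
    stop = position + flank
    return start, stop


def parse_regions(lines):
    """Yield (seq_id, start, stop) from tab-delimited 'ID<TAB>Begin<TAB>End' lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        seq_id, begin, end = line.split("\t")[:3]
        yield seq_id, int(begin), int(end)


def fetch_regions(in_path, out_path, email):
    """Fetch every region listed in `in_path` and write them all to one FASTA file.

    Needs Biopython and network access to NCBI.
    """
    # Imported here so the two helpers above can be used without Biopython.
    from Bio import Entrez, SeqIO

    Entrez.email = email  # NCBI asks for a contact address
    with open(in_path) as in_file, open(out_path, "w") as out_file:
        for seq_id, start, stop in parse_regions(in_file):
            handle = Entrez.efetch(db="nucleotide", id=seq_id, rettype="fasta",
                                   retmode="text", strand=1,
                                   seq_start=start, seq_stop=stop)
            SeqIO.write(SeqIO.parse(handle, "fasta"), out_file, "fasta")
            handle.close()
```

With an input file in the format above, a call would look like `fetch_regions("path/to/to/the/genelist", "example.txt", "v.v@biw.kuleuven.be")`. The `with` block closes both files even if one of the requests fails, which the explicit `close()` calls do not.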
