Entering edit mode
6.0 years ago
pmramos95
•
0
Hi everyone, this is my first post and I'm needing help for a project in a bioinformatics class I am in. What I am trying to do is pull influenza A H3N2 DNA sequences from an Entrez.esearch and determining which sequences contain the restriction enzymes EcORI, BamHI, and HindIII.
Here is my code so far:
from Bio import Entrez, SeqIO
Entrez.email = "email.here"
# let's do a search for influenza H1N1 viruses from Texas
handle = Entrez.esearch(db="nucleotide", # database to search
term="influenza a virus texas h3n2 2017 hemagglutinin complete cds", # search term
retmax=200 # number of results that are returned
)
record = Entrez.read(handle)
handle.close()
gi_list = record["IdList"] # list of genbank identifiers found
handle = Entrez.efetch(db="nucleotide", id=gi_list, rettype="gb", retmode="text")
data = handle.read()
handle.close() # close the handle
# Write data to a file
with open("influenza_HA.gb", "w") as outfile:
outfile.write(data)
import re
# Open the genbank file with flu sequences
in_handle = open("influenza_HA.gb", "r")
records = SeqIO.parse(in_handle, "genbank")
#Find restriction enzyme sequences for EcORI, BamHI, and HindIII
def count_re(seq, csv_file):
re_dict = {}
#Loop over
for record in records:
match_eco = re.search(r"GAATTC", str(record.seq))
match_bam = re.search(r"GGATCC", str(record.seq))
match_hin = re.search(r"AAGCTT", str(record.seq))
if match_eco:
eco_count = 1
else:
eco_count = 0
if match_bam:
bam_count = 1
else:
bam_count = 0
if match_hin:
hin_count = 1
else:
hin_count = 0
#open file for writing
with open(csv_file, "w") as file:
# sort re
re_list=sorted(re_dict.keys())
file.write("ID, EcORI, BamHI, HindIII")
# loop over sorted keys
for re in re_list:
# new line with values separated by ","
line = str(re) + ',' + str(re_dict[re]) + '\n'
# write to the file
file.write(line)
count_re(in_handle, "project3_re_1.csv")
Is there a specific question here? Is this program not working?
Since this is a class assignment (thank you for stating that up front) you should be specific about what you need help with.
I'm just needing help getting the information into a csv file. I would like the header of the excel file to read as something along the lines of:
I hope this helps. Sorry about the formatting its my first time posting to forums like this and I'm very inexperienced in programming.