5 hours ago
fra.r.silvestro
Dear bioinformaticians, I am new to Python programming and I have to answer this question: "What is the length of the longest ORF appearing in reading frame 2 of any of the sequences?" I have attached what I have so far, but I want to change the code so that it only considers ORFs appearing in reading frame 2.
#######################
from Bio import SeqIO

# Placeholder path: point this at your own FASTA file
fasta_path = "sequences.fasta"

# Build a dictionary mapping each record ID to its sequence
sequences = {record.id: record.seq for record in SeqIO.parse(fasta_path, "fasta")}

max_length_in_frame2 = 0
longest_orf_seq_in_frame2 = ""

for seq_id, sequence in sequences.items():
    # Reading frame 2 starts at the second nucleotide (index 1);
    # trim the tail so the length is a multiple of three before translating
    frame2 = sequence[1:]
    frame2 = frame2[: len(frame2) - len(frame2) % 3]
    protein_seq_frame2 = frame2.translate()
    # Split the protein sequence into candidate ORFs at stop codons (*)
    potential_orfs = protein_seq_frame2.split("*")
    for orf in potential_orfs:
        # Optional check that the ORF starts with a start codon (M);
        # if enabled, the lines below must be indented under it
        # if orf.startswith("M"):
        # Find the length of the ORF
        orf_length = len(orf)
        # Keep track of the longest ORF found in frame 2 so far
        if orf_length > max_length_in_frame2:
            max_length_in_frame2 = orf_length
            longest_orf_seq_in_frame2 = orf

print(f"The longest ORF in reading frame 2 is {max_length_in_frame2} amino acids long.")
print(f"The sequence is: {longest_orf_seq_in_frame2}")
#####################################
Thanks in advance, Francesco
Generally speaking, we are not here to provide a code review service or to fix code for what appears to be a school assignment. If you have a specific question, especially one not related to assignments, you are more likely to get some feedback. It would also be helpful to format the code properly.
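For the specific point about restricting the search to one reading frame, here is a minimal sketch in plain Python. It rests on assumptions that are not stated in the question: an ORF is taken to run from the first ATG through the next in-frame stop codon (TAA, TAG or TGA), its length is counted in nucleotides including the stop codon, and only the forward strand is scanned. The function name `longest_orf_in_frame` and the toy sequences are hypothetical, chosen only for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_in_frame(dna, frame=2):
    """Return the longest ORF (ATG through stop codon) in a 1-based forward frame.

    Returns an empty string if the frame contains no complete ORF.
    """
    dna = str(dna).upper()
    best = ""
    start = None  # index of the ATG that opened the ORF currently being read
    # Frame 2 begins at index 1, frame 3 at index 2; step one codon at a time
    for i in range(frame - 1, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if start is None:
            if codon == "ATG":
                start = i
        elif codon in STOP_CODONS:
            orf = dna[start:i + 3]  # include the stop codon
            if len(orf) > len(best):
                best = orf
            start = None
    return best


# Toy sequences standing in for parsed FASTA records
sequences = {"seq1": "CATGAAATAGG", "seq2": "AATGTGA"}
longest = {seq_id: longest_orf_in_frame(seq, frame=2) for seq_id, seq in sequences.items()}
best_id = max(longest, key=lambda seq_id: len(longest[seq_id]))
print(best_id, len(longest[best_id]), longest[best_id])
```

If your course defines an ORF differently (for example, excluding the stop codon or counting amino acids), adjust the slice and the length calculation accordingly.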