Question: Read DNA sequence from FASTA raising a subclass?
4 weeks ago by
Ionic_Bond0 wrote:

Hello everyone,

I am supposed to write a function that takes the name of a FASTA file as an argument. When passed the file name, the function should read the file, discard the header, and return the sequence as a string. I am also being asked to raise a predefined exception (a subclass? It is defined before my code) if the sequence part of the file contains any characters other than A, C, T, G, or U. In addition, all U nucleotides should be replaced by T in the returned string. I think I am on the right track, but I have no idea how to incorporate this subclass into my code when a letter is not one of A, C, T, G, U. I am working with a small file before defining the function, and this is what I have so far:

This is defined before my code:

# Run this cell to define the exception
class BadSequenceException(Exception):
    pass

# my code:

file = open("sequence1.fasta")
all_lines = file.readlines()

sequences = []

with open('sequence1.fasta', 'r') as seq:
    sequence = ''
    for line in seq:
        if line.startswith('>'):
            sequence = ''
            sequence += line.strip()

def check(sequence, code="ATGCU"):
    for x in sequence:
        if x not in code:
            return False
    return sequence.replace("U", "T")


I presume that the exception must be raised where the return False is? Also, BadSequenceException is a subclass of the class Exception and inherits all of its functionality, right? Any guidance on this would be very much appreciated. Thank you so much.

bioinformatics python fasta • 143 views
modified 4 weeks ago by genomax92k • written 4 weeks ago by Ionic_Bond0

Hi! Is this the script that you use? If so, the def check part should be moved to the top.

Also, if you're running check only after reading the entire file, I think you should instead run it on each line as you read it (before sequence += line.strip()).

modified 4 weeks ago • written 4 weeks ago by Fatima830

Thanks so much for your help. I will look at it carefully once I get back home after work :)

written 4 weeks ago by Ionic_Bond0

That was very helpful, thank you so much.

written 29 days ago by Ionic_Bond0

Indeed, instead of the return False you'd raise BadSequenceException(x + " is not a valid nucleobase") (or something like that).

In addition to that, do you really want to add the empty sequence (upon encountering the first sequence header >) to the set of sequences?
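
One way that change might look, as a minimal sketch: the class below is only a stand-in for the exception predefined in the exercise, and the function assumes the sequence is already in upper case.

```python
class BadSequenceException(Exception):
    """Stand-in for the exception class that the exercise defines beforehand."""
    pass


def check(sequence, code="ACGTU"):
    """Return the sequence with U replaced by T; raise on any other character."""
    for x in sequence:
        if x not in code:
            # Raising stops the function immediately, so no False is returned.
            raise BadSequenceException(x + " is not a valid nucleobase")
    return sequence.replace("U", "T")
```

Since BadSequenceException inherits from Exception, the caller can handle it with an ordinary try/except BadSequenceException block, and str() on the caught exception gives the message passed to it.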

written 4 weeks ago by cschu1812.5k

That makes sense and helped me a lot, thank you very much :)

written 29 days ago by Ionic_Bond0
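
Putting the suggestions from this thread together, the requested function could be sketched roughly as follows. The name read_fasta_sequence and the wording of the error message are made up for illustration, the exception class is a stand-in for the predefined one, and the sketch assumes the file holds a single record with upper-case sequence letters (any further header lines are simply skipped).

```python
class BadSequenceException(Exception):
    """Stand-in for the exception class that the exercise defines beforehand."""
    pass


VALID_BASES = set("ACGTU")


def read_fasta_sequence(filename):
    """Read a FASTA file, drop the header, and return the sequence with U -> T."""
    parts = []
    with open(filename) as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and the header line(s) starting with '>'.
            if not line or line.startswith(">"):
                continue
            # Validate each line as it is read, before adding it to the result.
            bad = set(line) - VALID_BASES
            if bad:
                raise BadSequenceException(
                    "invalid character(s): " + ", ".join(sorted(bad))
                )
            parts.append(line)
    return "".join(parts).replace("U", "T")
```

Collecting the lines in a list and joining them once at the end avoids repeated string concatenation; for a small file the difference is negligible, so sequence += line would work just as well.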