Question

Extracting genes from a DNA sequence using reading frames


5.5 years ago

abdullahqamer92 • 0

I want to write a Python program that reads a DNA sequence of more than 9,000 characters from a text file. I have to cut the sequence into groups of 3 characters, so the frame reads positions 1 to 3, then 4 to 6, then 7 to 9, and so on. These groups are called codons.

For example, the sequence is

ACCTGCCTCTTACGAGGCGACACTCCACCATGGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGA

I then have to cut it into groups of 3 characters, which I have already done. My question is: how can I extract the gene sequence from the given DNA? A gene sequence starts at ATG and ends at TAG, TAA, or TGA.

This would be easy with a regular expression. The problem is that in the sequence above, ATG occupies positions 30 to 32, while our frame reads positions 1 to 3, then 4 to 6, and so on. When the frame reaches positions 28 to 30 and then 31 to 33, neither codon is ATG.

Can anyone understand my problem and please help me? I'm sharing my code now:

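import re
from pathlib import Path

dna = Path('C:/Users/abdul/Downloads/Compressed/MAJU/HCV-PK1-sequence - edited.txt').read_text()
# Drop newlines so the sequence is one continuous string
seq = dna.replace('\n', '')
# Print the codons of the first reading frame (positions 1-3, 4-6, ...)
for x in range(0, len(seq), 3):
    print(seq[x:x + 3])
# Check the whole sequence for ATG, one or more codons, then a stop codon.
# The match may start at any position, so it is not tied to the first frame.
if re.search('ATG(...)+?(TAG|TAA|TGA)', seq):
    print("Yes")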
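
The frame problem described above can be checked directly on the example sequence. The following is a minimal sketch, not part of the original post; `split_codons` is a hypothetical helper name, and the offsets 0, 1 and 2 correspond to reading frames 1, 2 and 3.

```python
seq = "ACCTGCCTCTTACGAGGCGACACTCCACCATGGATCACTCCCCTGTGAGGAACTACTGTCTTCACGCAGA"

def split_codons(seq, offset=0):
    """Split seq into complete codons, starting at a 0-based offset (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]

start = seq.find("ATG")   # 0-based index of the first ATG
frame = start % 3         # offset of the frame that contains it

# Report, for each frame, whether ATG appears as a whole codon
for offset in range(3):
    print("frame", offset + 1, "contains ATG:", "ATG" in split_codons(seq, offset))
```

Position 30 in the post's 1-based counting is index 29 here, and 29 % 3 is 2, so the start codon only lines up when reading begins at the third base.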

DNA to GENES in Python Python 3x BioPython • 979 views



Not sure if you're aware of this, but you're looking for [open] reading frames. There are a lot of tools that already do this for you. Is there a reason you're trying to write a new tool?

Check out EMBOSS' getorf - that's one of the tools I've used to detect open reading frames.

5.5 years ago by Ram 43k


I know ExPASy does this for you, but I have to write the code myself because our teacher assigned it to us as a programming exercise. I have done the first part but I'm stuck on the second part. I'm trying my best, and I hope you can give me some ideas.

5.5 years ago by abdullahqamer92 • 0


I think you're stuck because you're only reading one frame of the sequence. If you were to do a little research (just Googling) on reading frames, you'd see the one piece of knowledge that will solve your problem.

5.5 years ago by Ram 43k
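
For readers following the hint about reading frames, here is one possible sketch of scanning all three forward frames for ATG-to-stop stretches. It is an illustration under stated assumptions, not the method anyone in the thread posted: `find_orfs` and `STOP_CODONS` are hypothetical names, only the forward strand is scanned (the reverse complement is ignored), and the test sequence `CCATGAAATAGGG` is made up.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq):
    """Return (frame, start, end, orf) tuples for the three forward frames.

    frame is 1, 2 or 3; start and end are 0-based with end exclusive;
    orf runs from ATG through the first in-frame stop codon.
    """
    seq = seq.upper()
    orfs = []
    for offset in range(3):
        start = None  # index of the ATG that opened the current ORF, if any
        for i in range(offset, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                orfs.append((offset + 1, start, i + 3, seq[start:i + 3]))
                start = None
    return orfs

# Made-up sequence: ATG sits at index 2, so it is only visible in frame 3
print(find_orfs("CCATGAAATAGGG"))  # [(3, 2, 11, 'ATGAAATAG')]
```

On the 70-base excerpt from the question this sketch returns an empty list, because no in-frame stop codon follows the ATG before the excerpt ends. The full sequence in the file may behave differently.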