Open Reading Frame And Codons
12.2 years ago

How can I find the ORFs in a sequence using Python? I also need to find all codons.

Thanks.

orf sequence python codon • 6.6k views
12.2 years ago
Nikolay Vyahhi ★ 1.3k

List of all codons (every overlapping triplet, so all three forward reading frames are covered):

genome = 'ACGTACGT....'
print([''.join(t) for t in zip(genome, genome[1:], genome[2:])])

Set of all codons:

genome = 'ACGTACGT....'
print(set(''.join(t) for t in zip(genome, genome[1:], genome[2:])))

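If your genome is large, iterate lazily instead of building the whole list. In Python 3 the built-in zip already returns an iterator (in Python 2, use itertools.izip instead of zip):

for triplet in zip(genome, genome[1:], genome[2:]):
    codon = ''.join(triplet)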
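As a minimal sketch of the same idea without the extra sliced copies of the sequence, a generator can yield one triplet at a time. The function name `iter_triplets` and the toy sequence are assumptions made for illustration, not part of the original answer.

```python
def iter_triplets(seq, step=1):
    """Yield 3-letter windows of seq one at a time.

    step=1 yields every overlapping triplet; step=3 yields the
    non-overlapping codons of the reading frame starting at index 0.
    """
    for i in range(0, len(seq) - 2, step):
        yield seq[i:i + 3]


genome = "ACGTACGT"  # toy sequence, assumed for illustration
overlapping = list(iter_triplets(genome))
frame0_codons = list(iter_triplets(genome, step=3))
```

With `step=1` this matches the zip-based triplets above; with `step=3` it gives the codons of a single reading frame, which is usually what is meant when translating.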

To find ORFs, it's better to use Biopython (see zev.kronenberg's link).
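For a rough idea of what an ORF search involves, here is a pure-Python sketch that scans all six reading frames. It is a simplification under stated assumptions: only ATG counts as a start codon, the stop codons are those of the standard genetic code, and the names `find_orfs` and `reverse_complement` are hypothetical helpers, not Biopython functions.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumption)
START_CODON = "ATG"                  # only ATG treated as a start (assumption)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=6):
    """Return (strand, frame, start, end, orf) for each ATG...stop ORF.

    start and end are 0-based, end-exclusive offsets on the scanned
    strand (for "-" they refer to the reverse complement).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START_CODON:
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    if i + 3 - start >= min_len:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


example_orfs = find_orfs("CCATGAAATAGCC")
```

For a multi-gigabyte file this in-memory scan would need to be applied per record or per chunk; Biopython remains the more robust route for real data.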



12.2 years ago

Thanks,

I worked with the FASTA file NC_005816.fna, following the steps in the Biopython tutorial (http://biopython.org/DIST/docs/tutorial/Tutorial.html#htoc224), but when I compare these ORFs with those obtained using the NCBI ORF Finder web tool (http://www.ncbi.nlm.nih.gov/projects/gorf/orfig.cgi), the results are different. Why?

I need to use Python because my sequence file is 4 GB, so I can't use the NCBI web tool.

Thanks.


This would be more appropriate as a comment. A possible reason for the difference: the two tools may be using different genetic codes.
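To illustrate how such settings can change the result, here is a small sketch that reports frame-0 ORF spans under different conventions. It assumes two stop-codon sets (the standard code, and NCBI translation table 4, in which TGA codes for tryptophan) and also compares a stop-to-stop convention against one that requires a start codon, which is another way two tools can disagree. The function `orf_spans` and the toy sequence are hypothetical.

```python
STANDARD_STOPS = {"TAA", "TAG", "TGA"}  # standard genetic code
TABLE4_STOPS = {"TAA", "TAG"}           # NCBI table 4: TGA is not a stop


def orf_spans(seq, stops, starts=None):
    """Return frame-0 (start, end) spans that end at a stop codon.

    starts=None : stop-to-stop convention, a span begins right after
                  the previous stop (or at position 0).
    starts=set  : a span begins at the first start codon seen.
    """
    require_start = starts is not None
    spans = []
    begin = None if require_start else 0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if codon in stops:
            if begin is not None and i > begin:
                spans.append((begin, i + 3))
            begin = None if require_start else i + 3
        elif require_start and begin is None and codon in starts:
            begin = i
    return spans


toy = "AAAATGCCCTGAATGGGGTAA"  # AAA ATG CCC TGA ATG GGG TAA (assumed example)
standard = orf_spans(toy, STANDARD_STOPS, starts={"ATG"})
table4 = orf_spans(toy, TABLE4_STOPS, starts={"ATG"})
stop_to_stop = orf_spans(toy, STANDARD_STOPS)
```

On this toy sequence the three settings give three different answers, so it is worth checking which translation table, start-codon rule, and minimum length each tool applies before comparing outputs.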
