I want to find every repeated codon in a DNA sequence using Python.
So, given an input sequence, the program should report each repeat type and how many times it occurs. It absolutely has to be done in Python.
Thanks, guys!
This looks like a homework assignment, is it? I think this has something to do with regular expressions, and Perl handles those well, too.
You don't need Perl to use regular expressions; Python supports them through its re module. Perl and Python are two very different languages.
Do you mean SSRs (simple sequence repeats)?
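If tandem repeats are what is meant (the same codon repeated back to back), a regular expression with a backreference is one way to find them in Python. The following is only a minimal sketch: the function name find_codon_repeats and the min_repeats parameter are made up for illustration, and it scans every position rather than a fixed reading frame, so the leftmost match wins when frames overlap.

```python
import re

def find_codon_repeats(sequence, min_repeats=2):
    """Return (start, codon, repeat_count) for each tandem codon repeat.

    Hypothetical helper for illustration; start positions are 0-based.
    """
    # ([ACGT]{3}) captures one codon; \1{n,} requires at least n more
    # back-to-back copies of that same codon.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence.upper()):
        codon = match.group(1)
        count = len(match.group(0)) // 3
        results.append((match.start(), codon, count))
    return results

print(find_codon_repeats("ttCAGCAGCAGtcATGATGc"))
# [(2, 'CAG', 3), (13, 'ATG', 2)]
```

Calling upper() first means the sketch accepts lowercase input as well.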
It's very easy to do in Python using the string count method:
from itertools import product
[sequence.count(''.join(codon)) for codon in product('ATCG', repeat=3)]
Or get the results in a dictionary (you can figure out how to get the max, etc. yourself):
dict(("".join(codon), sequence.upper().count(''.join(codon))) for codon in product("ATCG", repeat=3))
(edited after brentp's correct comment below)
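For anyone who wants the dictionary approach above as a complete script, here is a rough sketch. The function name codon_counts and the sample sequence are made up for illustration. Note that str.count counts non-overlapping occurrences anywhere in the string, so these are not counts within a fixed reading frame.

```python
from itertools import product

def codon_counts(sequence):
    """Map each of the 64 possible codons to its count in the sequence.

    Hypothetical helper for illustration; counts come from str.count,
    which ignores reading frame and skips overlapping matches.
    """
    sequence = sequence.upper()  # accept lowercase input too
    codons = ("".join(bases) for bases in product("ATCG", repeat=3))
    return {codon: sequence.count(codon) for codon in codons}

counts = codon_counts("atgatgcagatg")  # made-up example sequence
most_common = max(counts, key=counts.get)
print(most_common, counts[most_common])
# ATG 3
```

Using max with key=counts.get returns the codon with the highest count; sorted(counts.items(), key=lambda kv: kv[1], reverse=True) would give the full ranking instead.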
+1. Minor nitpick, to get the results in a dictionary, you'd use:
dict(("".join(codon), seq.count(''.join(codon))) for codon in itertools.product("ATCG", repeat=3))
otherwise, you get a list of dictionaries of length 1.
Nice solution; with the caveat that it should be 'atcg' to work for the example sequence given in the question.
You are correct, Neil. That's the risk of having a Python session already open with some sequence data loaded. To be safe, you can also use sequence.upper().count(...), which works on both lowercase and uppercase data.
Should we really accept doing homework assignments?
A simple invitation to use:
would have given 'wang' all the basic tools to start this, including:
And he would have learned to use the help() function and the built-in documentation, which are fundamental Python tools.
Now he just has a copy-and-paste answer that has taught him nothing...