Question: New to Python. Need help writing a program comparing 2 DNA sequences.
5.2 years ago by
United States
Tdemarco10 wrote:

Hi all, I am new to this forum and very NEW to Python. I was given the task of finding all the 5-amino-acid sequences that are identical between two given DNA sequences. Like I said, I am very new to this. I have a copy of Python for Biologists and have read the chapters we were supposed to read, but I have no idea where to begin. Any help or direction would be greatly appreciated.

python • 4.6k views
modified 5.2 years ago • written 5.2 years ago by Tdemarco10

Hi -- you may appreciate …, which has some practical example Python problems and solutions :)

written 5.2 years ago by Nancy Ouyang170

You need to know how to open the files and read their contents. Then use a for loop to compare the lines of the two files and collect the ones they have in common. Once this is done...

Read the basics of object-oriented programming and follow a Biopython tutorial. You will learn how to read FASTA files. Then compare the sequences in the two files.
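As a rough sketch of the file-reading step, here is a minimal FASTA reader that uses only the standard library instead of Biopython. The function name `read_fasta` is made up for this illustration, and the sketch assumes plain-text FASTA files in which every record starts with a `>` header line.

```python
def read_fasta(path):
    """Return a dict mapping each FASTA header (without '>') to its sequence.

    Assumption: the file is plain text and every record starts with a
    '>' header line; sequence lines may be wrapped over several lines.
    """
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                header = line[1:]
                records[header] = []
            elif header is not None:
                records[header].append(line)
    # Join the wrapped lines of each record into one string.
    return {name: "".join(parts) for name, parts in records.items()}
```

With two hypothetical input files, `read_fasta("seq1.fasta")` and `read_fasta("seq2.fasta")` would each give a dictionary whose values are the sequences to compare.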

written 5.2 years ago by geek_y11k

Thank you all. Doing some reading now and trying to make sense of it. This is a basic bioinformatics course. We haven't really USED Python in class yet (we have done other things).

It is hard to read material and be expected to write programs without practice. Thanks for directing me to the right places.

written 5.2 years ago by Tdemarco10

Actually, after downloading the files I see they are already in amino acid (AA) code. I just have to find how many 5-mers are identical between the two sequences.
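One possible way to count them, as a minimal sketch: collect every overlapping 5-mer of each sequence into a set and intersect the two sets. The function names and the toy sequences below are made up for illustration, and the sketch counts distinct shared 5-mers, which is only one reading of "how many are identical".

```python
def kmers(seq, k=5):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def shared_kmers(seq_a, seq_b, k=5):
    """Return the k-mers that occur in both sequences."""
    return kmers(seq_a, k) & kmers(seq_b, k)


# Toy amino acid sequences, not taken from the course files.
protein_a = "MAIVMGRKGAR"
protein_b = "TTAIVMGRWKGAR"

common = shared_kmers(protein_a, protein_b)
print(len(common), sorted(common))  # 2 ['AIVMG', 'IVMGR']
```

Sets make each comparison a fast lookup, so this scales better than comparing every window of one sequence against every window of the other with nested loops.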

written 5.2 years ago by Tdemarco10
5.2 years ago by
Felix Francis500
United States/University of Delaware
Felix Francis500 wrote:

1) Translate the nucleotide (DNA) sequences (simple Biopython solution).

    >>> from Bio.Seq import Seq
    >>> # Bio.Alphabet was removed in Biopython 1.78; a plain Seq is enough there
    >>> coding_dna = Seq("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
    >>> coding_dna.translate()
    Seq('MAIVMGR*KGAR*')

2) Use pattern matching to identify shared amino acid k-mers:

You could modify the following for that:
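The two steps above can also be sketched without Biopython. This is an illustration under stated assumptions, not code from this answer: it translates reading frame 1 only with the standard genetic code, treats the stop symbol `*` like any other character, and uses helper names (`translate`, `kmer_positions`, `shared_kmer_positions`) invented for this example.

```python
from collections import defaultdict

# Standard genetic code in the usual compact layout (bases ordered T, C, A, G).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate frame 1 of a DNA string; unknown codons become 'X'.

    A trailing partial codon is ignored.
    """
    dna = dna.upper()
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def kmer_positions(seq, k):
    """Map each k-mer in seq to the list of 0-based positions where it starts."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return positions


def shared_kmer_positions(seq_a, seq_b, k=5):
    """Return {k-mer: (positions in seq_a, positions in seq_b)} for shared k-mers."""
    pos_a = kmer_positions(seq_a, k)
    pos_b = kmer_positions(seq_b, k)
    return {kmer: (pos_a[kmer], pos_b[kmer]) for kmer in pos_a.keys() & pos_b.keys()}


# The first DNA string is the one from step 1; the second is a made-up example.
protein_a = translate("ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG")
protein_b = translate("GCCATTGTAATGGGCAAA")
print(shared_kmer_positions(protein_a, protein_b))  # {'AIVMG': ([1], [0])}
```

Keeping the start positions, rather than only the set of shared k-mers, makes it easy to check each hit by eye in the original sequences.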

written 5.2 years ago by Felix Francis500