Codon Alignment Via Python?
7.7 years ago
a1ultima ▴ 780

I have pairs of coding DNA sequences that I wish to align pairwise at the codon level using Python. I have "half completed" the process.

So far..

  • I retrieve pairs of orthologous DNA sequences from GenBank using the Biopython package.
  • I translate the orthologous pairs into peptide sequences and then align them using EMBOSS Needle program.

I wish to..

  • Transfer the gaps from the peptide sequences into the original DNA sequences.

Question

I would appreciate suggestions for programs/code (called from Python) that can transfer gaps from aligned peptide sequence pairs onto codons of the corresponding nucleotide sequence pairs. Or programs/code that can carry out the pairwise codon alignment from scratch.


biopython python alignment codon • 4.9k views

For some insights into coding a codon alignment, check http://zruanweb.com/


Don't reinvent the wheel unless you simply like creating wheels :) Use http://translatorx.co.uk/ instead. If you still want to do this in Python, paste some of your code and we will make it work.


This sounds like a simple loop over each aligned protein sequence: for every residue, write the next three letters from the DNA, and for every gap, write three gaps. This assumes your DNA is in frame.


Cheers, in the end I did something like that and posted it as an answer to my own question.


Are your DNA sequences already in the correct frame?
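A quick way to check this before transferring gaps is sketched below. The helper name `is_in_frame` is hypothetical (it is not from Biopython or EMBOSS), and the sketch assumes the peptide was translated from the full DNA sequence with no extra stop codon or untranslated flanking bases:

```python
def is_in_frame(aligned_peptide, dna, gap="-"):
    """Return True if dna splits into exactly one codon per non-gap residue.

    Hypothetical helper: assumes the peptide covers the whole DNA sequence.
    """
    residues = len(aligned_peptide) - aligned_peptide.count(gap)
    return len(dna) % 3 == 0 and len(dna) // 3 == residues


print(is_in_frame("M-MM", "ATGATGATG"))  # True: 3 residues, 3 codons
print(is_in_frame("M-MM", "ATGATGAT"))   # False: length is not a multiple of 3
```

If the check fails, the gaps cannot be mapped onto codons one-to-one, so the sequence probably needs trimming or re-translating first.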

7.7 years ago
a1ultima ▴ 780

In the end I made my own Python function that transfers gaps ('-') from the peptide sequence to the nucleotide sequence (codons).

It takes an aligned peptide sequence with gaps and the corresponding un-aligned nucleotide sequence and gives an aligned nucleotide sequence:

Function

def gapsFromPeptide(peptide_seq, nucleotide_seq):
    """Transfer gaps from an aligned peptide seq into a codon-partitioned nucleotide seq (codon alignment).
          - peptide_seq is an aligned peptide sequence with gaps that need to be transferred to nucleotide seq
          - nucleotide_seq is an un-aligned DNA sequence whose codons translate to peptide_seq"""
    def chunks(l, n):
        """Yield successive n-sized chunks from l."""
        for i in range(0, len(l), n):  # range instead of xrange, so this also runs on Python 3
            yield l[i:i+n]
    codons = list(chunks(nucleotide_seq, 3))  # split nucleotides into codons (triplets)
    gappedCodons = []
    codonCount = 0
    for aa in peptide_seq:  # add a '---' gap to the nucleotide seq for each gap in the peptide
        if aa != '-':
            gappedCodons.append(codons[codonCount])
            codonCount += 1
        else:
            gappedCodons.append('---')
    return ''.join(gappedCodons)

Usage

>>> unaligned_dna_seq = 'ATGATGATG'
>>> aligned_peptide_seq = 'M-MM'
>>> aligned_dna_seq = gapsFromPeptide(aligned_peptide_seq, unaligned_dna_seq)
>>> print(aligned_dna_seq)

    ATG---ATGATG
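Since the question is about pairs of sequences, here is a minimal sketch of applying the same idea to both members of a pair. The names `transfer_gaps` and `codon_align_pair` are made up for illustration, and the example sequences are toy data. It assumes both peptides come from the same pairwise alignment (so they have equal length) and that each DNA sequence is in frame:

```python
def transfer_gaps(aligned_peptide, dna, gap="-"):
    """Insert '---' into dna wherever aligned_peptide has a gap."""
    codons = [dna[i:i + 3] for i in range(0, len(dna), 3)]
    out, k = [], 0
    for aa in aligned_peptide:
        if aa == gap:
            out.append(gap * 3)
        else:
            out.append(codons[k])  # next unused codon
            k += 1
    return "".join(out)


def codon_align_pair(pep_a, pep_b, dna_a, dna_b):
    """Codon-align two DNA sequences from their aligned peptides."""
    if len(pep_a) != len(pep_b):
        raise ValueError("aligned peptides must have the same length")
    return transfer_gaps(pep_a, dna_a), transfer_gaps(pep_b, dna_b)


aligned_a, aligned_b = codon_align_pair("MK-L", "M-RL", "ATGAAACTG", "ATGCGTCTT")
print(aligned_a)  # ATGAAA---CTG
print(aligned_b)  # ATG---CGTCTT
```

Both outputs have the same length, which is a cheap sanity check that the gaps were transferred consistently.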

+1 for sharing it
