Question: Output EMBOSS Needle consensus sequence Biopython/Python 2.7
dmenning (United States) wrote, 3.6 years ago:

I have a working Python script for EMBOSS Needle that takes its input from two files and outputs the expected pairwise alignment.

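    from Bio.Emboss.Applications import NeedleCommandline
    from Bio import AlignIO

    # Build the needle command line for the forward and reverse reads.
    needle_cline = NeedleCommandline(asequence="2for.fasta", bsequence="2rev.fasta",
                                     gapopen=10, gapextend=0.5, outfile="0needle.fasta")
    print(needle_cline)

    # Append the exact command to a log file; the file is closed automatically.
    with open("0needleout.txt", "a") as needle_fname:
        needle_fname.write('\n' + str(needle_cline))

    # Run needle and show whatever it printed.
    stdout, stderr = needle_cline()
    print(stdout + stderr)

    # The outfile holds an EMBOSS alignment despite its .fasta name,
    # so it is parsed with the "emboss" format.
    align = AlignIO.read("0needle.fasta", "emboss")
    print(align)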

However, instead of the current output, which shows both sequences and where they overlap, I would like to output a single complete consensus sequence that includes the non-overlapping ends.
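To make the idea concrete, here is a minimal sketch of the column-wise merge I have in mind. It is plain Python and makes several assumptions: the two rows are equal-length gapped strings, a gap in one row is filled with the base from the other row (this also applies to internal gaps, which may not be what you want for real indels), and mismatches are written as `N`. The function name `merge_pair` and the toy sequences are hypothetical.

```python
def merge_pair(seq_a, seq_b, gap="-", ambiguous="N"):
    """Merge two aligned, equal-length gapped strings column by column."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    merged = []
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue                  # gap in both rows: nothing to emit
        elif a == gap:
            merged.append(b)          # only the second row covers this column
        elif b == gap:
            merged.append(a)          # only the first row covers this column
        elif a.upper() == b.upper():
            merged.append(a)          # both rows agree
        else:
            merged.append(ambiguous)  # assumption: disagreements become N
    return "".join(merged)


# Toy rows shaped like the real alignment: each read overhangs at one end.
uaf = "----GTAGTATAGCATCTG"
uar = "TCCCGTAGTATAG------"
print(merge_pair(uaf, uar))  # TCCCGTAGTATAGCATCTG
```

With the alignment read by `AlignIO.read`, the two rows would presumably be `str(align[0].seq)` and `str(align[1].seq)`. EMBOSS also ships a `merger` program for joining two overlapping sequences, which may be worth checking before rolling your own.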

Current output:

    UAF        0 --------------------------------------------------      0
    UAR        1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC     50

    UAF        0 --------------------------------------------------      0
    UAR       51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG    100

    UAF        1 ------------GTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC     38
    UAR      101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC    150

    UAF       39 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT     88
    UAR      151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT    200

    UAF       89 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    138
    UAR      201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    250

    UAF      139 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    188
    UAR      251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    300

    UAF      189 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    238
    UAR      301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    350

    UAF      239 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    288
    UAR      351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    400

    UAF      289 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    338
    UAR      401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    450

    UAF      339 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    388
    UAR      451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    500

    UAF      389 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    438
    UAR      501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    550

    UAF      439 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG    477
    UAR      551 GTGTGGGGGTTTCTATGTTGAAACTATACCTG-------    582

 

Desired output, ideally in .fasta format:

    UAC        1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC     50
    UAC       51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG    100
    UAC      101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC    150
    UAC      151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT    200
    UAC      201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    250
    UAC      251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    300
    UAC      301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    350
    UAC      351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    400
    UAC      401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    450
    UAC      451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    500
    UAC      501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    550
    UAC      551 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG    589
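For the FASTA part, this is roughly the record layout I am after, sketched in plain Python. The record name `UAC` is taken from the desired output above. The helper name `to_fasta`, the 50-column wrap, and the short placeholder sequence are assumptions for illustration only.

```python
def to_fasta(name, sequence, width=50):
    """Format one sequence as a FASTA record wrapped at `width` columns."""
    lines = [">" + name]
    for start in range(0, len(sequence), width):
        lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


# Placeholder sequence (120 bases), not real data.
consensus = "ACGT" * 30
record = to_fasta("UAC", consensus)
print(record)
```

If the consensus is wrapped in a Biopython `SeqRecord`, `SeqIO.write(record, handle, "fasta")` should produce an equivalent file without a hand-written formatter.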

 

Any suggestions?
