Question: File format conversion to clustal using biopython
18 months ago, mdsiddra wrote:

I am using Biopython to convert my aligned sequence file from one format to another with the "AlignIO.write" function. I would like to know two things:

  1. When I convert a sequence file from PHYLIP format to Clustal format, the resulting file is labelled CLUSTAL X (1.81), whereas I want the output labelled CLUSTAL W or CLUSTAL 2.1 (a higher version of Clustal).
  2. The resulting Clustal file does not include the line of conservation symbols ('*', '.', ':') below the aligned amino acid sequences.

This is the file I am currently getting:

CLUSTAL X (1.81) multiple sequence alignment


Canis           REWSSARPERSKGRRKPVDAAAVSAVQTSQTSSDVAVSSSCRSMEMQDLTSPHSRLSGSS
Mus             -------------------------------------------MEMQDLTSPHSRLSGSS
Rattus          -------------------------------------------MEMQDLTSPHSRLSGSS

And this is the format I want, including the line of symbols indicating the similarity of the residues:

Canis           REWSSARPERSKGRRKPVDAAAVSAVQTSQTSSDVAVSSSCRSMEMQDLTSPHSRLSGSS
Mus             -------------------------------------------MEMQDLTSPHSRLSGSS
Rattus          -------------------------------------------MEMQDLTSPHSRLSGSS
                                                           *****************

Can I do this with Biopython, or do I have to use some other method or function?


Hello mdsiddra,

could you please also provide an example of the phylip input file?

fin swimmer

Reply written 18 months ago by finswimmer

Yes, this is what the PHYLIP file looks like:

14 327
Zebrafish  LELQGEESDL DFRLSLNGKE DLLDTGQSLS SCGVVSGDLI SVILPASLEE
Fugu       LELQGEEAET EISLSLNGSE PLEDTGQTLA SCGIVSGDLI RVALIRALMA
Chicken    LELEGAESDT EFSITLNGKD ALTEDEKTLA SYGIVPGDLI CLLLEEDLPP
Zebra      SMTEGNRSDT AFSVTLNRKD ALTEDQKTLA SYGIVSGDLI CLLLEEDLPP

           TQSSAAAHGG SHHVQEDQVD QQQECVDLQQ DDQQQQQEQV CAAAPPLLCC
           ADPDRADDGG GHAVAMNQVS QEAKLPDASG ADSDQAPGPA ASCWEPMLCS
           PSSSPPSLLT PKRQNEQVDS RAGSSLEFPS GPEDVDLEEG SYPSEPMLCS
           PPATPAPLLT PNGQNEQVDE RAGSSLEFPS GPEDADLEEG SYPSEPMLCS

Reply written 18 months ago by mdsiddra

Hi,

I think seqret from EMBOSS can do the job. You can set the output format to Clustal.

Reply written 18 months ago by Sishuo Wang

I would rather not do it that way. Since I am working with Python/Biopython code, I would like to do this in source code.

Reply written 18 months ago by mdsiddra

What version of Biopython are you on?

Reply written 18 months ago by Joe

Python 3.6 and Biopython 1.72

Reply written 18 months ago by mdsiddra (modified 18 months ago)
Answer by Joe (United Kingdom), 18 months ago:

According to the documentation, Biopython does not yet support writing to Clustal 2 formats.

You can try scripting it yourself, or simply realign with Clustal and output the format directly.
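
For the "script it yourself" route, here is a minimal sketch in plain Python, not a Biopython feature. It builds the symbol line and writes a Clustal-style text with whatever header you choose. The function names (conservation_symbol, consensus_line, format_clustal) are made up for this example, the residue groups are the ClustalW/X "strong" and "weak" groups as I recall them from the Clustal documentation, and the 16-column name field simply mimics the example in the question. Check the output against a real Clustal run before relying on it.

```python
# Residue groups used for ':' (strong) and '.' (weak) conservation.
# Assumption: these follow the ClustalW/X documentation; verify before use.
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK",
                 "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def conservation_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column (a string)."""
    residues = set(column.upper())
    if "-" in residues or "." in residues:
        return " "  # columns containing a gap are left unmarked
    if len(residues) == 1:
        return "*"  # fully identical column
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "


def consensus_line(sequences):
    """Build the symbol line for a list of equal-length aligned strings."""
    return "".join(conservation_symbol("".join(col)) for col in zip(*sequences))


def format_clustal(records, header="CLUSTAL 2.1 multiple sequence alignment",
                   width=60):
    """Format (name, aligned_sequence) pairs as Clustal-style text."""
    if not records:
        raise ValueError("no sequences given")
    sequences = [seq for _, seq in records]
    if len({len(seq) for seq in sequences}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    marks = consensus_line(sequences)
    # Name field: 16 columns as in the question, wider if a name needs it.
    pad = max(16, max(len(name) for name, _ in records) + 1)
    lines = [header, "", ""]
    for start in range(0, len(marks), width):
        for name, seq in records:
            lines.append(name.ljust(pad) + seq[start:start + width])
        lines.append(" " * pad + marks[start:start + width])
        lines.append("")
    return "\n".join(lines) + "\n"
```

To use it with Biopython, you could read the PHYLIP file with AlignIO.read("input.phy", "phylip") (the filename is a placeholder), build the pairs as [(rec.id, str(rec.seq)) for rec in alignment], and write the string returned by format_clustal to a file. Note that the header text is only a label here; the function does not reproduce any version-specific behaviour of Clustal itself.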


Alright, thank you for the response.

Reply written 18 months ago by mdsiddra