Question: File format conversion to clustal using biopython
8 months ago, mdsiddra20 wrote:

I am using Biopython to convert an aligned sequence file from one format to another with the "AlignIO.write" function. There are two things I would like to know:

  1. When I convert a sequence file from PHYLIP format to Clustal format, the resulting file is labelled CLUSTAL X (1.81), whereas I want output in CLUSTAL W / CLUSTAL 2.1 format or similar (a newer version of Clustal).
  2. The resulting Clustal file does not include the line of symbols ('*', ':', '.') under the alignment that marks how conserved the residues in each column are.

This is the file I am getting:

CLUSTAL X (1.81) multiple sequence alignment


Canis           REWSSARPERSKGRRKPVDAAAVSAVQTSQTSSDVAVSSSCRSMEMQDLTSPHSRLSGSS
Mus             -------------------------------------------MEMQDLTSPHSRLSGSS
Rattus          -------------------------------------------MEMQDLTSPHSRLSGSS

This is the format I want, including the line of symbols indicating the similarity of the residues:

Canis           REWSSARPERSKGRRKPVDAAAVSAVQTSQTSSDVAVSSSCRSMEMQDLTSPHSRLSGSS
Mus             -------------------------------------------MEMQDLTSPHSRLSGSS
Rattus          -------------------------------------------MEMQDLTSPHSRLSGSS
                                                           *****************

Can I do this with Biopython, or do I have to use some other method or function?


Hello mdsiddra,

could you please also provide an example of the phylip input file?

fin swimmer

written 8 months ago by finswimmer11k

Yes, this is what the PHYLIP file looks like:

14 327
Zebrafish  LELQGEESDL DFRLSLNGKE DLLDTGQSLS SCGVVSGDLI SVILPASLEE
Fugu       LELQGEEAET EISLSLNGSE PLEDTGQTLA SCGIVSGDLI RVALIRALMA
Chicken    LELEGAESDT EFSITLNGKD ALTEDEKTLA SYGIVPGDLI CLLLEEDLPP
Zebra      SMTEGNRSDT AFSVTLNRKD ALTEDQKTLA SYGIVSGDLI CLLLEEDLPP

           TQSSAAAHGG SHHVQEDQVD QQQECVDLQQ DDQQQQQEQV CAAAPPLLCC
           ADPDRADDGG GHAVAMNQVS QEAKLPDASG ADSDQAPGPA ASCWEPMLCS
           PSSSPPSLLT PKRQNEQVDS RAGSSLEFPS GPEDVDLEEG SYPSEPMLCS
           PPATPAPLLT PNGQNEQVDE RAGSSLEFPS GPEDADLEEG SYPSEPMLCS

written 8 months ago by mdsiddra20

Hi,

I think seqret from EMBOSS can do the job. You can set the output format to clustal.

written 8 months ago by Sishuo Wang170

I would rather not do it that way. Since I am working in Python/Biopython, I would like to do this in code.

written 8 months ago by mdsiddra20

What version of Biopython are you on?

written 8 months ago by jrj.healey12k

Python 3.6 and biopython 1.72

modified 8 months ago • written 8 months ago by mdsiddra20
8 months ago, jrj.healey12k (United Kingdom) wrote:

According to the documentation, Biopython does not yet support writing to Clustal 2 formats.

You can try scripting it yourself, or simply realign with Clustal and write the output in that format directly.
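
If you go the scripting route, something along these lines might be a starting point. It is only a rough sketch in plain Python, not Biopython's own writer: the function names `conservation_symbol` and `format_clustal` are made up for illustration, the residue groups for ':' and '.' are quoted from memory of the Clustal documentation and should be checked, and the 16-character name column and the header text are assumptions taken from the example in the question.

```python
# Residue groups behind the ':' (strong) and '.' (weak) symbols.
# Assumption: quoted from memory of the Clustal docs; verify before relying on them.
STRONG_GROUPS = ("STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW")
WEAK_GROUPS = ("CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY")


def conservation_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column (a string)."""
    residues = set(column.upper())
    if "-" in residues:
        return " "  # any gap in the column: no symbol
    if len(residues) == 1:
        return "*"  # fully conserved
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."
    return " "


def format_clustal(sequences, header="CLUSTAL 2.1 multiple sequence alignment",
                   width=60, name_width=16):
    """Format a dict of name -> aligned sequence as Clustal-style text.

    The header string and name_width are hypothetical defaults for this sketch.
    """
    names = list(sequences)
    if not names:
        raise ValueError("no sequences given")
    length = len(sequences[names[0]])
    if any(len(seq) != length for seq in sequences.values()):
        raise ValueError("all aligned sequences must have the same length")
    # Widen the name column if a name would otherwise touch the sequence.
    name_width = max(name_width, max(len(name) for name in names) + 1)
    symbols = "".join(conservation_symbol("".join(col))
                      for col in zip(*sequences.values()))
    lines = [header, "", ""]
    for start in range(0, length, width):
        for name in names:
            lines.append(name.ljust(name_width) + sequences[name][start:start + width])
        lines.append(" " * name_width + symbols[start:start + width])
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # With Biopython the dict could be built from an alignment, e.g.
    # {rec.id: str(rec.seq) for rec in AlignIO.read("in.phy", "phylip")}
    # (not exercised here).
    example = {
        "Canis": "REWSSARPERSKGRRKPVDAAAVSAVQTSQTSSDVAVSSSCRSMEMQDLTSPHSRLSGSS",
        "Mus": "-" * 43 + "MEMQDLTSPHSRLSGSS",
        "Rattus": "-" * 43 + "MEMQDLTSPHSRLSGSS",
    }
    print(format_clustal(example))
```

Columns containing a gap get no symbol here, which matches the desired output in the question. Whether downstream tools accept the result as a genuine Clustal 2.1 file is something you would need to test.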


Alright, thank you for the response.

written 8 months ago by mdsiddra20