Question: Ignore Error "Multiple Sequences Found With Same Name" In Clustalw
david (reputation 0) wrote, 4.8 years ago:

Hi,

I have a Python program that generates a clustalw2 alignment of about 500 sequences from a FASTA file. Each sequence name consists of the organism plus the substrate specificity of that sequence, so quite a few of the names are identical. As a result I get the error message "Error: Multiple sequences found with same name" and no alignment is generated. Is it possible to ignore this error without having to change all the sequence names?

Cheers, David

clustalw biopython • 2.1k views
a.zielezinski (reputation 7.3k) wrote, 4.8 years ago:

ClustalW/X requires every sequence name to be unique before it will run an alignment.

I would name your 500 sequences as numbers from 0 to 499 and store the original names in a dictionary or a list.

For example:

d = {0: 'Organism1Substrate', 1: 'Organism1Substrate', ..., 499: 'Organism2Substrate'}

or:

l = ['Organism1Substrate', 'Organism1Substrate', 'Organism1Substrate', ...]

Once you have performed the alignment, just replace the numbers with the original names.
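The renaming round trip described above could be sketched roughly as follows. This is a minimal, standard-library-only illustration, not the answerer's own code: the function names (`read_fasta`, `rename_records`, `restore_names`, `write_fasta`) and the `prefix` parameter are hypothetical, and it assumes the alignment comes back in FASTA format with the placeholder IDs unchanged as headers.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def rename_records(records, prefix=""):
    """Give every record a unique ID; return the new records and an ID -> name map."""
    renamed, id_to_name = [], {}
    for index, (name, seq) in enumerate(records):
        new_id = f"{prefix}{index}"  # 0, 1, 2, ... (or s0, s1, ... with prefix="s")
        id_to_name[new_id] = name
        renamed.append((new_id, seq))
    return renamed, id_to_name


def restore_names(records, id_to_name):
    """Swap the placeholder IDs back for the original (possibly duplicated) names."""
    return [(id_to_name[new_id], seq) for new_id, seq in records]


def write_fasta(records):
    """Serialise (name, sequence) pairs as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


# Toy input with two identical names, as in the question.
fasta = """>Organism1Substrate
ACGT
>Organism1Substrate
ACGA
"""
records = list(read_fasta(fasta.splitlines()))
renamed, id_to_name = rename_records(records)
unique_fasta = write_fasta(renamed)            # feed this text to the aligner
restored = restore_names(renamed, id_to_name)  # apply to the aligned records afterwards
```

In practice `restore_names` would be applied to the records parsed from the aligner's output rather than to `renamed` directly. Passing a non-empty `prefix` is an option if a downstream tool dislikes purely numeric names.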


+1 for this. In the past I have just grepped the names and appended numbers or extra information to make them unique, but I like this idea better.

written 4.8 years ago by Josh Herr (reputation 5.5k)

Agreed. Many phylogenetic programs have problems handling fancy sequence names. The worst case is the PHYLIP format (used by RAxML etc.), which allows only 10 characters per name. So I always rename the sequences as "s1", "s2", "s3"... I don't recommend using 1, 2, 3... because some programs cannot handle numerical sequence names.

written 4.8 years ago by qiyunzhu (reputation 420)
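The naming pitfalls mentioned in this thread (duplicates, names longer than 10 characters, purely numeric names) can be checked before running any tool. The helper below is only a sketch: `unsafe_names` and its `max_len` default are hypothetical and simply encode the rules of thumb from the comments above, not the documented limits of any particular program.

```python
def unsafe_names(names, max_len=10):
    """Return (name, reasons) pairs for names that may trip up alignment tools."""
    seen = set()
    flagged = []
    for name in names:
        reasons = []
        if name in seen:
            reasons.append("duplicate")
        seen.add(name)
        if len(name) > max_len:
            reasons.append("too long")
        if name.isdigit():
            reasons.append("numeric")
        if reasons:
            flagged.append((name, reasons))
    return flagged


# "s1"-style IDs pass; a bare number, a long name, and a repeat are flagged.
report = unsafe_names(["s1", "s2", "1", "Organism1Substrate", "s1"])
```

An empty result means the names satisfy all three checks.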