Question: Aligner That Preserves Case?
5.0 years ago by
Fwip480
United States
Fwip480 wrote:

Summary:

I'm looking for an aligner that preserves the case (lower/upper) of the input sequences. So far I've tried ClustalW, MUSCLE, MAFFT, and T-Coffee, and all of them return upper-case output, though I could be missing a switch for one of them.

Reasoning and background:

I'm writing a quick script to find regions of interest, align the coding sequence they fall into, and output a nicely formatted text file.

My thought was to show the short regions in capitals and the remaining sequence in lower-case. I've created the sequences with the correct upper/lower-case characters, but when I run them through ClustalW, they come out all upper-case.

I'd prefer an option that has a ready-made BioPerl module (as the rest of my script is in Perl), but command-line-only options are also okay.

Sample script (but the same happens from the command-line):

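use strict;
use warnings;
use Bio::Tools::Run::Alignment::Clustalw;
use Bio::AlignIO;

# Align the sequences in test.fsa with ClustalW (default parameters).
my $aligner = Bio::Tools::Run::Alignment::Clustalw->new;
my $alignment = $aligner->align('test.fsa');

# Write the alignment to STDOUT in PHYLIP format.
my $out = Bio::AlignIO->newFh(-format => 'phylip');
print $out $alignment;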

Sample input file:

>test
atgaaaaagaattttattgggaaatcaattttaagcatagctgctattagtttaacggta
tcaacatttgccggtgaatctcatgcacaaactaaggCTGAAAAATATAACGAGTatc
>test_2
atgaaaaaGAATTTATTGGGAAATCaattttaagcatagctgctattagtttaacggtat
caacatttgccggtgaatctcatgcacaaactaaggctgaaaaatataacgagtatca

Output of script (no lowercase):

 2 119
test         ATGAAAAAGA ATTTTATTGG GAAATCAATT TTAAGCATAG CTGCTATTAG TTTAACGGTA 
test_2       ATGAAAAAGA ATTT-ATTGG GAAATCAATT TTAAGCATAG CTGCTATTAG TTTAACGGTA 

             TCAACATTTG CCGGTGAATC TCATGCACAA ACTAAGGCTG AAAAATATAA CGAGTATC- 
             TCAACATTTG CCGGTGAATC TCATGCACAA ACTAAGGCTG AAAAATATAA CGAGTATCA
Tags: aligner bioperl msa

Why not just take the output and use a script to format it again?

written 5.0 years ago by Whetting1.5k

I can do that, but gaps may be inserted during the alignment, and I would have to track them and adjust for them. (Some of my real sample data, which is too long to post here, gets gaps both before and inside the "capital" region.) I've done it before, but the code is ugly and fragile.

Since people sometimes use case to encode masking data, I would have expected at least one of the popular aligners to have an option to preserve it.

modified 5.0 years ago • written 5.0 years ago by Fwip480
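
For anyone who does end up re-applying case after the fact, the gap bookkeeping can be kept fairly small: walk the aligned row and pull each non-gap residue from the original, ungapped sequence. The following is only a minimal Perl sketch. It assumes gaps are written as - or . and that the aligner changed nothing except case; restore_case is a hypothetical helper name, not part of BioPerl or any aligner.

```perl
use strict;
use warnings;

# Copy the letter case of an ungapped original sequence onto its aligned
# (gapped, possibly all-upper-case) version.
# Hypothetical helper for illustration; not a BioPerl function.
sub restore_case {
    my ($original, $aligned) = @_;
    my @orig = split //, $original;
    my $i    = 0;                   # position in the ungapped original
    my $out  = '';
    for my $ch (split //, $aligned) {
        if ($ch eq '-' || $ch eq '.') {
            $out .= $ch;            # gap columns pass through untouched
            next;
        }
        die "aligned sequence is longer than the original\n"
            if $i > $#orig;
        die "residue mismatch at ungapped position $i\n"
            if lc($ch) ne lc($orig[$i]);
        $out .= $orig[$i++];        # take the residue with its original case
    }
    die "aligned sequence is shorter than the original\n"
        if $i != @orig;
    return $out;
}

# Start of test_2 from the question and the start of its aligned row
# (block spacing removed).
print restore_case('atgaaaaaGAATTTATTGG', 'ATGAAAAAGAATTT-ATTGG'), "\n";
```

With the test_2 prefix from the question this prints atgaaaaaGAATTT-ATTGG. The block spacing in PHYLIP output would have to be stripped before calling it, and the mismatch checks are there so that any reordering or trimming by the aligner fails loudly instead of silently shifting the capitals.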
5.0 years ago by
Whetting1.5k
Bethesda, MD
Whetting1.5k wrote:

Check MAFFT. According to its manual (http://mafft.cbrc.jp/alignment/software/anysymbol.html), it can preserve case.

written 5.0 years ago by Whetting1.5k

Thank you! This looks like it will work perfectly with the --preservecase option.

written 5.0 years ago by Fwip480
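
Since the rest of the script in the question is Perl, one way to use this option is to call the mafft executable directly and capture its output. This is only a rough sketch: it assumes mafft is installed and on PATH, the sub names mafft_command and run_mafft are made up for illustration, and the only MAFFT option used is the --preservecase flag mentioned above.

```perl
use strict;
use warnings;

# Build the MAFFT command line as a list (no shell quoting needed).
# --preservecase is the option named in this thread; the sub names and
# file name are illustrative assumptions.
sub mafft_command {
    my ($infile) = @_;
    return ('mafft', '--preservecase', $infile);
}

# Run MAFFT and return whatever it writes to STDOUT as one string.
# Assumes a `mafft` executable is available on PATH.
sub run_mafft {
    my ($infile) = @_;
    my @cmd = mafft_command($infile);
    open(my $fh, '-|', @cmd)
        or die "cannot start mafft: $!\n";
    local $/;                       # slurp mode: read all output at once
    my $aligned = <$fh>;
    close($fh)
        or die "mafft exited with status $?\n";
    return $aligned;
}

# Example call (commented out so the sketch loads without MAFFT):
# print run_mafft('test.fsa');
```

The equivalent on the command line would be mafft --preservecase test.fsa > aligned.fsa, where aligned.fsa is just an example output name.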