Given a Bio::SimpleAlign object, what is the best way to get per-column conservation scores, e.g. as an array of values in [0, 1] whose length equals $align->length? I can't find anything like this in Bio::SimpleAlign. I'm looking for a method that would allow:
    use Bio::AlignIO;

    my $io    = Bio::AlignIO->new(-file => $file);
    my $align = $io->next_aln;
    my @cons  = $align->percentage_identity_by_column();  # <- does this exist?
    print "@cons";  # 0.75 1.0 1.0 1.0 0.64 ....
Or should I just concatenate the gapped sequences, use substr() to extract the characters, count them with a hash, and return the frequency of the most frequent character per column?
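For reference, here is a minimal sketch of that manual approach. The helper name column_conservation is my own and not part of BioPerl. It assumes all rows have the same gapped length, treats upper and lower case as the same residue, and counts gap characters like any other character, which may not be what you want for gappy columns.

```perl
use strict;
use warnings;

# Hypothetical helper (not a BioPerl method): takes gapped, equal-length
# sequence strings and returns, for each column, the fraction of rows that
# hold that column's most frequent character.
sub column_conservation {
    my @rows = @_;
    return () unless @rows;

    my $len = length $rows[0];
    my @cons;
    for my $col (0 .. $len - 1) {
        my %count;
        # Tally each character in this column, ignoring case.
        $count{ uc substr($_, $col, 1) }++ for @rows;
        # Highest count in the column divided by the number of rows.
        my ($max) = sort { $b <=> $a } values %count;
        push @cons, $max / @rows;
    }
    return @cons;
}

# Toy alignment of four rows and four columns.
my @cons = column_conservation(qw(ACGT ACGA AC-T TCGT));
print "@cons\n";    # prints: 0.75 1 0.75 0.75
```

With a real alignment the rows could be taken from the object, e.g. column_conservation(map { $_->seq } $align->each_seq).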
It looks like the private method Bio::SimpleAlign::_consensus_aa() already does most of this, but it returns the character rather than the fraction, which is what I am looking for. Short of submitting a patch for that, is there a better approach?