There are several bioinformatics tools available to annotate protein sequence variations, such as insertions and substitutions, and to predict their functional impact in bacterial proteins. One suitable tool is PROVEAN (Protein Variation Effect Analyzer), which predicts whether an amino acid substitution or insertion/deletion affects the biological function of a protein. It works for any organism, including bacteria, and uses a reference protein sequence along with a list of variants to compute scores indicating deleterious or neutral effects.
To use PROVEAN, you can access its web server at http://provean.jcvi.org/index.php. Prepare your input as a FASTA-formatted reference sequence and a variant file listing one change per line (e.g., substitutions as "A123V" or insertions as "A123_A124insG"). For each variant the tool reports a score together with the number of supporting homologous sequences it was computed from; with the default cutoff, scores at or below -2.5 are called deleterious and scores above it neutral.
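Because a variant list is easy to get out of sync with the reference (off-by-one positions, wrong reference residue), it can help to sanity-check it before submitting. The following is a minimal sketch, not part of PROVEAN itself: the function name `check_variant` and the regular expressions are my own, and the deletion pattern assumes the same HGVS-style notation as the substitution and insertion examples above.

```python
import re

# Patterns for HGVS-style protein variants (assumed notation, see lead-in).
SUB_RE = re.compile(r"^([A-Z])(\d+)([A-Z])$")                   # e.g. A123V
INS_RE = re.compile(r"^([A-Z])(\d+)_([A-Z])(\d+)ins([A-Z]+)$")  # e.g. A123_A124insG
DEL_RE = re.compile(r"^([A-Z])(\d+)(?:_([A-Z])(\d+))?del$")     # e.g. K10del, G8_P10del


def check_variant(seq, variant):
    """Return True if the reference residues named in `variant` match `seq`.

    Positions are 1-based, as in the variant strings themselves.
    """
    def residue_ok(aa, pos):
        pos = int(pos)
        return 1 <= pos <= len(seq) and seq[pos - 1] == aa

    m = SUB_RE.match(variant)
    if m:
        ref, pos, alt = m.groups()
        return residue_ok(ref, pos) and ref != alt

    m = INS_RE.match(variant)
    if m:
        aa1, p1, aa2, p2, _inserted = m.groups()
        # An insertion must sit between two adjacent reference residues.
        return int(p2) == int(p1) + 1 and residue_ok(aa1, p1) and residue_ok(aa2, p2)

    m = DEL_RE.match(variant)
    if m:
        aa1, p1, aa2, p2 = m.groups()
        if aa2 is None:
            return residue_ok(aa1, p1)
        return int(p1) < int(p2) and residue_ok(aa1, p1) and residue_ok(aa2, p2)

    return False  # unrecognised notation


if __name__ == "__main__":
    reference = "MKTAYIAKQR"  # toy sequence, for illustration only
    for v in ["A4V", "A5V", "T3_A4insG", "K8del"]:
        print(v, check_variant(reference, v))
```

Any variant that fails the check is worth fixing before it goes into the variant file, since a mismatch usually means the numbering is relative to a different reference.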
Another tool is CAPRIB (Comparative Analyses of Proteins In Bacteria), designed specifically for analyzing amino acid changes in large bacterial protein datasets. It identifies variations, computes selection pressures, and annotates potential functional impacts by comparing sequences across strains or locations. CAPRIB is available as a web application at https://caprib.tau.ac.il/, where you upload your protein sequences in FASTA format for analysis.
Before using these tools, align your protein sequences with a multiple sequence alignment program such as MUSCLE to pinpoint the exact variations. With MUSCLE v3, the command line is:
muscle -in input_sequences.fasta -out aligned_sequences.fasta
MUSCLE v5 renamed these options, so the equivalent call there is:
muscle -align input_sequences.fasta -output aligned_sequences.fasta
The alignment lets you catalog insertions and substitutions relative to a reference sequence.
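The cataloging step can be scripted once you have the aligned FASTA. Below is a rough sketch under a few assumptions: the helper names `read_fasta` and `call_variants` are mine, gaps are written as "-", and the output strings follow the HGVS-style notation used in the PROVEAN examples. Insertions before the first or after the last reference residue have no two flanking residues, so they get ad hoc "Nterm_ins"/"Cterm_ins" labels that are not standard notation. A deletion immediately followed by an insertion is reported as two separate events rather than merged.

```python
def read_fasta(path):
    """Parse a FASTA file into {name: sequence}, keeping gap characters."""
    records, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                name = line[1:].split()[0]
                records[name] = []
            elif name is not None:
                records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def call_variants(ref_aln, qry_aln):
    """List changes in `qry_aln` relative to `ref_aln`.

    Both arguments are rows of the same alignment (equal length, '-' = gap).
    Positions are 1-based and counted on the ungapped reference.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns that are gaps in both rows (common in rows of a larger MSA).
    cols = [(r, q) for r, q in zip(ref_aln.upper(), qry_aln.upper())
            if not (r == "-" and q == "-")]
    variants = []
    pos, prev_res = 0, None  # last reference position consumed and its residue
    i = 0
    while i < len(cols):
        r, q = cols[i]
        if r == "-":  # run of columns present only in the query: insertion
            j = i
            while j < len(cols) and cols[j][0] == "-":
                j += 1
            inserted = "".join(c[1] for c in cols[i:j])
            if prev_res is None:
                variants.append(f"Nterm_ins{inserted}")  # ad hoc label
            elif j == len(cols):
                variants.append(f"Cterm_ins{inserted}")  # ad hoc label
            else:
                variants.append(f"{prev_res}{pos}_{cols[j][0]}{pos + 1}ins{inserted}")
            i = j
        elif q == "-":  # run of columns missing from the query: deletion
            j = i
            while j < len(cols) and cols[j][1] == "-":
                j += 1
            start, end = pos + 1, pos + (j - i)
            first, last = cols[i][0], cols[j - 1][0]
            variants.append(f"{first}{start}del" if start == end
                            else f"{first}{start}_{last}{end}del")
            pos, prev_res = end, last
            i = j
        else:  # aligned residues: match or substitution
            pos, prev_res = pos + 1, r
            if r != q:
                variants.append(f"{r}{pos}{q}")
            i += 1
    return variants


if __name__ == "__main__":
    # Toy aligned rows; with real data use e.g. read_fasta("aligned_sequences.fasta").
    print(call_variants("MKT-AYIAK", "MRTGAY-AK"))
```

On the toy rows in the demo this yields one substitution, one insertion and one deletion (K2R, T3_A4insG, I6del). With real data, pick the reference row by its FASTA identifier and call `call_variants` once per remaining sequence.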
Since you specified biochemical changes, note that these tools predict functional impact from sequence; they do not model structure.
Kevin
Can you be more specific about what kind of changes you are interested in? For example, are you mainly interested in biochemical changes or in structural changes?
For the latter, you could use one of the many tools that predict secondary structure from sequence to infer differences. You could also use AlphaFold2/3 to model the structures and cluster/compare them with something like Foldseek.
Biochemical changes
Have a look at tools like SnpEff or Ensembl VEP. Perhaps your bacterial species is included.