Question: How To Get Conservation Of Vaccine Candidates Between Hundreds Of Bacterial Strains?
Hi community,

We have identified vaccine candidates via an ex vivo RNA-seq approach. The next step is to assess the conservation of these candidates (about 20) across multiple bacterial strains (about 200). I would like to hear your suggestions on how to perform this step.

From my bioinformatics experience, a first rough screening could be done using blastp or blastx against a multi-strain database, followed by more accurate approaches to validate these results.

Please let me know your suggestions and possible references you might consider helpful to answer this question.

Thanks for your help. Best regards, Bernardo

There may be better methods, but what you list seems like a good start. Once you have identified similar regions, you can perform a multiple sequence alignment on those regions.
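To make the rough screening step concrete, here is a minimal Python sketch that summarises blastp hits per candidate. It assumes the standard 12-column tabular output (`-outfmt 6`) and, purely as an assumption for illustration, that each subject ID in your database is labelled `strain|protein`. The function name `conservation_from_blast` and the identity and coverage thresholds are hypothetical choices, not recommended values.

```python
from collections import defaultdict


def conservation_from_blast(lines, query_lengths, n_strains,
                            min_identity=80.0, min_coverage=0.8):
    """Return, for each candidate, the fraction of strains with a qualifying hit.

    lines         : iterable of BLAST tabular rows (12 default -outfmt 6 columns)
    query_lengths : dict mapping candidate ID -> protein length in residues
    n_strains     : total number of strains in the database
    """
    strains_hit = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        qseqid, sseqid = fields[0], fields[1]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        # Fraction of the candidate covered by this single alignment.
        coverage = (qend - qstart + 1) / query_lengths[qseqid]
        if pident >= min_identity and coverage >= min_coverage:
            # Assumption: subject IDs look like "strain|protein".
            strain = sseqid.split("|", 1)[0]
            strains_hit[qseqid].add(strain)
    return {q: len(strains_hit[q]) / n_strains for q in query_lengths}


if __name__ == "__main__":
    rows = [
        "cand1\tstrainA|p1\t95.0\t100\t5\t0\t1\t100\t1\t100\t1e-50\t200",
        "cand1\tstrainB|p9\t60.0\t100\t40\t0\t1\t100\t1\t100\t1e-10\t80",
    ]
    print(conservation_from_blast(rows, {"cand1": 100}, n_strains=2))
```

Candidates with a low fraction are either absent or highly divergent in some strains, and those would be the ones to inspect more closely in a multiple sequence alignment.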

Do you also have whole-genome sequences for the isolates? If so, you could first do a core genome analysis to determine the orthologous genes present in all the bacterial isolates. Ideally, a candidate gene for a protein-based vaccine should be in the 'core genome', i.e. present in all the bacterial isolates, so that protection by the vaccine is not strain-specific. To do this you could use a number of tools, e.g. OrthoMCL, MCL, ProteinOrtho, COGsoft and many others. Further analyses could then determine other required characteristics of the candidate genes, e.g. genetic diversity (less diverse genes would make better candidates, I think).
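Here is a small Python sketch of the two checks described above: whether an ortholog group is in the core genome, and how diverse its members are. It assumes you have already parsed the output of your orthology tool into a dictionary mapping each group to the set of strains it occurs in, and that the sequences of a group have been aligned. The function names and the `min_fraction` parameter are hypothetical, and mean pairwise identity is only one simple diversity measure among several you could use.

```python
from itertools import combinations


def core_groups(group_to_strains, all_strains, min_fraction=1.0):
    """Return ortholog groups present in at least min_fraction of the strains."""
    strain_set = set(all_strains)
    return sorted(
        group
        for group, strains in group_to_strains.items()
        if len(set(strains) & strain_set) / len(strain_set) >= min_fraction
    )


def pairwise_identity(a, b):
    """Identity between two aligned sequences, ignoring columns with a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        matches += x == y
    return matches / compared if compared else 0.0


def mean_pairwise_identity(aligned):
    """Average identity over all sequence pairs; higher means less diverse."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two aligned sequences")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    groups = {"og1": {"A", "B", "C"}, "og2": {"A", "C"}}
    print(core_groups(groups, ["A", "B", "C"]))
    print(mean_pairwise_identity(["MKTA", "MKTA", "MKSA"]))
```

Relaxing `min_fraction` slightly below 1.0 can be useful when some genomes are draft assemblies, since a gene may be missing from an assembly without being missing from the strain.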

