Hi Everyone,
I signed up to this forum to ask this question, since I did some searching and did not find an answer; hopefully this is the appropriate forum for such a question. Please excuse my lack of expertise; I am not a bioinformatician by training.
Background: I have 12 amino acid sequences (all variants of the same protein), each 20-40 residues long. I need to find a configuration of these sequences that has the highest likelihood of inducing B-cell immunity, i.e. the highest possible antigenicity score. I will then clone that sequence into a viral vector, which adds the practical constraint that the sequence cannot be too long. There are some rules I must follow which reduce the number of possible combinations, but not by much. My questions are as follows:
1) In your collective experience, which bioinformatics tools are best for B-cell epitope prediction? I have found many open-source tools which generate antigenicity scores (IEDB Immunogenicity Prediction, SVMTriP, SEPIa). Is one of them superior? Which parameters are most important for B-cell epitope prediction?
2) What I really want is a script which tries different configurations of the AA sequences until it arrives at the maximum antigenicity score for the minimum sequence length. Does something like this already exist, or do I have to write it myself? I wrote a function in Python which generates all possible combinations. I figure I could write something else which takes each of those combinations and passes it through whichever prediction tool is best. This might be harder to achieve in practice; I know, for example, that SEPIa requires a lot of manual work (going to different sites, copying and pasting results, etc.), and I am no expert programmer, so I am not sure how to automate that. Plus, I suspect that the solution to this problem could use any number of amino acids from any number of the 12 sequences, which means there are millions of potential combinations to score. My sense is that this should probably be written to run in parallel on a cluster.
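To make the idea concrete, here is a rough sketch of the kind of search loop I have in mind. It is only an illustration under several assumptions: `toy_score` is a made-up placeholder (the fraction of charged/polar residues) and is not a real B-cell epitope predictor, and the names `candidate_constructs`, `best_construct`, `max_fragments`, `max_length` and `length_penalty` are all hypothetical. It also only joins whole fragments in different orders and does not handle sub-fragments or my other rules.

```python
from itertools import permutations

# Placeholder only: residues counted as "hydrophilic" for the toy score.
# This is NOT a real antigenicity model; swap in the chosen predictor.
HYDROPHILIC = set("DEKRNQSTH")


def toy_score(seq):
    """Return the fraction of residues in HYDROPHILIC (stand-in score)."""
    if not seq:
        return 0.0
    return sum(aa in HYDROPHILIC for aa in seq) / len(seq)


def candidate_constructs(fragments, max_fragments, max_length, linker=""):
    """Yield (order, sequence) for each ordered selection of up to
    max_fragments fragments whose joined length fits within max_length."""
    for k in range(1, max_fragments + 1):
        for order in permutations(range(len(fragments)), k):
            seq = linker.join(fragments[i] for i in order)
            if len(seq) <= max_length:
                yield order, seq


def best_construct(fragments, max_fragments, max_length,
                   score=toy_score, length_penalty=0.0):
    """Return (value, order, sequence) for the best-scoring candidate,
    or None if no candidate fits. value = score - length_penalty * length."""
    best = None
    for order, seq in candidate_constructs(fragments, max_fragments, max_length):
        value = score(seq) - length_penalty * len(seq)
        if best is None or value > best[0]:
            best = (value, order, seq)
    return best


if __name__ == "__main__":
    # Made-up fragments, purely for demonstration.
    demo = ["AAAA", "DEKR", "GGGG"]
    print(best_construct(demo, max_fragments=2, max_length=8))
```

Each candidate is scored independently of the others, so I assume the scoring loop could be split across processes (for example with `multiprocessing.Pool`) or across cluster jobs, as long as the real predictor can be called from a script rather than through a web page.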
Sorry for being verbose; I just wanted to add as much detail as possible. I sincerely appreciate any feedback or thoughts. Have a great day! Jon