Using NGS technology, I recently detected thousands of SNPs and indels in a yeast strain for which we have proteomic data. I wrote software to generate a protein database from these results. However, I ran into several troubling cases, including disruption of start codons, stop codons, and intron/exon boundaries. For each case, I made my own judgement call and moved on, but I would like to compare my results with those of others. Is anyone aware of software that generates protein FASTA files from genomic data? I currently have a .vcf file but could probably convert it into other usable formats if necessary.
I have looked into several variant effect prediction tools, including PolyPhen-2, ANNOVAR (coding_change.pl), snpEff, and the Ensembl Variant Effect Predictor. However, these tools are more focused on predicting phenotypic effect than on simply generating a FASTA file. They might do what I am looking for, but if so I haven't figured out how to do it. I would appreciate any input or feedback on this subject.
You could then use these annotations to create protein sequences.
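As a rough illustration of that last step, here is a minimal Python sketch, not a description of how any of the tools mentioned above work. It assumes each variant has already been mapped to 1-based coordinates on the spliced coding sequence (CDS) of a single forward-strand gene, with VCF-style REF/ALT alleles. The function names (`apply_variants`, `translate`, `mutant_protein`, `fasta_record`) and the flag labels are made up for this example. Variants that hit intron/exon boundaries are not handled at all; those would still need a judgement call upstream.

```python
from itertools import product

# Standard genetic code (NCBI table 1), codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def apply_variants(cds, variants):
    """Apply (pos, ref, alt) variants to a CDS string.

    Assumption: pos is 1-based and relative to the CDS, and variants do not
    overlap. Edits are applied right to left so earlier positions stay valid.
    """
    seq = cds.upper()
    for pos, ref, alt in sorted(variants, reverse=True):
        start = pos - 1
        if seq[start:start + len(ref)] != ref:
            raise ValueError(f"REF mismatch at CDS position {pos}")
        seq = seq[:start] + alt + seq[start + len(ref):]
    return seq


def translate(cds):
    """Translate until the first stop codon.

    Returns (protein, stop_index); stop_index is None if no stop was found.
    """
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            return "".join(protein), i
        protein.append(aa)
    return "".join(protein), None


def mutant_protein(cds, variants):
    """Return the mutated protein plus flags for the awkward cases."""
    mutated = apply_variants(cds, variants)
    protein, stop_index = translate(mutated)
    flags = []
    if not mutated.startswith("ATG"):
        flags.append("start_lost")
    if stop_index is None:
        flags.append("stop_lost")
    elif stop_index + 3 < len(mutated) - len(mutated) % 3:
        flags.append("premature_stop")
    return protein, flags


def fasta_record(name, protein, flags, width=60):
    """Format one FASTA entry, recording the flags in the header line."""
    header = ">" + name + (" " + ",".join(flags) if flags else "")
    lines = [protein[i:i + width] for i in range(0, len(protein), width)]
    return "\n".join([header] + lines)


if __name__ == "__main__":
    cds = "ATGGCTTGGAAATAA"  # toy CDS, not real yeast sequence
    protein, flags = mutant_protein(cds, [(9, "G", "A")])
    print(fasta_record("toy_gene_mut", protein, flags))
```

The flags only record that something unusual happened. Whether a protein with a lost start codon should be dropped, or a stop-loss read through into the 3' UTR, is left to the caller, since those are exactly the cases where different tools may make different choices.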