Generating Protein Databases From SNP And Indel Information
12.6 years ago
Doug ▴ 20

Using NGS technology, I recently detected thousands of SNPs and indels in a yeast strain for which we have proteomic data. I wrote software to generate a protein database from these results. However, I ran into several troubling cases, including disruption of start codons, stop codons, and intron/exon boundaries. For each case I made my own judgement call and moved on, but I would like to compare my results to others'. Is anyone aware of software that generates protein FASTA files from genomic data? I currently have a .vcf file but could probably convert it into other usable formats if necessary.

I have looked into several variant effect prediction tools, including PolyPhen-2, ANNOVAR (coding_change.pl), snpEff, and the Ensembl Variant Effect Predictor. However, these tools are more focused on predicting phenotypic effect than on simply generating a FASTA file. They might do what I am looking for, but if so I haven't figured out how. I would appreciate any input or feedback on this subject.

proteomics vcf fasta • 3.0k views
12.4 years ago

Have you thought about annotating the variants using VAT, which is part of the VAAST suite? VCF->GVF->annotation is relatively easy.


You could then use these annotations to create protein sequences.
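
A minimal sketch of that last step, assuming a single-exon gene on the forward strand, variants already converted to 0-based offsets within the coding sequence, and the standard genetic code (yeast mitochondrial genes use a different table). The function names and the toy sequence are made up for illustration; this is not how VAT or any of the other tools mentioned in this thread works internally.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def apply_variants(cds, variants):
    """Apply (offset, ref, alt) edits to a coding sequence.

    Offsets are 0-based and relative to the CDS (an assumption of this
    sketch). Edits are applied right to left so earlier offsets stay valid.
    Overlapping variants are not handled.
    """
    seq = cds
    for pos, ref, alt in sorted(variants, reverse=True):
        if seq[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference mismatch at offset {pos}")
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq


def translate(cds):
    """Translate up to the first stop codon; return (protein, stop_found)."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3], "X")  # X for ambiguous codons
        if aa == "*":
            return "".join(protein), True
        protein.append(aa)
    return "".join(protein), False


def flag_events(alt_cds):
    """Report the troubling cases from the question for an edited CDS."""
    protein, stop_found = translate(alt_cds)
    flags = []
    if not alt_cds.startswith("ATG"):
        flags.append("start_lost")
    if not stop_found:
        flags.append("stop_lost")
    elif 3 * (len(protein) + 1) < len(alt_cds):
        # Stop codon reached before the end of the edited sequence.
        flags.append("premature_stop")
    return flags


def fasta_record(name, protein, width=60):
    """Format one protein as a FASTA record with wrapped lines."""
    lines = [protein[i:i + width] for i in range(0, len(protein), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


ref_cds = "ATGGCTGAATAA"                            # toy gene: M A E stop
alt_cds = apply_variants(ref_cds, [(6, "G", "T")])  # GAA -> TAA
print(flag_events(alt_cds))
print(fasta_record("toy_gene_variant", translate(alt_cds)[0]), end="")
```

What to do with a flagged gene (drop it, keep the reference protein, or read through to the next in-frame stop) is still a judgement call; the sketch only makes those cases visible.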

12.6 years ago

Because your genome encodes mostly non-spliced or single-exon protein-coding genes, I think that the analysis approach would be rather straightforward. Thus, what comes to mind is the analysis pipeline followed by those looking into pathogenic outbreaks such as EHEC/EAEC O104:H4 in Germany last summer. While the focus of that and similar studies was genome sequencing without proteomic data, they likely employed a rapid screen to identify protein-based differences between a standard, benign strain and the one (or several) isolated during the outbreak.
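
To make the idea of a rapid screen concrete, here is a rough sketch that compares two sets of protein sequences and reports which proteins differ. It assumes both sets are already loaded as dictionaries keyed by the same gene names; the function name and the report format are hypothetical, and I do not know what the outbreak studies actually used.

```python
def screen_differences(reference, isolate):
    """Return {gene: description} for proteins that differ between strains.

    Both arguments map gene name -> protein sequence (an assumption of
    this sketch). Same-length proteins are compared position by position;
    anything else is reported as a length change.
    """
    report = {}
    for name, ref_seq in reference.items():
        iso_seq = isolate.get(name)
        if iso_seq is None:
            report[name] = "missing in isolate"
        elif iso_seq == ref_seq:
            continue
        elif len(iso_seq) == len(ref_seq):
            # 1-based substitution notation, e.g. K2R
            subs = [
                f"{r}{i}{a}"
                for i, (r, a) in enumerate(zip(ref_seq, iso_seq), start=1)
                if r != a
            ]
            report[name] = ",".join(subs)
        else:
            report[name] = f"length {len(ref_seq)} -> {len(iso_seq)}"
    return report


reference = {"geneA": "MAE", "geneB": "MKK", "geneC": "MLL"}
isolate = {"geneA": "MAE", "geneB": "MRK", "geneC": "ML"}
print(screen_differences(reference, isolate))
```

Length changes would need a real alignment to interpret, which this sketch deliberately leaves out.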

This topic is not my forte. Just an idea that comes to mind.
