VEP: Generation of PolyPhen-2 & SIFT scores for structural variants.
10 months ago
The_PyPanda ▴ 10

Hello, has anybody generated PolyPhen-2 scores for structural variants reported by Manta using VEP? I have run VEP but no scores are reported, and there is no error in my output report. Of note, I am able to generate these scores for called somatic variants. I am therefore wondering whether we need to use a particular VEP function to analyze structural variants. Thank you.

Structural-Variant VEP Polyphen2 • 671 views
10 months ago
LauferVA 4.2k

Hello The_PyPanda ,

Per the documentation, PolyPhen-2 (Polymorphism Phenotyping v2) is a tool that predicts the possible impact of an amino acid substitution on the structure and function of a human protein using straightforward physical and comparative considerations.

Now, by convention, structural variants are most commonly defined as any sequence alteration involving 50 or more base pairs. Because a single amino acid is encoded by 3 nucleotides, even very small structural variants typically produce much larger-scale changes to a protein than a single amino acid substitution.
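To make the arithmetic concrete, here is a minimal Python sketch. The 50 bp cutoff is the convention mentioned above (individual callers may use different thresholds), and the function names are made up for illustration; they are not part of VEP or Manta.

```python
SV_MIN_LENGTH_BP = 50        # conventional cutoff; an assumption, tools may differ
NUCLEOTIDES_PER_CODON = 3


def is_structural_variant(svlen: int) -> bool:
    """True if the event length (e.g. a VCF SVLEN value, negative for deletions)
    meets the conventional 50 bp threshold."""
    return abs(svlen) >= SV_MIN_LENGTH_BP


def min_codons_overlapped(length_bp: int) -> int:
    """Lower bound on the number of codons that a contiguous stretch of
    coding sequence of this length overlaps (ceiling of length / 3)."""
    return -(-length_bp // NUCLEOTIDES_PER_CODON)


if __name__ == "__main__":
    print(is_structural_variant(-50))   # a 50 bp deletion
    print(min_codons_overlapped(50))    # codons touched by the smallest SV
    print(min_codons_overlapped(1))     # codons touched by an SNV
```

So even the smallest event that counts as an SV, if it falls entirely within coding sequence, overlaps at least 17 codons, versus a single codon for an SNV.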

As such, SIFT and PolyPhen-2 scores are typically calculated for SNVs/SNPs that cause a single amino acid substitution (missense variants), not for SVs.
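As a rough illustration of the point above, the following Python sketch checks whether a consequence string is one for which substitution scores would be expected. It assumes the '&'-joined Sequence Ontology terms found in the Consequence field of VEP's VCF (CSQ) output, and it assumes that scores are only attached to missense consequences; the helper name is hypothetical and not a VEP function.

```python
MISSENSE_TERM = "missense_variant"  # Sequence Ontology term for an amino acid substitution


def substitution_scores_expected(consequence: str) -> bool:
    """Return True if an '&'-joined consequence string includes a missense term.

    Assumption: SIFT/PolyPhen-2 predictions only apply to missense consequences,
    so SV-style terms such as transcript_ablation never qualify.
    """
    return MISSENSE_TERM in consequence.split("&")


if __name__ == "__main__":
    for csq in (
        "missense_variant&splice_region_variant",       # typical small-variant call
        "transcript_ablation",                          # typical deletion SV
        "coding_sequence_variant&feature_truncation",   # typical partial deletion SV
    ):
        print(csq, substitution_scores_expected(csq))
```

If none of the consequences on your Manta calls include a missense term, empty SIFT/PolyPhen-2 columns with no error message are the expected result rather than a sign of a misconfigured run.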

10 months ago
The_PyPanda ▴ 10

Thank you for your response.

I assumed the above too, but thought there might be a way to force these score calculations. Are you aware of any alternative structural variant scoring methods?


Regarding proteins specifically, most SVs that affect a protein-coding gene impact gene function quite strongly. Consider a single glycine-to-tryptophan substitution: that alone could impact protein function. An SV can go much further. Altering dozens to hundreds of amino acids, e.g. by losing an exon, can result in either gain or loss of function, depending on what is lost; duplicating an exon most commonly results in loss of function (LoF); and relocating a gene may abrogate its co-expression with certain other genes, with untoward effects. In other words, you'd expect a stronger effect than from an SNV, on average.

In a broader sense, genotype-to-phenotype correlation of SVs is an open problem in biology, recently made much more tractable by the advent of third-generation sequencing. Briefly, in the era of NGS/short-read sequencing, we were 'missing' a substantial proportion of many SV types. This made it difficult to comprehensively characterize SVs; after all, we couldn't characterize what we couldn't detect...

Now that third-generation sequencing technology (nanopore, SMRT) has improved SV ascertainment, more comprehensive and more accurate predictive models of the functional effects of SVs can be expected in the future. This will likely require the confluence of well-annotated phenotypic data on gapless, phased human genomes and improved in silico tools, e.g. deep learning to infer the likely effects of SVs that have never been seen before.
