Tool: Phevor: Phenotype Driven Variant Ontological Re-Ranking Tool
8.9 years ago

Phevor integrates phenotype, gene function, and disease information with personal genomic data for improved power to identify disease-causing alleles. Phevor works by combining knowledge resident in multiple biomedical ontologies with the outputs of variant prioritization tools. It does so using an algorithm that propagates information across and between ontologies. This process enables Phevor to accurately re-prioritize potentially damaging alleles identified by variant prioritization tools in light of gene function, disease, and phenotype knowledge.
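
To make the idea of propagating information across an ontology concrete, here is a minimal toy sketch. It is not Phevor's actual algorithm: the graph, the `decay` factor, the term names, and the helper names `propagate` and `rank_genes` are all hypothetical, chosen only to show how weight seeded at a phenotype term can spread to neighboring terms and then to the genes annotated to them.

```python
from collections import defaultdict, deque


def propagate(edges, seeds, decay=0.5):
    """Spread seed weight through a toy ontology graph.

    edges: iterable of (parent, child) term pairs (hypothetical terms)
    seeds: terms that match the patient's phenotype description
    decay: fraction of weight kept per edge crossed (assumed value)
    """
    if not 0 < decay <= 1:
        raise ValueError("decay must be in (0, 1]")

    # Treat the ontology as undirected so weight flows to parents and children.
    neighbors = defaultdict(set)
    for parent, child in edges:
        neighbors[parent].add(child)
        neighbors[child].add(parent)

    score = {}
    queue = deque((term, 1.0) for term in seeds)
    while queue:
        term, weight = queue.popleft()
        # Keep only the strongest weight that reaches each term.
        if score.get(term, 0.0) >= weight:
            continue
        score[term] = weight
        for nxt in neighbors[term]:
            queue.append((nxt, weight * decay))
    return score


def rank_genes(annotations, term_scores):
    """Sum term weights per gene and sort from highest to lowest score."""
    gene_scores = {
        gene: sum(term_scores.get(term, 0.0) for term in terms)
        for gene, terms in annotations.items()
    }
    return sorted(gene_scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical toy ontology and gene annotations, for illustration only.
toy_edges = [("A", "B"), ("A", "C"), ("C", "D")]
toy_scores = propagate(toy_edges, seeds=["B"])
toy_ranking = rank_genes({"gene1": ["D"], "gene2": ["A", "C"]}, toy_scores)
```

In this toy run the seed term `B` keeps full weight, and weight halves with each edge crossed, so a gene annotated only to a distant term still receives a small nonzero score.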

Phevor is especially useful for single exome and family trio-based diagnostic analyses, the most commonly occurring clinical scenarios, and ones for which existing personal-genomes diagnostic tools are most inaccurate and underpowered.

Importantly, Phevor is not limited to known diseases or known disease-causing alleles. Phevor can also use latent information in ontologies to discover genes and disease-causing alleles not previously associated with disease.

Case studies

Phevor Combines Multiple Biomedical Ontologies for Accurate Identification of Disease-Causing Alleles in Single Individuals and Small Nuclear Families. Am J Hum Genet. 2014 Apr 3;94(4):599-610.

  • A Disease-Gene Association for NFKB2 - Two families were affected by autosomal-dominant, early-onset hypogammaglobulinemia with variable autoimmune features and adrenal insufficiency. Four individuals from family A were sequenced, along with the affected individual in family B. A different NFKB2 mutation was identified in each family, and NFKB2 was the top-ranked candidate from the combined VAAST and Phevor analysis. NFKB2 had not previously been associated with this disease.

    This case demonstrates Phevor's ability to identify a human gene not currently associated with a disease or phenotype in the Human Phenotype Ontology, Disease Ontology, or Mammalian Phenotype Ontology.

  • An Atypical Phenotype Caused by a Dominant Allele of STAT1 - The patient was a 12-year-old male with severe diarrhea in the context of intestinal inflammation, total villous atrophy, and hypothyroidism. His clinical picture was life-threatening, warranting hematopoietic stem cell transplantation despite diagnostic uncertainty. He was diagnosed with X-linked immunodysregulation, polyendocrinopathy, and enteropathy, but targeted sequencing of genes associated with these conditions revealed no pathogenic variants. Exome sequencing was conducted for the patient and both parents. A combined VAAST-Phevor analysis ranked a de novo STAT1 mutation as the top disease-causing candidate.

    These results highlight Phevor's ability to use only a single affected exome to identify a mutation located in a known disease-associated gene and producing an atypical phenotype.

interpretation vaast Tool phevor prioritization • 4.6k views

Sounds great, where do I download the code? The website only has a login link. My institution and your federal government prevent the uploading of personal genomics data to a web-server. Human genetics researchers will need a downloadable package to run in-house.


Hi Karl, Phevor is a proprietary web-based program, so the code isn't available. We co-developed Phevor with the University of Utah, and we're currently deciding how we can integrate it into our toolset.

We're finalizing our licensing for Phevor, but we do provide academic licenses for the stand-alone command line VAAST variant prioritization tool.

As for the Phevor input, you can upload a variety of formats, including

  • a simple tab-separated file containing rank, HGNC symbol, and p-value
  • VCF files
  • VAAST files (.simple or .vaast), which contain a prioritized list of genes
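
As a rough illustration of the first format in the list above, the sketch below writes a tab-separated gene list with rank, HGNC symbol, and p-value columns. The column order follows the description in this post; the absence of a header row, the number formatting, the function name `write_gene_list`, and the p-values are assumptions for illustration, not a documented Phevor specification.

```python
import csv
import io


def write_gene_list(genes, handle):
    """Write (hgnc_symbol, p_value) pairs as rank<TAB>symbol<TAB>p-value lines.

    Genes are ranked by ascending p-value; ties are broken alphabetically.
    No header row is written (an assumption; check what the tool expects).
    """
    ranked = sorted(genes, key=lambda g: (g[1], g[0]))
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for rank, (symbol, p_value) in enumerate(ranked, start=1):
        writer.writerow([rank, symbol, f"{p_value:.3g}"])


# Placeholder p-values, invented for this example only.
example_genes = [("NFKB2", 0.002), ("TTN", 0.3), ("STAT1", 0.001)]
buffer = io.StringIO()
write_gene_list(example_genes, buffer)
gene_list_text = buffer.getvalue()
```

To write to disk instead of a string buffer, pass a file opened with `open(path, "w", newline="")` as `handle`.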

If you contact me at mfeaster[at]omicia[dot]com, we may be able to help you figure out if Phevor or one of our other tools would be useful for you.

Would you mind telling me what you're studying?


I'm glad to hear it will take gene lists, and maybe I can construct a VAAST file. I should ask about server system loading: is it rude to run a dozen(1,000) scans?

I work with human birth defects. So the VCF is like a fingerprint, theoretically identifiable and HIPAA not allowed offsite. We need to look through ~10K mutations per exome, particularly at poorly understood genes. If I could use a gene-list instead of a VCF, it would be sufficiently anonymized.
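
A minimal sketch of the gene-list idea, assuming variant-level results have already been scored locally: it keeps only the best p-value per gene and drops positions and genotypes. The record layout (`gene`, `pos`, `p_value` keys), the function name `collapse_to_genes`, and the example values are hypothetical, and whether the resulting list is de-identified enough is a policy question the code does not answer.

```python
def collapse_to_genes(variants):
    """Reduce variant-level records to one (gene, best_p_value) pair per gene.

    variants: iterable of dicts with 'gene' and 'p_value' keys
              (a hypothetical local export; any other keys are ignored).
    Returns pairs sorted by ascending p-value, then gene symbol.
    """
    best = {}
    for record in variants:
        gene, p_value = record["gene"], record["p_value"]
        # Keep only the smallest p-value seen for each gene.
        if gene not in best or p_value < best[gene]:
            best[gene] = p_value
    return sorted(best.items(), key=lambda kv: (kv[1], kv[0]))


# Invented records for illustration; positions are deliberately discarded.
example_variants = [
    {"gene": "GENE_A", "pos": 101, "p_value": 0.04},
    {"gene": "GENE_B", "pos": 202, "p_value": 0.01},
    {"gene": "GENE_A", "pos": 303, "p_value": 0.002},
]
gene_level = collapse_to_genes(example_variants)
```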

The several well-known birth defect genes are not of interest to our research, so it's all about poorly annotated membranes and transcription factors, or sometimes arcane structural proteins. If a gene is already annotated for the disease, it is tested up front, and the full exome process is not performed. By the time data gets to me, it's because no known causative gene is clearly damaged, and we need to sift the exome for interesting hits.

My fear is that some kind of epigenetic or non-coding RNA mechanism causes a good proportion of diseases, invalidating genomic and mRNA searches entirely. So when a bioinformatic or machine-learning system merges a bunch of databases to come up with an exciting conclusion, it could be just another false positive. I'm sure any human pedigree under study will contain novel haplotypes. I could pick those out with a k-mer toolkit, but they're irrelevant vs. known organogenesis pathways. And how do you handle segmental copy-number variation?

Do you have pathway integration? With Genmapp-Cytoscape you can run a table of Boolean variables vs mRNA IDs (like differential expression or mutation-carrying) and it will produce a series of diagrams detailing affected flowcharts from That's the best I've got right now: it's fifty clicks and a bunch of gene symbols nobody wants to read. Every organ seems to have cartilage development problems, so that's not actually informative, statistically.


Hi Karl, sorry for the late response...

is it rude to run a dozen(1,000) scans?

Because this is a first version of the interface for this algorithm, each run is entered manually.

So the VCF is like a fingerprint, theoretically identifiable and HIPAA not allowed offsite.

The current Phevor web program doesn't take in VCFs, just gene lists with scores and a phenotype. We are integrating Phevor into our HIPAA-compliant Opal platform, so VAAST and Phevor analysis can be performed seamlessly.

For now, you can import VAAST analysis output files into Phevor. VAAST scores and ranks each gene based on likelihood of causing disease. VAAST analysis files only contain gene name, VAAST score, and rank.

Have you had a look at the VAAST and Phevor papers? I'm not sure which tools you're currently using to assess which genes are most damaged, but it's likely that VAAST provides a very different way of looking at the data and can provide candidates and prioritization absent from your current approach.

You can download our papers here: Phevor | Opal (with integrated VAAST 1.0) | VAAST 2.0 (stand-alone)

We can go deeper into all your questions (pathway integration, etc.) offline... give me a shout at mfeaster[at]omicia[dot]com. We can help you get set up if you'd like to run through an analysis with VAAST and Phevor.

