Question: Is ClinVar the most comprehensive and reliable source of disease-causing genes and variants?
18 months ago by Biomed (4.6k), Bethesda, MD, USA, wrote:

I am interested in finding a comprehensive (SNV + CNV) and reliable list of all known disease-causing genes and, preferably, also the variants in those genes. ClinVar comes to mind as an obvious option, but is it really the best source out there, or should one combine the information in ClinVar with other data sets to get the most comprehensive set? Thanks.

written 18 months ago by Biomed (4.6k)

Not really. None of them is 100% accurate, and none restricts itself to clinically proven variants. The best I've seen is HGMD, which might be of use to you.

modified 18 months ago • written 18 months ago by RamRS (25k)

I was also going to mention HGMD. However, using recent versions of it requires a paid licence. Also, your question is premature... When one considers the fact that we simply don't know the exact role of the vast majority of genetic variants in relation to disease, there cannot yet exist a database that is all-encompassing in the sense that you appear to want.

written 18 months ago by Kevin Blighe (53k)

+1 on the "cannot know" comment. The Free version of HGMD lags the Pro version by 3-6 months, but the difference is not too large and can be addressed by scanning recent papers. That does depend on the number of genes under study, though. Whole-genome reconciliation would be near impossible.

written 18 months ago by RamRS (25k)

The question is about genes and variants that are known to be disease-causing, such as a nonsense variant in an ACMG 59 gene. I am not interested in genes and variants of unknown significance. I hope this clarifies the question. Thanks for the comment.
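As a rough illustration of what restricting to "known" pathogenic variants could look like in practice, here is a minimal Python sketch. It assumes a tab-delimited table with columns named `GeneSymbol`, `Name`, `ClinicalSignificance` and `ReviewStatus`, modelled on ClinVar's tab-delimited variant summary; check the column names and status strings against the file you actually download. The rows, the two-gene list and the accepted review levels are all made up for the example and are not a recommendation.

```python
import csv
import io

# Made-up rows for illustration only; these are NOT real ClinVar records.
SAMPLE_TSV = (
    "GeneSymbol\tName\tClinicalSignificance\tReviewStatus\n"
    "BRCA1\tvariant_A\tPathogenic\treviewed by expert panel\n"
    "BRCA1\tvariant_B\tUncertain significance\tcriteria provided, single submitter\n"
    "TP53\tvariant_C\tPathogenic\tno assertion criteria provided\n"
    "TP53\tvariant_D\tLikely pathogenic\tcriteria provided, multiple submitters, no conflicts\n"
    "GENE_X\tvariant_E\tPathogenic\tpractice guideline\n"
)

# Hypothetical gene list; a real analysis would load the full list of interest.
GENES_OF_INTEREST = {"BRCA1", "TP53"}

# Assumed thresholds: which assertions and review levels count as "known".
ACCEPTED_SIGNIFICANCE = {"Pathogenic", "Likely pathogenic"}
ACCEPTED_REVIEW = {
    "practice guideline",
    "reviewed by expert panel",
    "criteria provided, multiple submitters, no conflicts",
}


def select_known_pathogenic(tsv_text, genes):
    """Return names of variants in `genes` with an accepted assertion and review level."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row["Name"]
        for row in reader
        if row["GeneSymbol"] in genes
        and row["ClinicalSignificance"] in ACCEPTED_SIGNIFICANCE
        and row["ReviewStatus"] in ACCEPTED_REVIEW
    ]


selected = select_known_pathogenic(SAMPLE_TSV, GENES_OF_INTEREST)
print(selected)
```

How strict to make the review-status filter is a judgment call: a looser filter keeps more variants but admits assertions with weaker supporting evidence.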

ADD REPLYlink written 18 months ago by Biomed4.6k

In my experience, the pathogenicity of a mutation/variant is a dynamic, changing annotation. We do not truly know which variants are definitely damaging, but by current standards, agreement between ClinVar, PolyPhen2 and HGMD is the strongest evidence you will get that a variant is damaging.

It is easier to determine that a variant is not damaging, but much more difficult to say for sure it causes X or Y phenotype.
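One way to picture the "ClinVar + PolyPhen2 + HGMD" idea is as a concordance check: a variant counts as better supported the more sources flag it. The Python sketch below assumes each source has already been reduced to a set of variant identifiers it flags as damaging; the identifiers and set contents are invented for the example, and the equal weighting of sources is a simplification (PolyPhen2 is an in silico predictor, whereas the other two are curated databases).

```python
from collections import defaultdict

# Invented identifiers for illustration; not real database content.
flagged_by_source = {
    "ClinVar": {"var1", "var2", "var3"},
    "HGMD": {"var2", "var3", "var4"},
    "PolyPhen2": {"var3", "var4", "var5"},
}


def support_per_variant(flagged):
    """Map each variant to the set of sources that flag it."""
    support = defaultdict(set)
    for source, variants in flagged.items():
        for variant in variants:
            support[variant].add(source)
    return dict(support)


def concordant(flagged, min_sources):
    """Return the variants flagged by at least `min_sources` sources, sorted by name."""
    support = support_per_variant(flagged)
    return sorted(v for v, sources in support.items() if len(sources) >= min_sources)


all_three = concordant(flagged_by_source, 3)
at_least_two = concordant(flagged_by_source, 2)
print(all_three, at_least_two)
```

Requiring all sources to agree gives the shortest and most conservative list; lowering `min_sources` trades confidence for coverage.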

modified 18 months ago • written 18 months ago by RamRS (25k)

I agree again with Ram. This is very much a 'work in progress'. The ACMG have done a lot to attempt to define pathogenicity but I think that it's an impossible task, currently, because we don't have the information at our disposal such that we can say with 100% certainty that this or that variant will be pathogenic. Also, we can sequence germline DNA from B lymphocytes, for example, and discover a whole bunch of previously reported 'pathogenic' variants, but then these may have minimal relevance in our tissue of interest. Also, they may have the highest scores among the in silico prediction tools, but actually not prove 'damaging' at all. How do we even define 'damaging' and 'pathogenic' when most diseases / phenotypes are determined by very complex genetics that we do not yet understand? We are still even attempting to define how to correctly annotate splice isoforms, and we just don't yet have data on expression in different tissues.

Some well-studied cases are out there, though, such as germline variants in TP53 resulting in Li-Fraumeni syndrome, BRCA1 germline variants and variants upstream of CCND1 in breast cancer susceptibility, ORMDL3 and asthma susceptibility, etc. Even in many of these well-studied cases, though, we have not properly defined the disease mechanisms at play.

I write so much on this because I have a review coming out on this topic.

modified 18 months ago • written 18 months ago by Kevin Blighe (53k)