Question: What are the different ways of referring to a genetic variant/mutation?
3.9 years ago by
Sri Lanka
dk10 wrote:

There are different types of genetic variants, such as SNPs and indels (insertions/deletions).

In the ClinVar database, these variants are given a long name, for example NM_172201.1(KCNE2):c.79C>T (p.Arg27Cys). But elsewhere, such as in publications and clinical reports, this full name is not always used to refer to the variant. Can I take the gene name (KCNE2) and the coding-sequence change (c.79C>T) together? What are the other possible ways to refer to a variant? Can I use the protein-level change (p.Arg27Cys) as well? I need this to extract information about genetic variants from publications and clinical reports.

variants snp next-gen sequence gene • 1.2k views
3.9 years ago by
Devon Ryan91k
Freiburg, Germany
Devon Ryan91k wrote:

While it's convenient to keep track of the gene name, the transcript ID (e.g., NM_172201.1) is actually needed, since each gene may have multiple transcripts associated with it. The protein-level change is also nice to keep track of, though make sure you keep the associated protein ID (NP_751951.1 in this example), since otherwise you won't always know which isoform is changing.
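
To make the point about keeping the pieces together concrete, here is a minimal sketch that splits a ClinVar-style name into transcript ID, gene, cDNA change and protein change. The regular expression and the names `VariantName` and `parse_clinvar_name` are my own assumptions for illustration, not part of ClinVar or any HGVS library, and the pattern only covers names shaped like the example in the question.

```python
import re
from typing import NamedTuple, Optional


class VariantName(NamedTuple):
    transcript: str                 # versioned accession, e.g. NM_172201.1
    gene: Optional[str]             # gene symbol in parentheses, if present
    cdna_change: str                # coding-sequence change, e.g. c.79C>T
    protein_change: Optional[str]   # protein-level change, if present


# Hypothetical pattern for names shaped like
# "NM_172201.1(KCNE2):c.79C>T (p.Arg27Cys)". It is not a full HGVS grammar.
CLINVAR_NAME = re.compile(
    r"^(?P<transcript>[A-Z]{2}_\d+\.\d+)"      # accession with version
    r"(?:\((?P<gene>[^)]+)\))?"                # optional "(GENE)"
    r":(?P<cdna>c\.[^\s(]+)"                   # "c." description
    r"(?:\s*\((?P<protein>p\.[^)]+)\))?$"      # optional "(p....)"
)


def parse_clinvar_name(name: str) -> Optional[VariantName]:
    """Return the components of a ClinVar-style name, or None if it does not match."""
    m = CLINVAR_NAME.match(name.strip())
    if m is None:
        return None
    return VariantName(m["transcript"], m["gene"], m["cdna"], m["protein"])


parsed = parse_clinvar_name("NM_172201.1(KCNE2):c.79C>T (p.Arg27Cys)")
print(parsed)
```

A string such as "KCNE2 c.79C>T" returns None here on purpose: without a transcript ID the coordinate is ambiguous, so it is better to flag it than to guess.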

The actual literature is a total mess in this regard. People are supposed to follow the nomenclature guidelines, but they don't always, and the guidelines themselves are not always particularly clear. So this will be a somewhat painful process; make sure to manually spot-check some of the entries!
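
As a rough illustration of what extraction from messy text might look like, the sketch below scans free text for simple substitutions written as c.79C>T, p.Arg27Cys or the older one-letter form R27C, and normalizes the protein forms to three-letter notation. The patterns and the function name `find_variant_mentions` are assumptions of mine; they only handle single-nucleotide and single-amino-acid substitutions, and the one-letter pattern will also hit unrelated tokens (cell line names such as T47D, for instance), which is one more reason to spot-check by hand.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ONE_TO_THREE = {one: three for three, one in THREE_TO_ONE.items()}

_AA3 = "|".join(THREE_TO_ONE)

# Hypothetical, deliberately loose patterns (substitutions only).
CDNA_SUB = re.compile(r"c\.(\d+)\s*([ACGT])\s*>\s*([ACGT])")
PROT_3 = re.compile(rf"\b(?:p\.)?({_AA3})(\d+)({_AA3})\b")
PROT_1 = re.compile(r"\b([ACDEFGHIKLMNPQRSTVWY])(\d+)([ACDEFGHIKLMNPQRSTVWY])\b")


def find_variant_mentions(text: str) -> dict:
    """Collect normalized cDNA and protein substitutions mentioned in text."""
    cdna = {f"c.{pos}{ref}>{alt}" for pos, ref, alt in CDNA_SUB.findall(text)}
    protein = {f"p.{ref}{pos}{alt}" for ref, pos, alt in PROT_3.findall(text)}
    # Map one-letter mentions (R27C) onto the same three-letter form.
    protein |= {
        f"p.{ONE_TO_THREE[ref]}{pos}{ONE_TO_THREE[alt]}"
        for ref, pos, alt in PROT_1.findall(text)
    }
    return {"cdna": cdna, "protein": protein}


sentence = "The KCNE2 R27C variant (c.79C>T, p.Arg27Cys) was reported."
print(find_variant_mentions(sentence))
```

Note that none of these mentions carries a transcript or protein ID, so the output is only a candidate list that still has to be matched against a specific transcript.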

3.9 years ago by
Vivek2.3k wrote:

The first notation you got from ClinVar is the recommended HGVS nomenclature and the most reliable, responsible way to report mutations in the literature. Ideally you should have the CDS position, the allele change, and the transcript name and version to reliably get the genomic coordinates of your mutation.

Quite often, especially in older publications, this will not be the case. In such cases, if you want to be thorough, you could check the impact of the reported CDS position and allele change in all RefSeq transcripts of the gene to see which functional consequence best matches the variant being reported.

Getting the genomic coordinates from just the protein change is even harder. I usually reference a database like dbNSFP, which has prebuilt annotations for all possible non-synonymous mutations in the human genome, to see if a protein change within a gene matches the mutation signature.

