Question

How to convert genetic coordinates to HGVS format?

6.7 years ago

lmelnik113 • 0

Hello,

I'm trying to analyze exome data in a .vcf file. Every row in this file has columns for the chromosome, position, ref sequence and alt sequence, as well as a ton of other information, e.g. chr14 105258893 A G

How can I go about transforming this data to HGVS format, such as a transcript-based NM_ expression? For example, if I search for chr14 105258893 on Google, this link from dbSNP comes up: https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=rs2494749

Here I can see that this SNP is labeled rs2494749, and on the right I see every HGVS annotation, e.g. NM_001014432.1:c.46+42T>C. If I then type NM_001014432.1:c.46+42T>C into a service like Varsome, I can see a TON of information: https://varsome.com/variant/hg19/NM_001014432.1%3Ac.46%2B42T%3EC - this link shows me the original mutation from A to G, the chromosome, and the original position, as well as all sorts of cool stuff.

I'm looking for a package in Python (preferably) or R (if I have to) that I can use to transform chromosome number, position, ref and alt into this HGVS format without all this tedious Google searching. Most of the info I've found only formats existing HGVS annotations. I've browsed Biostars and found people trying to do the reverse of what I'm asking - converting HGVS back to genetic coordinates.

Also I'm wondering why one single variant has so many different annotations? Is one better or more informative than the other? If I'm trying to get a single annotation out of this mapping, which one do I choose?

Thank you!

next-gen SNP hgvs • 6.4k views

score 3 · Answer 1 · 2017-08-08

Variation Reporter (NCBI), VEP (Ensembl) and SnpEff provide HGVS notations as part of VCF annotation. This annotation includes the g. syntax (chromosomal/contig), c. syntax (transcript) and p. syntax (protein). There are a few Python scripts like this. Try them, but they would generate only the g. syntax. If you are comfortable using APIs, I would suggest the Mutalyzer API. For all three syntaxes (g., c. and p.), you would need to annotate the VCF.
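
To give an idea of what the g.-syntax-only scripts do, here is a minimal sketch in Python. It is not any of the tools named above: the function names `vcf_to_hgvs_g` and `iter_vcf_records` and the `HG19_ACCESSIONS` table are made up for illustration, and the table assumes GRCh37/hg19 with only chr14 filled in (check the RefSeq accession for your build before relying on it). It handles single-nucleotide substitutions only; indels need left/right normalization and del/ins/dup handling that is better left to an annotator.

```python
# Hypothetical lookup from UCSC-style chromosome names to RefSeq accessions.
# Assumption: GRCh37/hg19; only chr14 is listed, extend and verify for your build.
HG19_ACCESSIONS = {"chr14": "NC_000014.8"}


def vcf_to_hgvs_g(chrom, pos, ref, alt, accessions=HG19_ACCESSIONS):
    """Build a genomic (g.) HGVS string for a single-nucleotide substitution."""
    if len(ref) != 1 or len(alt) != 1 or ref == alt:
        # Indels and multi-base changes are deliberately out of scope here.
        raise NotImplementedError("only simple substitutions are handled in this sketch")
    accession = accessions[chrom]  # KeyError if the chromosome is not in the table
    return f"{accession}:g.{int(pos)}{ref.upper()}>{alt.upper()}"


def iter_vcf_records(lines):
    """Yield (chrom, pos, ref, alt) per ALT allele, skipping header and blank lines."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, _id, ref, alt = fields[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALTs comma-separated
            yield chrom, int(pos), ref, allele


# The variant from the question, written as a one-record VCF.
example = ["##fileformat=VCFv4.2", "chr14\t105258893\t.\tA\tG\t.\t.\t."]
for record in iter_vcf_records(example):
    print(vcf_to_hgvs_g(*record))
```

For the variant in the question this prints a g. expression on the chromosome 14 accession. Getting from there to the c. and p. forms needs transcript models, which is why annotating the VCF with one of the tools above is the practical route.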