6.6 years ago
iCode
I want to join my variant data (from 23andMe) with ClinVar, and I see people writing the join differently. Specifically, these are the ON clauses of the join
sample.chromosome = clinvar.chromosome
AND sample.startposition = clinvar.startposition - 1
AND sample.endposition = clinvar.endposition
AND sample.referenceallele = clinvar.referenceallele
that were used in this blog post: https://aws.amazon.com/blogs/big-data/interactive-analysis-of-genomic-datasets-using-amazon-athena/
As you can see, the start position condition has a minus 1 (clinvar.startposition - 1), but I see that others do not do this and just write
...
AND sample.startposition = clinvar.startposition
...
What is the proper way to join my variants with ClinVar? Should I have that minus 1 on the start position or not, and why?
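It depends on how the data were loaded into each table. If both tables use the same coordinate system, no offset is needed; if one stores zero-based start positions and the other one-based start positions, one side has to be shifted by 1 in the join. See: Cheat Sheet For One-Based Vs Zero-Based Coordinate Systems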
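To make the offset concrete, here is a minimal Python sketch of the two common conventions: one-based closed intervals (as in VCF) and zero-based half-open intervals (as in BED). The function names are hypothetical and chosen only for illustration, and the sketch does not say which convention your 23andMe or ClinVar tables actually use; you would have to check that against the data itself.

```python
def one_based_to_zero_based(start, end):
    """Convert a 1-based closed interval [start, end] to 0-based half-open [start, end)."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end for a 1-based closed interval")
    # Only the start shifts; the end keeps the same numeric value.
    return start - 1, end


def zero_based_to_one_based(start, end):
    """Convert a 0-based half-open interval [start, end) to 1-based closed [start, end]."""
    # Empty intervals (start == end) are deliberately not handled in this sketch.
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end for a 0-based half-open interval")
    return start + 1, end


# Hypothetical example: a single-base variant at 1-based position 100.
print(one_based_to_zero_based(100, 100))
print(zero_based_to_one_based(99, 100))
```

Under these assumptions, a condition such as sample.startposition = clinvar.startposition - 1 combined with sample.endposition = clinvar.endposition is what you would expect when the sample table holds zero-based half-open coordinates and the ClinVar table holds one-based closed coordinates. If both tables share one convention, the plain equality on the start position is the matching condition.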
Thank you. Would you happen to know which coordinate systems the 23andMe API data and the ClinVar data use?