Question: Question about dbSNP rs #s
4.7 years ago by
devenvyas570
Stony Brook
devenvyas570 wrote:

Summarized version:

I have an old data set from 2008 from a set of HumanCNV370-Quads, and I have downloaded a relatively recent set of extended VCFs from the Altai Neanderthal and Denisovan genomes. I want to compare the data between the two. I know the genome coordinates for any given base can shift from assembly to assembly, but will the rsID for a given SNP change if and when the coordinate changes?

 

In-depth version:

I have SNP data from 64 samples at ~330,000 rsIDs (I know there is no mt/Y data; I am pretty sure this is all autosomal). The data is from an old set of HumanCNV370-Quads from 2008. I don't have the genomic coordinates.

I have downloaded two sets of VCF files from the Denisovan 30× and Altai Neanderthal 50× coverage genomes (available here http://cdna.eva.mpg.de/denisova/VCF/hg19_1000g/ and here http://cdna.eva.mpg.de/neandertal/altai/AltaiNeandertal/VCF/). These files are in a cumbersome extended VCF format described here (http://www.sciencemag.org/content/suppl/2012/08/29/science.1224344.DC1/Meyer.SM.pdf page 16) and here (http://www.nature.com/nature/journal/v505/n7481/extref/nature12886-s1.pdf page 14). These files have rsIDs labeled for most sites (though of course not all sites in these genomes have been assigned rsIDs).

I also have Illumina data from 171 samples (the libraries were enriched for NRY and mtDNA), which I now have as raw, unfiltered VCFs without rsIDs. I am trying to bring them into the mix, but I am going to ignore them for now (I have a thread on them here https://www.biostars.org/p/110272/).

For the 330k sites, I have the alleles for the common chimpanzee from the 1000G (phase1_release_v3/20101123) and the dbSNP build 141 for most of the sites from a friend of a friend. The goal is to use f4 statistics to calculate Neanderthal ancestry estimates.

Anyway, to my question: would the rsIDs from the SNP chip still correspond to the rsIDs that I find in the extended VCF files? If not, what would I have to do to make them match up?

 

snp dbsnp coordinates rs genome • 1.9k views
4.6 years ago by
Katie D'Aco1000
Massachusetts
Katie D'Aco1000 wrote:

rsIDs stay the same from genome build to genome build, even if the genomic coordinates change. The one gotcha to your plan that I can think of is an rsID that has since been removed from dbSNP or merged into another rsID.
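
Since the IDs are stable, the chip and the VCFs can be joined on rsID rather than on coordinates. A minimal Python sketch of that join follows. It assumes the extended VCFs keep the rsID in the standard third (ID) column, with `.` for unlabeled sites and `;` between multiple IDs; the function name `vcf_rsids` and all of the example data are made up for illustration.

```python
import io

def vcf_rsids(handle):
    """Yield (rsid, chrom, pos) for each rsID in the ID column of a VCF."""
    for line in handle:
        if line.startswith("#"):          # skip meta-information and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, id_field = fields[0], int(fields[1]), fields[2]
        # The ID column may be "." (no ID) or several IDs joined by ";"
        for rsid in id_field.split(";"):
            if rsid.startswith("rs"):
                yield rsid, chrom, pos

# Hypothetical chip rsIDs and a tiny made-up VCF, for illustration only
chip_rsids = {"rs123", "rs456", "rs789"}
example_vcf = io.StringIO(
    "##fileformat=VCFv4.1\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "1\t1000\trs123\tA\tG\t50\t.\t.\n"
    "1\t2000\t.\tC\tT\t50\t.\t.\n"
    "1\t3000\trs456;rs999\tG\tA\t50\t.\t.\n"
)

# Chip rsIDs found in the VCF, with the coordinates the VCF reports for them
shared = {
    rsid: (chrom, pos)
    for rsid, chrom, pos in vcf_rsids(example_vcf)
    if rsid in chip_rsids
}
# Chip rsIDs absent from the VCF: candidates for having been merged or removed
missing = chip_rsids - shared.keys()
```

Anything left in `missing` is either a site the VCF does not label or an rsID that has changed since 2008, so that is the set to check against dbSNP.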


Do you know of an easy way to update rsid's that have been retired?
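
One possible approach, sketched here rather than tested on real data: dbSNP has distributed a merge-history table (`RsMergeArch.bcp.gz` in older builds) that records which rs numbers were merged into which. The Python below assumes that table is tab-delimited, with the retired rs number in the first column and the current rs number in the seventh, both without the `rs` prefix. Check that layout against the release you download. The helper names and the example rows are invented for illustration.

```python
import csv
import io

def load_merges(handle):
    """Map each retired rsID to its current rsID.

    Assumed layout: tab-delimited, column 1 = retired rs number,
    column 7 = current rs number, both without the "rs" prefix.
    """
    merged = {}
    for row in csv.reader(handle, delimiter="\t"):
        merged["rs" + row[0]] = "rs" + row[6]
    return merged

def update_rsid(rsid, merged):
    """Return the current rsID, or the input unchanged if it was never merged."""
    return merged.get(rsid, rsid)

# Made-up rows in the assumed layout (not real dbSNP records)
example_table = io.StringIO(
    "200\t100\t130\t0\t2008-01-01\t2008-01-01\t50\t0\t\n"
    "100\t50\t135\t0\t2010-01-01\t2010-01-01\t50\t0\t\n"
)
merged = load_merges(example_table)
updated = [update_rsid(r, merged) for r in ["rs200", "rs100", "rs7"]]
```

This only handles merges. An rsID that was withdrawn without being merged into another one has no replacement and would need to be dropped or looked up by hand.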

written 17 months ago by eric.kern13140