Question: Affy Axiom array - many cases of two probe set IDs sharing the same Affy SNP ID and dbSNP RS ID values. What's going on?
Asked 3.9 years ago by devenvyas (Stony Brook):

Hello,

I recently got some genotyping data back from the core lab, generated on the Axiom Human Origins array, and I have been trying to analyze it (trying being the operative word). I have been running into some major frustrations.

I was trying to run a PCA in R when I noticed that a few thousand of the rows were pairs sharing the same rs ID. (In case it is relevant, the table was constructed with rs IDs as the rows and each column holding one sample's genotypes coded as 0/1/2.) I noticed this is also the case within Genotyping Console (as well as when I export data out of it). There are many cases where there are two Probe Set IDs for the same Affy SNP ID and dbSNP RS ID values...

Does anyone know why this is? Does it indicate that the exact same site is being genotyped twice? How do I filter these cases out, so that when I export genotypes I am not getting duplicates of the same locus? Thanks!
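One way to check whether paired rows really are the same site genotyped twice is to compare the calls of the two rows sample by sample. The following is only a minimal sketch under stated assumptions: the rs IDs and genotypes are made-up toy data, the function names `group_rows_by_id` and `concordance` are hypothetical, and missing calls are assumed to be represented as `None`. It is written in Python rather than R purely for illustration.

```python
from collections import defaultdict


def group_rows_by_id(rows):
    """Map each rs ID to the list of genotype rows that carry it."""
    groups = defaultdict(list)
    for rs_id, genotypes in rows:
        groups[rs_id].append(genotypes)
    return groups


def concordance(a, b):
    """Fraction of samples with identical calls, skipping missing (None) calls."""
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return None
    return sum(x == y for x, y in pairs) / len(pairs)


# Hypothetical toy table: (rs ID, genotypes coded 0/1/2, one value per sample).
rows = [
    ("rs_example_1", [0, 1, 2, 1]),
    ("rs_example_1", [0, 1, 2, 2]),
    ("rs_example_2", [2, 2, 0, 1]),
]

groups = group_rows_by_id(rows)

# rs IDs that appear on more than one row.
duplicated = {rs: g for rs, g in groups.items() if len(g) > 1}
for rs, g in duplicated.items():
    print(rs, concordance(g[0], g[1]))

# Simplest possible filter: keep only the first row seen for each rs ID.
deduplicated = {rs: g[0] for rs, g in groups.items()}
```

High concordance between the two rows would be consistent with the same site being assayed twice. Keeping the first row is an arbitrary choice made for the sketch, not a recommendation about which probe set is better.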


First stop would be to check your annotation. Have you looked up a few of the SNPs on NetAffx (the Affymetrix support site, at affymetrix.com) to see what their current official annotations are? I suggest you pull down the latest annotation file, if you don't already have it, to make sure that you aren't working from a mangled copy.
 

Comment written 3.9 years ago by David Quigley

I downloaded the annotation through Genotyping Console. When I started using the dataset, it asked me for my Affy username (=academic email) and Affy password, and downloaded the latest files. I also downloaded the annotation file manually via browser to confirm the issue, and the duplicated Affy SNP ID and dbSNP RS ID values exist there too.

David Reich's group helped design the array, so I looked up their published technical note (http://genetics.med.harvard.edu/reich/Reich_Lab/Publications_files/2012_Supp_Patterson_AncientAdmixture_Genetics.pdf) and found this:

"For a very small fraction of sites, we found that the derived allele is different depending on which human is used in SNP discovery (these are potentially triallelic SNPs in the population, although they are not triallelic in the discovery individual). We keep such sites in our list of SNPs for designing, and use multiple probe sets to assay such SNPs."

But when I go through the annotation file, these duplicated probe sets are all biallelic. Any clue what is going on?

 

AX-50582098 Affx-33256817 rs10760419 TRUE 2 9 T C T C
AX-50582099 Affx-33256817 rs10760419   2 9 T C T C
AX-50023441 Affx-3561055 rs10762231 TRUE 2 10 A G G A
AX-50023442 Affx-3561055 rs10762231   2 10 A G G A
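For filtering, the rows can be grouped by Affy SNP ID so that exactly one probe set is kept per SNP. The sketch below uses only the first four fields of the four rows quoted above. Nothing in this thread says what the fourth field means, so treating `TRUE` as "the preferred probe set" is purely an assumption for illustration, and the function name `pick_one_probe_set` is hypothetical.

```python
from collections import defaultdict

# First four fields of the rows quoted above:
# (Probe Set ID, Affy SNP ID, dbSNP RS ID, fourth field).
# ASSUMPTION: "TRUE" in the fourth field marks the probe set to keep.
# The thread does not state what this field actually means.
rows = [
    ("AX-50582098", "Affx-33256817", "rs10760419", "TRUE"),
    ("AX-50582099", "Affx-33256817", "rs10760419", ""),
    ("AX-50023441", "Affx-3561055", "rs10762231", "TRUE"),
    ("AX-50023442", "Affx-3561055", "rs10762231", ""),
]


def pick_one_probe_set(rows):
    """Return {Affy SNP ID: Probe Set ID}, keeping one probe set per SNP."""
    by_snp = defaultdict(list)
    for probe_set_id, affy_snp_id, _rs_id, flag in rows:
        by_snp[affy_snp_id].append((probe_set_id, flag))

    chosen = {}
    for affy_snp_id, probe_sets in by_snp.items():
        flagged = [p for p, f in probe_sets if f == "TRUE"]
        # Fall back to the first probe set listed if none is flagged.
        chosen[affy_snp_id] = flagged[0] if flagged else probe_sets[0][0]
    return chosen


keep = pick_one_probe_set(rows)
for snp, probe in keep.items():
    print(snp, probe)
```

The resulting list of Probe Set IDs could then be used to subset the exported genotype table before running the PCA.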

 

Reply written 3.9 years ago by devenvyas