Understanding my SNP data
9.2 years ago
jondank • 0

Hi,

Over the past dozen years I have had several SNP studies conducted on my family. I am having a difficult time understanding how to compare my data to published studies about particular SNPs and their impact on risks for specific health issues. My peeves:

  1. Different labs report results based on differing assumptions as to what the ancestral nucleotide is. Is there a central authority that declares for the purpose of standardization what the ancestral nucleotide is for each SNP? Genetics labs need to report only against a centralized standard.
  2. The second problem is the most disconcerting: I read a paper that reports G/A for a SNP, declares G the ancestral allele, and gives its frequency. When I look at my actual results, the lab reports the choices as C/T, and we have TT's. I look at dbSNP and I see reports of both G/A and C/T. WTF? Is this another case of labs picking one set of results versus what researchers choose to reference? What am I missing?

Jon

SNP • 3.0k views
9.2 years ago
  1. Generally speaking, the reference allele should be whatever is in the reference genome at a given position, and the reference genome is produced by the Genome Reference Consortium (GRC). Having said that, if an array was designed around hg18 and you're using hg19 or GRCh38 coordinates, then the annotation for the array might not always match as expected. There's also the issue of strand: if an array reports the - strand sequence, that will necessarily be the reverse complement of the + strand.
  2. G->A on the + strand is C->T on the - strand, so make sure that both sources are talking about the same strand. That's the simplest explanation, though it isn't the only possible one.
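The strand point in item 2 can be sketched in a few lines of Python. This is only an illustration, not anything a lab or dbSNP provides: the helper names `flip_strand` and `matches_on_either_strand` are made up for this example, and it assumes simple single-base alleles.

```python
# Watson-Crick complements for the four DNA bases.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def flip_strand(alleles):
    """Return single-base alleles as they would read on the opposite strand."""
    return "".join(COMPLEMENT[base] for base in alleles.upper())


def matches_on_either_strand(genotype, paper_alleles):
    """True if every base in the genotype is one of the paper's alleles,
    either as reported or after flipping to the other strand."""
    paper = set(paper_alleles.upper())
    as_reported = set(genotype.upper())
    flipped = set(flip_strand(genotype))
    return as_reported <= paper or flipped <= paper


print(flip_strand("TT"))                     # AA
print(matches_on_either_strand("TT", "GA"))  # True
```

So a lab result of TT on one strand reads as AA on the other, which is consistent with a paper describing the SNP as G/A. Note that this check cannot resolve A/T or C/G SNPs, because those look the same on both strands.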

So if you have TT, it means you are homozygous for the T allele, which reads as AA on the opposite strand.


Thanks.

  1. As long as a paper uses G/A, C/T nomenclature I can figure out my SNP versus the risk SNP. I get into trouble when the discussions are only in terms of + - or Bb...
  2. I like the simple answer.
  3. 23andMe reports dbSNP Orientation. Does this have anything to do with + or - strand?

23andMe usually reports the alleles as A/G or C/T, but sometimes they report the alleles as G/T or A/C or C/G. What's going on? Is this due to selecting the "other" strand when analyzing the other parent chromosome?


You'd have to ask 23andMe.


I think they are legitimate. If we consider the pair of choices available at each location as (ancestral allele, risk allele), then there are 4 × 4 = 16 such ordered pairs. Four of these are degenerate (A/A, C/C, G/G, T/T), which leaves 12. Six of those are the strand complements of the other six; for example, T/G is the complement of A/C. This leaves 6 unique sets of alternatives: A/C, A/G, A/T, C/A, C/G and C/T. It is possible that some labs report the complementary alternatives instead: T/G, T/C, T/A, G/T, G/C and G/A.
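The 16 / 12 / 6 counting above can be checked with a short enumeration. This is a minimal sketch; `strand_class` is a hypothetical helper that picks the alphabetically first of a pair and its complement as the representative of each group.

```python
from itertools import product

BASES = "ACGT"
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

# All ordered (ancestral, risk) pairs, then drop the degenerate ones like A/A.
all_pairs = list(product(BASES, repeat=2))
distinct = [(a, b) for a, b in all_pairs if a != b]


def strand_class(pair):
    """Represent a pair and its opposite-strand complement by one label."""
    a, b = pair
    flipped = (COMPLEMENT[a], COMPLEMENT[b])
    return min(pair, flipped)


classes = sorted({strand_class(p) for p in distinct})

print(len(all_pairs), len(distinct), len(classes))  # 16 12 6
print(["/".join(p) for p in classes])
# ['A/C', 'A/G', 'A/T', 'C/A', 'C/G', 'C/T']
```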
