While trying to identify the alleles at several SNP loci in a data set, I found a number of discrepancies in both the reference/variant alleles and the coordinates when comparing the UCSC reference genome, refSNP on NCBI, and Ensembl's dbSNP data, even though they allegedly all use the same reference genome, hg19 or the equivalent GRCh37. In most cases, the nucleotide at the hg19 coordinates corresponding to the Ensembl SNP is correct (though ancestral/variant is sometimes reversed). However, if I enter the rs# into UCSC using hg19, it consistently gives a different coordinate for the SNP. Moreover, when I look up the same SNP on NCBI's refSNP, the nucleotides are completely different, as though a complementary strand were being used. Since rs# are not dependent on assembly (unlike their coordinates), this shouldn't be the case. I have appended some examples below, and would greatly appreciate a clarification:
rs1063192
Ensembl GRCh37 coordinates = chr9:22003367
reference allele at these coordinates according to UCSC hg19 for chr9:22003367 = G
Ensembl ancestral/variant = G/A
UCSC hg19 coordinates for rs1063192 = chr9:22003117-22003617
NCBI refSNP alleles = C/T

rs601620
Ensembl GRCh37 coordinates = chr20:62309839
reference allele at these coordinates according to UCSC hg19 = A
Ensembl ancestral/variant = A/G
UCSC hg19 coordinates for rs601620 = chr20:62309589-62310089
NCBI refSNP alleles = C/T

rs498872
Ensembl GRCh37 coordinates = chr11:118477367
reference allele at these coordinates according to UCSC hg19 = A
Ensembl ancestral/variant = G/A
UCSC hg19 coordinates for rs498872 = chr11:118477117-118477617
NCBI refSNP alleles = C/T
For comparison, one that is (mostly) consistent:
rs10079250
Ensembl GRCh37 coordinates = chr5:149450132
reference allele according to UCSC hg19 for chr5:149450132 = C
Ensembl ancestral/variant = T/C
UCSC hg19 coordinates = chr5:149449882-149450382
NCBI refSNP alleles = C/T
Please note that the variants are not listed as ancestral/variant, but reference/alternative. The reference is just the base that was found in the individual from whom that contig of the genome was taken. It may be the minor allele, the non-ancestral allele or even the disease-causing allele, if that was the allele that individual had.
However, your final example, rs10079250, is a variant where dbSNP does not follow this convention: it lists the alternative allele first, followed by the reference. Note that in this case the T is in fact the major allele and the ancestral allele.
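
The coordinate and strand discrepancies in the first three examples can be checked with simple arithmetic. The sketch below is only one possible reading of the numbers quoted in the question, not a statement of how UCSC or dbSNP work internally. It tests two assumptions: that each UCSC range is a window centred on the Ensembl position (which would mean the browser is showing padding around the SNP rather than a different coordinate), and that the NCBI alleles are the Ensembl alleles read on the complementary strand, as the question itself suspects. The dictionary layout and function names are made up for illustration; the values are copied from the examples above.

```python
# Values copied from the three discrepant examples in the question (GRCh37/hg19).
SNPS = {
    "rs1063192": {"ensembl_pos": 22003367, "ucsc_window": (22003117, 22003617),
                  "ensembl_alleles": "G/A", "ncbi_alleles": "C/T"},
    "rs601620": {"ensembl_pos": 62309839, "ucsc_window": (62309589, 62310089),
                 "ensembl_alleles": "A/G", "ncbi_alleles": "C/T"},
    "rs498872": {"ensembl_pos": 118477367, "ucsc_window": (118477117, 118477617),
                 "ensembl_alleles": "G/A", "ncbi_alleles": "C/T"},
}

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def window_offsets(pos, window):
    """Return how far pos lies from the start and from the end of the window."""
    start, end = window
    return pos - start, end - pos


def complement_alleles(alleles):
    """Complement each allele in an 'X/Y' string and return them as a set."""
    return {COMPLEMENT[base] for base in alleles.split("/")}


def same_alleles_on_opposite_strand(alleles_a, alleles_b):
    """True if alleles_b is the complement of alleles_a, ignoring allele order."""
    return complement_alleles(alleles_a) == set(alleles_b.split("/"))


for rsid, rec in SNPS.items():
    left, right = window_offsets(rec["ensembl_pos"], rec["ucsc_window"])
    flipped = same_alleles_on_opposite_strand(rec["ensembl_alleles"],
                                              rec["ncbi_alleles"])
    print(f"{rsid}: {left} bp from window start, {right} bp from window end, "
          f"NCBI alleles are complement of Ensembl alleles: {flipped}")
```

For all three SNPs the Ensembl position sits 250 bp from each end of the UCSC range, and the NCBI alleles match the complement of the Ensembl alleles. Allele order is deliberately ignored in the comparison, since which allele is listed first differs between the resources.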