How does SNP position numbering work?
6.8 years ago
eyb ▴ 210

Does the numbering of SNP positions start over at the beginning of each chromosome, or is it continuous across chromosomes?

For example:

In my dataset, the first SNP on chromosome 2, rs2685230, is at position 437664, and the last SNP, rs10191556, is at position 242521405.

But the first SNP on chromosome 13, rs11617984, is at position 19622143.

Is this a mistake in my data, or does it mean that the first SNP on chromosome 13 really is at 19622143? Is chromosome 13 supposed to be shorter than chromosome 2?

position bp SNP • 1.4k views
6.8 years ago

The position is counted within each chromosome, so the numbering starts over on every chromosome. A continuous numbering system across the whole genome would be a nightmare. Have a look at the VCF specification for further details.
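To illustrate that a SNP is identified by the pair (chromosome, position) rather than by position alone, here is a minimal Python sketch using the three SNPs from the question. The function name `position_range_by_chrom` and the tuple layout are assumptions made for this example, not part of any tool.

```python
from collections import defaultdict


def position_range_by_chrom(snps):
    """Return {chrom: (lowest_pos, highest_pos)} for (chrom, pos, rsid) records.

    Positions are only comparable within the same chromosome, so the
    records are grouped by chromosome before taking min and max.
    """
    positions = defaultdict(list)
    for chrom, pos, _rsid in snps:
        positions[chrom].append(pos)
    return {chrom: (min(p), max(p)) for chrom, p in positions.items()}


# The three SNPs quoted in the question (hypothetical tuple layout).
snps = [
    ("2", 437664, "rs2685230"),
    ("2", 242521405, "rs10191556"),
    ("13", 19622143, "rs11617984"),
]

ranges = position_range_by_chrom(snps)
for chrom, (first, last) in ranges.items():
    print(f"chr{chrom}: first={first} last={last}")
```

Grouping first and comparing second keeps the coordinates of different chromosomes from ever being mixed.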


So where on a chromosome is nucleotide number 0 or number 1? Is it at the very tip of the telomere? Is it the p-arm telomere or the q-arm telomere? And does the chromosomal position simply step up by 1 for each nucleotide, marching through the centromere to the other arm?


The first base is at one of the ends (assuming non-circular DNA). Which of the two ends isn't always known. For mouse/human/etc. (i.e., high quality reference genomes), the first base is the tip of the p-arm. Yes, each subsequent base is one position higher.
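As a small sketch of this numbering, assuming 1-based positions as used in VCF (POS is 1-based there), the following Python maps a chromosomal position to a 0-based string index. The helper `base_at` and the toy sequence are hypothetical and only serve the illustration.

```python
def base_at(sequence, pos):
    """Return the base at 1-based chromosomal position `pos`.

    Position 1 is the first base of the reference sequence (the p-arm end
    in high-quality assemblies); each following base is one position higher.
    """
    if pos < 1 or pos > len(sequence):
        raise ValueError(f"position {pos} is outside 1..{len(sequence)}")
    return sequence[pos - 1]  # Python strings are 0-based


# Toy "chromosome": three unassembled bases (N) followed by real sequence.
toy_chrom = "NNNACGT"

print(base_at(toy_chrom, 1))  # first base of the sequence
print(base_at(toy_chrom, 4))  # first assembled base
```

Some formats (BED, for example) use 0-based starts instead, so it is worth checking which convention a file follows before comparing positions.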

6.8 years ago

The SNP IDs (rs numbers) are just primary keys in the dbSNP database (e.g., http://dev.mysql.com/doc/refman/5.0/en/example-auto-increment.html ). SNPs are recorded wherever regions have been studied; there is no relationship with the length of the chromosomes. Furthermore, chr13 is acrocentric, and its short arm is not assembled in hg19, so the sequence starts with a long run of Ns (compare chr2 with chr13):

$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr2.fa.gz" | gunzip -c | uniq -c | head -n 10
      1 >chr2
    200 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
      1 CGTATCCcacacaccacacccacacaccacacccacacacacccacaccc
      1 acacccacacacaccacacccacacaccacacccacacccacacaccaca
      1 cccacaccacacccacacaccacacaccacacccacacccacacacacca
      1 cacccacacaccacacccacacacaccctaaccctaacccctaaccccta
      1 accctaaccctacccgaaccctaaccctaaccctaacccctaaccctaac
      1 ccctaaccctaaccctaaccgtaaccctaaccctttaccctaacccgaac
      1 ccctaacccctaacccctaacccttaaccctaacccttaaccctgaccct
      1 gaccctgaccgtgaccctgaccctaacccgaacccgaacccgaaccccga



$ curl -s "http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/chr13.fa.gz" | gunzip -c | uniq -c | head -n 10
      1 >chr13
 380400 NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
      1 GAATTCAACTGCCTCCCTGGCCTTTCCCTGCCAATCTATGTGCCCCAGCA
      1 GCCAACTTACATAGCACTGAGTGCAGACTTGTAAATAGACCTTCCAGTTC
      1 TGCTATAATCAATACCTTATTGTCCATAACTCAATTTGGAGAAGGTTTAG
      1 CTGTCTACCAACTCTTGTGGAGAGTTTCTGTGAAGTTTTGTTTTGGGTTG
      1 CAAGAATCTGGAAAACAGATGCAGATGTTTTTGAGGAAGATTTTGAAATT
      1 TCTACTTATAAGGTACCCAAAATGGGATCCAAACTCTTGAATTTGGTTGA
      1 TCTTCTGAAATACATACCTGTGTTTTAAGATTTGCTTGAGCAAACCTTTA
      1 ACCATGGAAATTTTAACCAATGATTTCCAGGTTGAAACAATTCCAGTTTT

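So you won't find any SNPs at the start (5' end) of this chromosome: 380,400 lines of 50 Ns means that the first 19,020,000 bases of chr13 are unassembled in hg19, which is consistent with your first SNP being at 19622143.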
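The same check that `uniq -c` does by eye can be sketched in Python: count the unassembled (N) bases at the start of a FASTA record to find the first position where a SNP could be reported. The function `leading_n_count` and the toy record are assumptions for illustration; on real data you would pass the decompressed lines of chr13.fa.gz.

```python
def leading_n_count(fasta_lines):
    """Count the N bases at the start of a single-record FASTA."""
    count = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and the header line
        rest = line.lstrip("Nn")
        count += len(line) - len(rest)
        if rest:
            break  # reached the first assembled base
    return count


# Toy record (hypothetical): 12 leading Ns, then real sequence.
toy = [">chrT", "NNNNN", "NNNNN", "NNACG", "TTNNA"]
gap = leading_n_count(toy)
print(f"leading Ns: {gap}, first assembled position: {gap + 1}")

# The same arithmetic for the chr13 output above: 380400 lines of 50 Ns.
chr13_gap = 380400 * 50
print(f"chr13 leading Ns in hg19: {chr13_gap}")
```

Only the leading run is counted; Ns that appear later in the sequence (gaps, centromere) are ignored once the first assembled base has been seen.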
