Question: Reference is deletion and negative genomic position label
Shicheng Guo (8.3k) wrote, 24 months ago:

I found something interesting. rs71587061 is an insertion, which means that in the reference genome it appears as a deletion. How, then, should the genomic position of this SNP be described?

dbSNP build 150 rs71587061
dbSNP: rs71587061
Position: chr1:45194431-45194430
Band: 1p34.1
Genomic Size: 0
Strand: +
Observed: -/C
Reference allele:   -

These coordinates are hg19, not hg38.

reference ≠ major ≠ ancestral ≠ wildtype


Querying the b151 dbSNP VCF didn't yield any results.

 $ tabix 1:45194429-45194431
Comment written 24 months ago by cpad0112 (13k)

For GRCh38:

 $ tabix 1:44728758-44728759
Comment written 24 months ago by Shicheng Guo (8.3k)
Kevin Blighe (61k), University College London, wrote 24 months ago:

Hey dude,

No, it's not a deletion in the 'typical' reference genomes used by the Genome Reference Consortium (GRC). This is an insertion variant that was identified in John Craig Venter's genome. It's not even listed in the 1000 Genomes Phase III data, and its validation status is not listed on dbSNP. It may never have been validated in Venter's actual genomic DNA and could be a sequencing error.

However, as Venter's genome is also regarded as a reference genome itself, you could make the argument that —yes indeed— it is a deletion in the GRC human (GRCh) builds. Switching the context around and regarding the GRCh builds as the references, this variant would then appear as an insertion in Venter's genome.

A problem that we have in comparative genomics is that the reference genomes prior to hg38 / GRCh38 were mostly based on the genomes of single individuals, including that of hg19 / GRCh37 (~70% of it was from some donor from Buffalo, New York, USA). So, weird situations like this variant's can arise, along with other oddities such as the one discussed in the thread "A: Alternate nucleotide is more frequent than reference nucleotide". OMG I'm dizzy.


RamRS (27k), Houston, TX, wrote 24 months ago:

It's an insertion. In HGVS notation it is NC_000001.10:g.45194430_45194431insC. The annotation source you got it from got its notation wrong.
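
For anyone puzzled by how the two notations relate, here is a minimal sketch of the conversion. It assumes the position in the question follows the UCSC display convention, where a 0-based half-open interval [s, e) is shown as s+1 to e, so a zero-length feature appears with a start one greater than its end. The function name `ucsc_insertion_to_hgvs` is made up for illustration and is not part of any library.

```python
def ucsc_insertion_to_hgvs(accession, display_start, display_end, inserted):
    """Turn a zero-length UCSC display range into an HGVS insertion string.

    Assumption: the range is shown UCSC-style, i.e. a 0-based half-open
    interval [s, e) displayed as s+1..e. A zero-length interval (s == e)
    therefore shows up with display_start == display_end + 1.
    """
    if display_start != display_end + 1:
        raise ValueError("not a zero-length (pure insertion) interval")
    left_flank = display_end      # 1-based base just before the insertion
    right_flank = display_start   # 1-based base just after the insertion
    return f"{accession}:g.{left_flank}_{right_flank}ins{inserted}"


# Values taken from the dbSNP record quoted in the question (hg19).
print(ucsc_insertion_to_hgvs("NC_000001.10", 45194431, 45194430, "C"))
# prints NC_000001.10:g.45194430_45194431insC
```

Read this way, the "negative" range and the HGVS string describe the same thing: a C inserted between bases 45194430 and 45194431.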
