Mouse Strains in mm10/GRCm38 dbSNP 142?
2.9 years ago

I am using the pre-built HISAT2 index for mm10 with SNPs. Looking at the script the developers provide, I can see they take the SNPs from the UCSC goldenPath downloads.

However, I would like to know which strains are included in this release. I could not find any documentation for this online. Should I assume they are the same strains used in the MGP?

2.9 years ago
igor 13k

The UCSC Genome Browser tracks are described in the Table Browser section. Specifically, snp142Common:

This track contains information about a subset of the single nucleotide polymorphisms and small insertions and deletions (indels), collectively Simple Nucleotide Polymorphisms, from dbSNP build 142. Only SNPs that have a minor allele frequency of at least 1% and are mapped to a single location in the reference genome assembly are included in this subset. Frequency data are not available for all SNPs, so this subset is incomplete.

The selection of SNPs with a minor allele frequency of 1% or greater is an attempt to identify variants that appear to be reasonably common in the general population.

Unlike MGP, which is a well-defined sequencing project with a fixed set of strains, dbSNP is a database that accepts submissions from anyone. There is therefore no guarantee that its variants are limited to specific strains.
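If you want a rough idea of who contributed the variants in the file, one option is to tally the submitter handles recorded for each SNP. The following is a minimal sketch, not a tested pipeline: it assumes the table dump is tab-separated and that the comma-separated `submitters` field is the 21st column (0-based index 20), as in the UCSC snp142 table layout as I remember it. Please check this against the `.sql` schema file that accompanies the download. The function names and the local file name are hypothetical.

```python
import gzip
from collections import Counter

# Assumed 0-based position of the "submitters" column in the UCSC snp142*
# table dumps. This is an assumption; verify it against the schema (.sql)
# file that ships alongside the table dump.
SUBMITTERS_COL = 20


def count_submitters(lines, col=SUBMITTERS_COL):
    """Tally how many SNP records each dbSNP submitter handle appears in."""
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= col:
            continue  # skip malformed or truncated rows
        # Handles are comma-separated and may end with a trailing comma.
        for handle in fields[col].split(","):
            if handle:
                counts[handle] += 1
    return counts


def count_submitters_in_file(path):
    """Open a plain or gzipped table dump and tally submitter handles."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as table:
        return count_submitters(table)
```

For example, `count_submitters_in_file("snp142Common.txt.gz").most_common(20)` would list the most frequent handles in a locally downloaded copy. Keep in mind that a handle identifies the submitting group, not a strain, so this only hints at where the variants came from.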


