How many SNPs are sufficient to uniquely identify a sample?
8.4 years ago
star ▴ 350

How many SNPs are needed to create a panel that identifies each individual uniquely? In other words, how many SNPs would it take to use them as a fingerprint for around 7 billion people?

I know the characteristics of the SNPs that can be used to create a panel. My question is about the number of SNPs.

Thanks a lot

human-genetics SNP statistics • 3.0k views

Your question is vague! What are you trying to do? What do you mean by 1) "cover" or 2) "billion people alleles"? Where do these 500 SNPs come from?

8.4 years ago

This issue has already been addressed in forensics, where a reduced set of high-variability markers (STRs) can be enough to uniquely identify a subject. When dealing with SNPs, you have the problem that they're not as diverse as STRs, so you need a few more markers to improve identification. But, in the end, a few tens of well-selected markers can be enough. Have a look at this paper we worked on a few years ago, which proposes a set of 52 SNPs that could be used for this purpose. In fact, we've used the main ideas of that paper to develop our own little Sequenom test (26 very common SNPs) for internal sample identification in our NGS pipeline, in order to avoid possible mix-ups or mislabelings, and it works great.
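To get a feel for why "a few tens" is the right order of magnitude, here is a rough back-of-the-envelope sketch. It is not the method from the paper. It assumes biallelic SNPs in Hardy-Weinberg equilibrium, all with the same minor allele frequency, fully independent of each other (no linkage disequilibrium), unrelated individuals, and no genotyping errors. The function names `match_probability` and `min_snps` are made up for this illustration.

```python
import math


def match_probability(maf: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg equilibrium: genotype frequencies p^2, 2pq, q^2.
    """
    p, q = maf, 1.0 - maf
    genotype_freqs = (p * p, 2 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def min_snps(population: int, maf: float, expected_matches: float = 1.0) -> int:
    """Smallest number of independent SNPs for which the expected number of
    identical pairs in the population drops to `expected_matches` or below."""
    pairs = population * (population - 1) / 2
    per_snp = match_probability(maf)
    return math.ceil(math.log(pairs / expected_matches) / -math.log(per_snp))


if __name__ == "__main__":
    # Best case for a biallelic marker: both alleles equally common.
    print(match_probability(0.5))        # 0.375
    print(min_snps(7_000_000_000, 0.5))  # 46
```

Under these idealized assumptions, about 46 SNPs with a minor allele frequency of 0.5 would already push the expected number of matching pairs among 7 billion people below one. Real panels need some margin on top of that, because allele frequencies are lower than 0.5 and differ between populations, relatives share genotypes more often, and genotyping calls sometimes fail.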

8.4 years ago
H.Hasani ▴ 990

Hi, and thanks for modifying the question, but it is still not concrete enough for us to help you!

Long story short, I'm guessing you are treating these 500 SNPs as being close to the expected value from mixing all SNPs from the different populations. If so, you need to add more constraints on your foreground and background sets, e.g. are these SNPs disease-associated, in coding regions, and/or in linkage disequilibrium, etc. Again, your question is really vague!

Mathematically, if my guess is correct, you may want to have a look at the "Law of large numbers & sum of independence" and, depending on your aim, reconsider your FG and BG sets.

Generally, pooling SNPs from mixed populations is a minefield. It is very important to apply the same constraints to both sets while taking into account the individual properties of the populations (Comparing Snps Across Populations).
