Question: How many SNPs are sufficient to uniquely identify a sample?
5.3 years ago by
star270 wrote:

How many SNPs are needed to create a panel that identifies each individual uniquely, in other words, to use these SNPs as a fingerprint for around 7 billion people?

I know the characteristics of the SNPs that can be used to create a panel. My question is about the number of SNPs.


Thanks a lot.


modified 5.3 years ago by Jorge Amigo12k • written 5.3 years ago by star270

Your question is vague! What are you trying to do? What do you mean by 1) cover or 2) billion people alleles? Where do these 500 SNPs come from?

written 5.3 years ago by H.Hasani980
5.3 years ago by
Jorge Amigo12k
Santiago de Compostela, Spain
Jorge Amigo12k wrote:

This issue has already been addressed in forensics, where a reduced set of high-variability markers (STRs) can be enough to uniquely identify a subject. When dealing with SNPs you have the problem that they're not as diverse as STRs, so you need a few more markers to improve identification. But, in the end, a few tens of well-selected markers can be enough. Have a look at this paper we worked on a few years ago, which proposes a set of 52 SNPs that could be used for this purpose. In fact, we've used the main ideas of that paper to develop our own little Sequenom test (26 very common SNPs) for internal sample identification in our NGS pipeline, in order to avoid possible mixtures or mislabelings, and it works great.
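To put rough numbers on "a few tens of markers", here is a minimal back-of-the-envelope sketch. It is not the method from the paper mentioned above. It assumes biallelic SNPs in Hardy-Weinberg equilibrium, fully independent markers (no linkage disequilibrium), unrelated individuals, and no genotyping errors. The function names are made up for this illustration.

```python
import math


def match_probability_per_snp(maf: float) -> float:
    """Chance that two unrelated people share a genotype at one biallelic SNP.

    Assumes Hardy-Weinberg equilibrium, so genotype frequencies are
    p^2, 2pq and q^2; the match probability is the sum of their squares.
    """
    p, q = maf, 1.0 - maf
    genotype_freqs = (p * p, 2.0 * p * q, q * q)
    return sum(f * f for f in genotype_freqs)


def panel_match_probability(mafs) -> float:
    """Random match probability of a whole panel (assumes independent SNPs)."""
    return math.prod(match_probability_per_snp(m) for m in mafs)


def min_snps_for_population(n_people: int, maf: float = 0.5) -> int:
    """Smallest panel size giving fewer than one expected matching pair.

    Simplification: every SNP in the panel has the same minor allele frequency.
    """
    if n_people < 2:
        raise ValueError("need at least two people to form a pair")
    pairs = n_people * (n_people - 1) / 2
    per_snp = match_probability_per_snp(maf)
    return math.ceil(math.log(pairs) / -math.log(per_snp))


if __name__ == "__main__":
    print(match_probability_per_snp(0.5))            # best case per SNP
    print(panel_match_probability([0.5] * 52))       # idealized 52-SNP panel
    print(min_snps_for_population(7_000_000_000))    # idealized world-wide panel
```

Under these idealized assumptions the estimate for 7 billion people lands in the tens of SNPs (46 when every marker has a minor allele frequency of 0.5). Real panels need extra margin, because real minor allele frequencies are lower and differ between populations, markers are not perfectly independent, and close relatives share genotypes far more often than unrelated people do.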

modified 5.2 years ago • written 5.3 years ago by Jorge Amigo12k
5.3 years ago by
Freiburg, Germany
H.Hasani980 wrote:

Hi, and thanks for modifying the question, but it is still not concrete enough for me to help you!

Long story short, I'm guessing you are taking these 500 SNPs to be close to the expected value of mixing all SNPs from the different populations. If so, you need to add more constraints on your foreground and background sets, e.g. are these SNPs disease-associated, in coding regions, and/or in linkage disequilibrium, etc. Again, your question is really vague!

Mathematically, if my guess is correct, you may want to have a look at the "Law of large numbers & sum of independence" and, depending on your aim, reconsider your foreground (FG) and background (BG) sets.

Generally, pooling SNPs from mixed populations is a minefield. It is very important to apply the same constraints to both sets while taking into account the individual properties of the populations (see Comparing Snps Across Populations).

modified 14 months ago by Ram32k • written 5.3 years ago by H.Hasani980