IBD (Identity By Descent) and the Chosen Value of PI_HAT
7.8 years ago
eXpander ▴ 100

I am trying to understand how to choose the best pi_hat threshold for a dataset. Many articles choose a pi_hat of 0.2, and every pair above that is considered cryptically related or duplicated.

I've tested IBD on HapMap; the files I use can be found here: ftp://ftp.ncbi.nlm.nih.gov/hapmap/genotypes/2009-01_phaseIII/plink_format/. I first remove all annotated offspring from HapMap. Then I perform IBD estimation to see whether it still finds samples with cryptic relatedness to each other. The steps I perform are the following (in PLINK):

(1) LD-prune:

plink --file hapmap --indep-pairwise 50 5 0.2
plink --file hapmap --extract plink.prune.in --recode --out hapmap_pruned

(2) IBD:

plink --file hapmap_pruned --genome --min 0.2


The results show that many cryptically related samples are found with a pi_hat threshold of 0.2, even though all annotated offspring were removed beforehand. Is this normal behavior, or should one increase the pi_hat threshold? How does one find a "good" pi_hat for a custom dataset?
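One way to look at the question above is to inspect the whole PI_HAT distribution rather than a single cutoff. The sketch below is only an illustration: it assumes the whitespace-delimited plink.genome layout with IID1, IID2 and PI_HAT columns, and the three rows in TOY_GENOME are made-up values, not HapMap results. The function names are hypothetical helpers, not part of PLINK.

```python
from collections import Counter

# Toy stand-in for plink.genome (invented values, for illustration only).
TOY_GENOME = """\
 FID1 IID1 FID2 IID2 RT EZ Z0 Z1 Z2 PI_HAT PHE DST PPC RATIO
 F1 A F2 B UN NA 0.0000 1.0000 0.0000 0.5000 -1 0.85 1.0000 NA
 F1 A F3 C UN NA 0.9800 0.0200 0.0000 0.0100 -1 0.75 0.5000 2.0
 F2 B F3 C UN NA 0.5000 0.5000 0.0000 0.2500 -1 0.80 1.0000 5.0
"""


def related_pairs(genome_text, min_pi_hat=0.2):
    """Return (IID1, IID2, PI_HAT) for pairs at or above min_pi_hat, highest first."""
    lines = [ln.split() for ln in genome_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    # Look columns up by name so the code does not depend on their order.
    i1, i2, ip = header.index("IID1"), header.index("IID2"), header.index("PI_HAT")
    pairs = [(r[i1], r[i2], float(r[ip])) for r in rows]
    return sorted((p for p in pairs if p[2] >= min_pi_hat), key=lambda p: -p[2])


def pi_hat_histogram(genome_text, bin_width=0.05):
    """Count pairs per PI_HAT bin; returns sorted (bin_lower_edge, count) tuples."""
    counts = Counter()
    for _, _, pi_hat in related_pairs(genome_text, min_pi_hat=0.0):
        # Round before truncating so values like 0.15 / 0.05 land in the right bin.
        idx = int(round(pi_hat / bin_width, 6))
        counts[round(idx * bin_width, 10)] += 1
    return sorted(counts.items())


if __name__ == "__main__":
    print(related_pairs(TOY_GENOME, 0.2))
    print(pi_hat_histogram(TOY_GENOME))
```

On real data, a cluster of pairs well separated from the bulk near zero suggests where a cutoff could reasonably sit; a long smooth tail suggests population structure rather than a few close relatives.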


See this post: "How large would inbreeding coefficient be to be anomalous?" It notes: "Third-degree relatives are 12.5% equal IBD (Pihat 0.125)"

7.8 years ago

This is normal behavior for the HapMap data set. See Stevens et al. and a few of the earlier papers they cite that try to identify unannotated relationships in HapMap. The Stevens paper uses Cotterman coefficients (K1, K2), which are the fractions of the genome shared IBD1 and IBD2. As zx8754 mentioned above, the expected PIHAT is 0.125 for third-degree, 0.25 for second-degree, and 0.5 for first-degree relatives, although in practice these thresholds can be too low or too high because of consanguinity and admixture.
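To make the link between the IBD-sharing fractions and PIHAT concrete, here is a minimal sketch. It uses the definition PLINK reports for PI_HAT, P(IBD=2) + 0.5 * P(IBD=1), and the textbook expected sharing fractions for outbred pairs. The table and function names are illustrative assumptions, and real estimates will scatter around these values.

```python
def pi_hat_from_z(z1, z2):
    """Proportion IBD from the IBD1 and IBD2 sharing fractions."""
    return 0.5 * z1 + z2


# Expected (Z0, Z1, Z2) for outbred pairs; textbook values, not estimates.
EXPECTED_Z = {
    "duplicate or MZ twin": (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full siblings": (0.25, 0.5, 0.25),
    "second degree": (0.5, 0.5, 0.0),
    "third degree": (0.75, 0.25, 0.0),
    "unrelated": (1.0, 0.0, 0.0),
}

if __name__ == "__main__":
    for name, (z0, z1, z2) in EXPECTED_Z.items():
        print(f"{name:22s} PI_HAT = {pi_hat_from_z(z1, z2):.4f}")
```

Parent-offspring and full-sibling pairs share the same expected PIHAT (0.5) but differ in Z1 and Z2, which is why the individual sharing fractions are more informative than PIHAT alone.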


Hi Matt Shirley, I am also confused about choosing the best IBD value for removing related individuals. I agree with your comment, but what about the PLINK documentation, which suggests removing individuals from pairs with a PI_HAT value greater than 0.05? The original wording from the PLINK website is given below:

"Scan the plink.genome file for any individuals with high PIHAT values (e.g. greater than 0.05). Optionally, remove one member of the pair if you find close relatives. (Alternatively, to keep them in but just exclude this pair from the segmental analysis, see below)"


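If you need to remove all unexpected relatedness, even distant relatedness around the 4th or 5th degree (expected PIHAT = 0.0625 or 0.03125), then 0.05 is a good threshold. Note that 0.05 lies between those two expected values, so 5th-degree pairs are only flagged when their estimate comes out above expectation. The PLINK documentation you are referring to is telling you that you only need to remove one member of each pair whose PIHAT (proportion IBD) exceeds the threshold.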
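Removing "one member of the pair" gets less obvious when one sample is related to several others. The sketch below shows one possible greedy heuristic: repeatedly drop the sample that appears in the most remaining related pairs. This is an assumption for illustration, not the procedure PLINK itself applies, and the function name and sample IDs are hypothetical.

```python
from collections import defaultdict


def samples_to_remove(pairs):
    """Greedy choice of samples to drop so that no related pair remains.

    pairs: iterable of (sample_a, sample_b) whose PI_HAT exceeded the cutoff.
    """
    neighbours = defaultdict(set)
    for a, b in pairs:
        neighbours[a].add(b)
        neighbours[b].add(a)

    removed = []
    while any(neighbours.values()):
        # Sample with the most remaining related partners; sorting first
        # makes ties resolve alphabetically, so the result is reproducible.
        worst = max(sorted(neighbours), key=lambda s: len(neighbours[s]))
        for other in neighbours.pop(worst):
            neighbours[other].discard(worst)
        removed.append(worst)
    return removed


if __name__ == "__main__":
    print(samples_to_remove([("A", "B"), ("B", "C"), ("D", "E")]))
```

Here B is related to both A and C, so dropping B alone resolves two pairs. In practice you may prefer to break ties by call rate or phenotype availability rather than by name.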


Hi Matt, I know this is late, but could you please let me know whether the IBD proportion (PIHAT) is the same as the kinship coefficient?

Thanks


The two measures quantify closely related things: the proportion of the genome that is shared by descent from a common ancestor. They are on different scales, though: without inbreeding, the kinship coefficient is half of PIHAT (e.g. 0.25 versus 0.5 for a parent-offspring pair). In addition, PIHAT and the kinship coefficient can be calculated using different window sizes for consecutive informative SNPs, or with or without CNV information, which can help discriminate between consanguinity and autozygosity. You can compare the two values as long as you're sure the methods used to calculate them are similar (e.g. PLINK or PLATO; it's been a while since this was my research area, so I don't know the current state of the art).
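For readers comparing the two scales, here is a small sketch of the textbook relationship, assuming no inbreeding: the kinship coefficient is 0.25 * P(IBD=1) + 0.5 * P(IBD=2), which is half of PIHAT. The function names are illustrative, and actual estimators (PLINK's method-of-moments PIHAT versus other kinship estimators) can disagree on real data even after this rescaling.

```python
def kinship_from_z(z1, z2):
    """Textbook kinship coefficient from IBD1/IBD2 sharing (no inbreeding assumed)."""
    return 0.25 * z1 + 0.5 * z2


def kinship_from_pi_hat(pi_hat):
    """Rescale a proportion-IBD value to the kinship scale (no inbreeding assumed)."""
    return pi_hat / 2.0


if __name__ == "__main__":
    # Parent-offspring: Z1 = 1, Z2 = 0, so PI_HAT = 0.5 and kinship = 0.25.
    print(kinship_from_z(1.0, 0.0), kinship_from_pi_hat(0.5))
```

Under this assumption, a PIHAT cutoff of 0.2 corresponds to a kinship cutoff of about 0.1, so a threshold value should not be copied between tools without checking which scale each one uses.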


Hi Matt,

Can you please let me know: if I want to keep only pairs with PIHAT < 0.1, can I set --rel-cutoff 0.1 in plink, or --king-cutoff 0.1 in plink2? If not, what would be the way to do this?

Thanks