How Large Would Inbreeding Coefficient Be To Be Anomalous?
11.4 years ago
nnlnn ▴ 60

I computed the inbreeding coefficient (F) for my samples (around 200) using PLINK:

plink --file mydata --het

and found the distribution of F pretty symmetric:

[Figure: distribution of F across the samples]

and the short summary:

> summary(het$F)
    Min.  1st Qu.   Median     Mean  3rd Qu.     Max. 
-0.35990 -0.07199  0.01568  0.02755  0.13460  0.40790

The F values are quite large in magnitude (both positive and negative), which makes me worry about sample quality. The samples are presumably all unrelated. Can anyone tell me whether this distribution of F indicates a problem and, if so, what it is? Or is there a rule of thumb for which F values are considered normal for unrelated, cleaned samples? Thanks!

11.4 years ago
zx8754 11k

For relatedness try IBS/IBD estimation:

PI_HAT:

  • Identical twins and duplicates are 100% identical by descent (PI_HAT 1.0)
  • First-degree relatives are 50% IBD (PI_HAT 0.5)
  • Second-degree relatives are 25% IBD (PI_HAT 0.25)
  • Third-degree relatives are 12.5% IBD (PI_HAT 0.125)
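
PLINK's `--genome` output reports PI_HAT alongside Z0, Z1 and Z2, the estimated probabilities of sharing 0, 1 and 2 alleles IBD, with PI_HAT = Z2 + 0.5 × Z1. A minimal Python sketch of mapping a PI_HAT estimate to the nearest expected value from the list above follows. The function names, the added "unrelated" class at 0.0 and the nearest-value rule are illustrative assumptions, not a formal statistical test, and real estimates are noisy.

```python
# Expected PI_HAT per relationship class, taken from the list above.
# The "unrelated" entry (0.0) is an added assumption for completeness.
EXPECTED_PI_HAT = {
    "duplicate or identical twin": 1.0,
    "first-degree": 0.5,
    "second-degree": 0.25,
    "third-degree": 0.125,
    "unrelated": 0.0,
}


def pi_hat_from_z(z1, z2):
    """PI_HAT = P(IBD=2) + 0.5 * P(IBD=1)."""
    return z2 + 0.5 * z1


def nearest_relationship(pi_hat):
    """Return the class whose expected PI_HAT is closest (heuristic only)."""
    return min(EXPECTED_PI_HAT, key=lambda name: abs(EXPECTED_PI_HAT[name] - pi_hat))


# Parent-offspring pairs share exactly one allele IBD everywhere (Z1 = 1).
parent_offspring = pi_hat_from_z(z1=1.0, z2=0.0)
# Full siblings are expected to have Z0, Z1, Z2 = 0.25, 0.5, 0.25.
full_siblings = pi_hat_from_z(z1=0.5, z2=0.25)
```

Both first-degree cases give the same PI_HAT, so looking at Z0, Z1 and Z2 rather than PI_HAT alone is usually more informative when a pair falls between two classes.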

Hi, I am wondering how to test whether two individuals are first-degree or second-degree relatives when PI_HAT falls between 0.25 and 0.5. Is there a statistical test for this? Hoping to hear from you.


Thanks, but I am afraid I was asking a different question. The inbreeding coefficient is a per-individual measure, whereas IBD is estimated for pairs of individuals.


The pedigree inbreeding coefficient (F_ped) of an individual is the same thing as the kinship coefficient between that individual's parents.
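
To make this concrete, here is a minimal Python sketch of the standard recursive kinship calculation on a small, hypothetical pedigree. The individuals and the names `PEDIGREE`, `kinship` and `inbreeding` are made up for illustration. Full siblings, and likewise a parent and child, have a kinship coefficient of 0.25, so offspring of such pairings have F_ped = 0.25.

```python
from functools import lru_cache

# Hypothetical pedigree: id -> (father, mother); founders have (None, None).
# Individuals must be listed so that parents appear before their children.
PEDIGREE = {
    "gf": (None, None),
    "gm": (None, None),
    "bro": ("gf", "gm"),
    "sis": ("gf", "gm"),
    "child": ("bro", "sis"),
}
ORDER = {name: index for index, name in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(a, b):
    """Kinship coefficient between a and b (unknown parents count as unrelated)."""
    if a is None or b is None:
        return 0.0
    if a == b:
        father, mother = PEDIGREE[a]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always expand the individual listed later, so it cannot be an ancestor of the other.
    if ORDER[a] < ORDER[b]:
        a, b = b, a
    father, mother = PEDIGREE[a]
    return 0.5 * (kinship(father, b) + kinship(mother, b))


def inbreeding(individual):
    """F_ped of an individual = kinship coefficient between its parents."""
    father, mother = PEDIGREE[individual]
    return kinship(father, mother)


sibling_kinship = kinship("bro", "sis")
child_f = inbreeding("child")
```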

11.4 years ago
jxchong ▴ 160

IMO you've got an enormous problem: an F of 0.25 is what you would expect for the offspring of a father/daughter, mother/son or brother/sister pairing. Even in the data I've seen (for a founder population), F calculated by PLINK is <50% of your max and min values.

This distribution of F values indicates problems (both extreme excess of homozygosity and extreme deficiency of homozygosity). The deficiency of homozygosity could be due to sample contamination (enriching for heterozygous sites). I'd have to think a little harder about the cause of the extreme excess of homozygosity.
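
For reference, the F that `plink --het` reports is a method-of-moments estimate built from the O(HOM), E(HOM) and N(NM) columns of the `.het` output: F = (O(HOM) - E(HOM)) / (N(NM) - E(HOM)). A small Python sketch shows how an excess or a deficit of homozygous calls drives the sign. The counts are made up for illustration and the function name is an assumption.

```python
def inbreeding_f(observed_hom, expected_hom, n_nonmissing):
    """Method-of-moments F: (O(HOM) - E(HOM)) / (N(NM) - E(HOM))."""
    return (observed_hom - expected_hom) / (n_nonmissing - expected_hom)


# Made-up counts for illustration only (not taken from the poster's data).
expected_hom, n_sites = 60_000.0, 100_000

# More homozygous calls than expected gives a positive F.
f_excess_hom = inbreeding_f(70_000, expected_hom, n_sites)

# Fewer homozygous calls than expected (for example, extra heterozygous
# calls in a contaminated sample) gives a negative F.
f_deficit_hom = inbreeding_f(50_000, expected_hom, n_sites)
```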

Have you already QCed the data for the basics? You say "cleaned" but I'm not sure what you mean. Have you done a sex check, removed SNPs that deviate excessively from HWE, removed SNPs with a low call rate (CR), and removed individuals with a low CR? If you have a lot of bad SNPs that are biased towards homozygous calls, you could potentially generate data with excess homozygosity.
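
Once the basic QC is done, one way to see which samples drive the tails is to flag individuals whose F lies far from the sample mean. The Python sketch below reads the `.het` columns by name and uses a mean ± 3 SD cutoff. That cutoff is a commonly used heuristic and an assumption here, not a PLINK default, and the function names are illustrative.

```python
import statistics


def read_het(lines):
    """Parse whitespace-delimited PLINK .het lines into (IID, F) pairs."""
    line_iter = iter(lines)
    header = next(line_iter).split()
    iid_col, f_col = header.index("IID"), header.index("F")
    rows = []
    for line in line_iter:
        fields = line.split()
        if fields:  # skip blank lines
            rows.append((fields[iid_col], float(fields[f_col])))
    return rows


def flag_f_outliers(rows, n_sd=3.0):
    """Return IIDs whose F is more than n_sd standard deviations from the mean."""
    values = [f for _, f in rows]
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [iid for iid, f in rows if abs(f - mean) > n_sd * sd]
```

Because an open file is an iterable of lines, the output of `plink --het` (written to `plink.het` unless `--out` is given) can be passed in directly, for example `flag_f_outliers(read_het(open("plink.het")))`.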
