Question: How accurate is the IBD calculation by plink?
2.9 years ago by
MAPK1.4k
United States
MAPK1.4k wrote:

I was trying to calculate IBD values for about 100 individuals, all likely to be unrelated. I tried PLINK ( http://pngu.mgh.harvard.edu/~purcell/plink/ibdibs.shtml ), but it looks like it generates too many false positives (that is, high IBD for unrelated individuals). I have one sample that shows IBD = 1 with at least 5 other samples (I am looking at the Z0 values). Can someone please explain what these values listed on their website mean:

Z0  P(IBD=0)
Z1  P(IBD=1)
Z2  P(IBD=2)
PI_HAT  Proportion IBD, i.e. P(IBD=2) + 0.5*P(IBD=1)
ibd plink • 4.8k views
modified 2.9 years ago • written 2.9 years ago by MAPK

Is it possible that your supposedly unrelated individuals are actually related, or that there were sample swaps?

written 2.9 years ago by Matt Shirley

It's also known that PLINK's IBS calculations aren't that great. The kcoeff paper has some comparisons.

written 2.9 years ago by Matt Shirley

Have you carefully QC'ed your genotypes, as you would for a GWAS analysis? Poor-quality genotypes will give you wrong estimates, but that is not the fault of the IBD calculation.

written 22 months ago by Zhenyu Zhang
2.9 years ago by
UCLA
leekaiinthesky170 wrote:

These are not false positives!

In fact, they are not positives at all. As you yourself wrote, Z0 is the probability that at a given locus 0 alleles are identical by descent. In other words, if your samples are unrelated, you should expect a Z0 close to 1.

PI_HAT is a measure of overall IBD alleles. If your samples are unrelated, you should expect a PI_HAT close to 0.
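To make this concrete, here is a minimal sketch of how one might scan PLINK's pairwise output for pairs that look related. It assumes a whitespace-delimited table whose header includes IID1, IID2, Z0 and PI_HAT columns. The function name `flag_related_pairs` and the 0.2 cutoff are illustrative choices of mine, not PLINK defaults, and the example rows are made up.

```python
def flag_related_pairs(lines, pi_hat_cutoff=0.2):
    """Return (IID1, IID2, Z0, PI_HAT) for pairs whose PI_HAT exceeds the cutoff.

    `lines` is an iterable of text lines: one header row, then one row per pair.
    The 0.2 cutoff is an arbitrary illustrative value (assumption).
    """
    rows = iter(lines)
    header = next(rows).split()
    col = {name: i for i, name in enumerate(header)}  # column name -> index
    flagged = []
    for line in rows:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        z0 = float(fields[col["Z0"]])
        pi_hat = float(fields[col["PI_HAT"]])
        if pi_hat > pi_hat_cutoff:
            flagged.append((fields[col["IID1"]], fields[col["IID2"]], z0, pi_hat))
    return flagged


# Made-up example rows, not real data.
example = [
    "FID1 IID1 FID2 IID2 Z0 Z1 Z2 PI_HAT",
    "F1 A F2 B 1.0000 0.0000 0.0000 0.0000",  # looks unrelated: Z0 near 1
    "F1 A F3 C 0.0000 1.0000 0.0000 0.5000",  # looks like parent-offspring
]
print(flag_related_pairs(example))
```

Note that the unrelated-looking pair is the one with Z0 = 1, so a high Z0 is the expected result for unrelated samples, not a sign of relatedness.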

Z0, Z1, and Z2 segregate out the probabilities of having IBD of 0, 1, or 2 over the loci, which gives you a way of discriminating between relationship types. Ideal parent-offspring has (Z0, Z1, Z2) = (0, 1, 0), i.e. all loci have one allele identical by descent; ideal full sibling = (1/4, 1/2, 1/4), i.e. 25% of loci have 0 alleles IBD, 50% have 1 allele IBD, 25% have 2 alleles IBD; etc.
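As a rough illustration of how the four values fit together, the sketch below computes PI_HAT from (Z0, Z1, Z2) and picks the nearest ideal relationship. The parent-offspring and full-sibling rows come from the values above; the "unrelated" and "duplicate or MZ twin" rows are the usual theoretical expectations that I have added. The nearest-ideal rule and the names `pi_hat` and `closest_relationship` are hypothetical simplifications, not something PLINK does.

```python
# Ideal (Z0, Z1, Z2) per relationship type.
# "unrelated" and "duplicate or MZ twin" are added here as assumptions.
IDEAL = {
    "unrelated": (1.0, 0.0, 0.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full sibling": (0.25, 0.5, 0.25),
    "duplicate or MZ twin": (0.0, 0.0, 1.0),
}


def pi_hat(z0, z1, z2):
    """Proportion IBD: P(IBD=2) + 0.5 * P(IBD=1). Z0 does not contribute."""
    return z2 + 0.5 * z1


def closest_relationship(z0, z1, z2):
    """Name of the ideal (Z0, Z1, Z2) triple nearest to the observed one.

    Uses squared Euclidean distance; a naive heuristic for illustration only.
    """
    observed = (z0, z1, z2)

    def sq_dist(ideal):
        return sum((o - i) ** 2 for o, i in zip(observed, ideal))

    return min(IDEAL, key=lambda name: sq_dist(IDEAL[name]))


for name, z in IDEAL.items():
    print(name, z, "PI_HAT =", pi_hat(*z))
```

Parent-offspring and full siblings both come out at PI_HAT = 0.5, which is why PI_HAT alone cannot separate them and the individual Z values are worth reading.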

modified 2.9 years ago • written 2.9 years ago by leekaiinthesky

Thanks. So do I need to look at PI_HAT, which is supposedly between 0 and 1, to get the actual relationships between individuals?

written 2.9 years ago by MAPK

Yes, PI_HAT is a summary statistic that will give you overall IBD proportion. But Z0, Z1, and Z2 are also helpful to understand for distinguishing between relationship types, so it's useful to take the time to understand what all four measures mean.

written 2.9 years ago by leekaiinthesky

Thanks, but the PI_HAT values don't make sense at all (unless I am doing something wrong). I am getting 0 for identical individuals, where it is supposed to be 1 (IBD = 1 when a sample is compared to itself or to a monozygotic twin?).

written 2.9 years ago by MAPK

You may be confusing Z0, Z1, Z2, and PI_HAT. First take some time to understand their relationship.

written 2.9 years ago by leekaiinthesky

Also, I am using only 27000 SNPs (LD pruned and quality filtered) for 150 samples. Do you think the number of SNPs is the issue here?

modified 2.9 years ago • written 2.9 years ago by MAPK

Supposedly, that should be enough.

written 2.9 years ago by leekaiinthesky
2.9 years ago by
Matt Shirley8.9k
Cambridge, MA
Matt Shirley8.9k wrote:

If you want an independent method to compare against, I suggest trying kcoeff, which estimates k0, k1, and k2, the portions of the genome shared IBS0/1/2.

written 2.9 years ago by Matt Shirley

Thanks. I am getting IBD = 1 between one sample and multiple other samples. That can't be true unless the samples are duplicated.

written 2.9 years ago by MAPK

You should be able to tell if the samples are exactly duplicated by looking at the data. Otherwise, they might have been duplicated during sample handling before the genotyping.
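One simple way to look at the data is to compare genotype calls directly. This is only a sketch: it assumes genotypes are coded as minor-allele counts (0, 1, 2) with `None` for missing calls, and the function name `genotype_concordance` and the example vectors are made up for illustration.

```python
def genotype_concordance(geno_a, geno_b, missing=None):
    """Fraction of identical genotype calls among SNPs called in both samples."""
    if len(geno_a) != len(geno_b):
        raise ValueError("genotype vectors must cover the same SNPs")
    compared = 0
    matches = 0
    for a, b in zip(geno_a, geno_b):
        if a == missing or b == missing:
            continue  # ignore SNPs missing in either sample
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no SNPs called in both samples")
    return matches / compared


# Made-up genotype vectors (assumed coding: 0/1/2 minor-allele counts).
sample_a = [0, 1, 2, 1, None, 0]
sample_b = [0, 1, 2, 1, 2, 0]  # same calls as sample_a wherever both are called
sample_c = [0, 2, 1, 1, 0, 2]

print(genotype_concordance(sample_a, sample_b))
print(genotype_concordance(sample_a, sample_c))
```

Exact duplicates should agree at essentially every called SNP, apart from genotyping error, whereas unrelated samples will not.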

written 2.9 years ago by Matt Shirley