Question: Heterozygosity by individual
Li wrote, 4.7 years ago:

How can I compute heterozygosity for each individual?

Currently I use:

plink --bfile file --hardy --out file --noweb

but I need .hwe files for each individual separately in a dataset of hundreds of individuals.

Tags: individual, plink, hardy
chrchang523 (United States) wrote, 4.7 years ago:

--hardy requires many individuals to judge deviation from Hardy-Weinberg equilibrium. Single-individual .hwe files will not be useful.

If you are trying to perform quality control on samples, consider plotting top principal components (computed with EIGENSOFT 6, or plink --pca) and removing extreme outliers.


I need to compute heterozygosity per individual from genotype data; this is not for quality control.

(reply by Li, 4.7 years ago)

Ah. --het (https://www.cog-genomics.org/plink2/basic_stats#ibc) should come in handy, then.

(reply by chrchang523, 4.7 years ago)
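To turn the homozygosity counts reported by --het into per-sample heterozygosity rates, something like the following may help. It is a minimal sketch that assumes the PLINK 1.x .het layout (columns FID, IID, O(HOM), E(HOM), N(NM), F); the function name `read_plink_het` and the two example rows are made up for illustration.

```python
import io

def read_plink_het(handle):
    """Return (fid, iid, observed_het_rate, expected_het_rate) per sample.

    Assumes a whitespace-delimited PLINK 1.x .het file with a header row.
    """
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}
    results = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        o_hom = float(fields[col["O(HOM)"]])
        e_hom = float(fields[col["E(HOM)"]])
        n = float(fields[col["N(NM)"]])  # non-missing genotype count
        if n == 0:
            continue  # no usable genotypes for this sample
        results.append((
            fields[col["FID"]],
            fields[col["IID"]],
            (n - o_hom) / n,  # observed heterozygosity rate
            (n - e_hom) / n,  # expected heterozygosity rate
        ))
    return results

# Invented example data, not taken from a real dataset.
example = io.StringIO(
    " FID IID O(HOM) E(HOM) N(NM) F\n"
    " fam1 ind1 600 640.0 1000 -0.1111\n"
    " fam1 ind2 700 640.0 1000 0.1667\n"
)
rates = read_plink_het(example)
for fid, iid, obs, exp in rates:
    print(fid, iid, round(obs, 4), round(exp, 4))
```

With a real run, `open("file.het")` would replace the in-memory example.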

Thanks. Two problems:

  1. --het computes observed and expected autosomal homozygous genotype counts for each sample. However, I also need heterozygosity for the X chromosome (separately). --chr 23 --het gives zero for both O(HOM) and E(HOM). My sample is females only.

  2. --het gives an observed heterozygosity of 40% for my example individual, while the average O(HET) over autosomal SNPs for the same individual, calculated from a .hwe file, is 30%. For expected heterozygosity the numbers are 36% vs 16%. Thus, the two methods yield very different answers although, if I understand correctly, they should measure the same thing.

For context: the goal is to compare X-chromosome and autosomal heterozygosity between groups. To do a t-test, I need X and autosomal heterozygosity per individual. I already have the average X and autosomal heterozygosity per group.

(reply by Li, 4.7 years ago, later edited)
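For the group comparison described above, one option is Welch's t statistic computed on per-individual heterozygosity values. The sketch below uses only the Python standard library; the function name `welch_t` and the numbers are invented for illustration and do not come from the thread's data.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance t-test on two samples."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation of the degrees of freedom
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

# Hypothetical per-individual heterozygosity rates for two groups.
group_a = [0.30, 0.32, 0.31, 0.29]
group_b = [0.25, 0.27, 0.26, 0.24]
t_stat, dof = welch_t(group_a, group_b)
print(t_stat, dof)
```

If SciPy is available, `scipy.stats.ttest_ind(group_a, group_b, equal_var=False)` should report the same statistic together with a p-value.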

It's a dirty hack, but if your .bim file has numeric chromosome codes, you can force the X chromosome to be treated like an autosome by specifying a species with more chromosomes (e.g. "--dog").

(reply by chrchang523, 4.7 years ago)
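As a rough illustration of the hack, the command line could be assembled like this. The helper `plink_het_command` and the output prefix `file_chrX` are hypothetical; only the PLINK flags already mentioned in this thread (--bfile, --dog, --chr, --het, --out) are used.

```python
import shlex

def plink_het_command(bfile, out, chrom=None, species_flag=None):
    """Build a PLINK --het command as an argument list (nothing is executed)."""
    cmd = ["plink", "--bfile", bfile]
    if species_flag:
        cmd.append(species_flag)  # e.g. "--dog", so chromosome 23 counts as an autosome
    if chrom is not None:
        cmd += ["--chr", str(chrom)]
    cmd += ["--het", "--out", out]
    return cmd

# X chromosome only, with numeric chromosome codes in the .bim file.
x_cmd = plink_het_command("file", "file_chrX", chrom=23, species_flag="--dog")
print(shlex.join(x_cmd))
```

The list can be passed to `subprocess.run(x_cmd, check=True)` once the printed command looks right.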

That works, but I still need to compute this with --hardy too, per individual. Creating a text file containing one person and passing it with --keep myfile.txt is not feasible for each individual, given how many individuals there are.

(reply by Li, 4.7 years ago, later edited)
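If one-sample --keep files are still wanted, they do not have to be written by hand. Below is a minimal sketch that derives one keep file and one command per sample from a .fam file (family ID and individual ID in the first two columns). The function name `per_sample_jobs` and the file-naming scheme are assumptions for illustration, and, as noted earlier in the thread, single-individual .hwe output may not be informative.

```python
import io

def per_sample_jobs(fam_handle, bfile, out_prefix):
    """Return a list of (keep_path, keep_text, command) for each sample.

    Nothing is written or executed here; the caller decides what to do.
    """
    jobs = []
    for line in fam_handle:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        fid, iid = fields[0], fields[1]
        stem = f"{out_prefix}.{fid}.{iid}"
        keep_path = stem + ".keep.txt"
        cmd = ["plink", "--bfile", bfile, "--keep", keep_path,
               "--hardy", "--out", stem]
        jobs.append((keep_path, f"{fid} {iid}\n", cmd))
    return jobs

# Invented two-sample .fam content.
fam = io.StringIO("fam1 ind1 0 0 2 -9\nfam1 ind2 0 0 2 -9\n")
jobs = per_sample_jobs(fam, "file", "file")
for keep_path, keep_text, cmd in jobs:
    print(keep_path, " ".join(cmd))
```

To actually run the jobs, each `keep_text` would be written to `keep_path` and each `cmd` passed to `subprocess.run`.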
Johan Zicola wrote, 17 months ago:

VCFtools has a --het option, for example:

vcftools --vcf input.vcf --het --out output

Since the suffix ".het" is appended to the --out prefix, this writes output.het. From the manual:

--het

Calculates a measure of heterozygosity on a per-individual basis. Specifically, the inbreeding coefficient, F, is estimated for each individual using a method of moments. The resulting file has the suffix ".het".

Check the full manual: https://vcftools.github.io/man_latest.html

If you have difficulty interpreting the results, see this post: "Is the heterozygosity flag (--het) in vcftools calculate observed and expected heterozygosity?"
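To relate the VCFtools output to observed and expected heterozygosity, the counts in the .het file can be converted as sketched below. This assumes the tab-delimited columns INDV, O(HOM), E(HOM), N_SITES and F, and recomputes F as (O(HOM) - E(HOM)) / (N_SITES - E(HOM)); the function name `summarize_vcftools_het` and the example row are invented.

```python
import csv
import io

def summarize_vcftools_het(handle):
    """Return per-individual observed/expected heterozygosity and F."""
    rows = []
    for rec in csv.DictReader(handle, delimiter="\t"):
        n = float(rec["N_SITES"])
        o_hom = float(rec["O(HOM)"])
        e_hom = float(rec["E(HOM)"])
        if n == 0 or n == e_hom:
            continue  # rates or F would be undefined
        rows.append({
            "INDV": rec["INDV"],
            "obs_het": (n - o_hom) / n,  # observed heterozygosity rate
            "exp_het": (n - e_hom) / n,  # expected heterozygosity rate
            "F": (o_hom - e_hom) / (n - e_hom),  # method-of-moments F
        })
    return rows

# Invented example row, not real data.
example = io.StringIO(
    "INDV\tO(HOM)\tE(HOM)\tN_SITES\tF\n"
    "ind1\t600\t640.0\t1000\t-0.11111\n"
)
summary = summarize_vcftools_het(example)
print(summary)
```

For real output, `open("output.het")` would replace the in-memory example.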
