GWAS QC step - Heterozygosity
3.0 years ago
jun0914 ▴ 10

Hi, I'm running QC on genetic data before starting an imaging genetics study.

I'm using PLINK version 1.09.

I calculate the heterozygosity rate so that I can exclude individuals whose rate is more than 3 SD from the mean.

I computed the heterozygosity rate, 'het', as (N(NM) - O(HOM)) / N(NM). The results are in the table below.

heterozygosity calculated

I then filtered out everyone more than 3 SD from the mean, which excluded 113 subjects.
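For reference, the calculation and filter described above can be sketched roughly as follows. This is only a minimal sketch: it assumes the input rows are whitespace-delimited with a header containing the columns IID, O(HOM) and N(NM), and the helper names `parse_het` and `flag_outliers` are made up for illustration.

```python
import statistics


def parse_het(lines):
    """Map IID -> heterozygosity rate, (N(NM) - O(HOM)) / N(NM).

    Assumes whitespace-delimited rows whose header names the columns
    IID, O(HOM) and N(NM).
    """
    header = lines[0].split()
    i_id, i_hom, i_nm = (header.index(c) for c in ("IID", "O(HOM)", "N(NM)"))
    rates = {}
    for line in lines[1:]:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        n_nm = float(fields[i_nm])
        rates[fields[i_id]] = (n_nm - float(fields[i_hom])) / n_nm
    return rates


def flag_outliers(rates, n_sd=3.0):
    """Return the IDs whose rate lies more than n_sd sample SDs from the mean."""
    mean = statistics.mean(rates.values())
    sd = statistics.stdev(rates.values())
    return [iid for iid, r in rates.items() if abs(r - mean) > n_sd * sd]
```

With the lines of the heterozygosity output read into a list, `flag_outliers(parse_het(lines), n_sd=3.0)` would give the IDs to exclude, and changing `n_sd` shows how many subjects a looser cutoff keeps.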

But then I realized that each population (White, Black, Hispanic, Asian, Others) has its own cluster in the heterozygosity rate distribution, and about 80 of the excluded subjects were Asian.

histogram of each population heterozygosity rate

Here are my questions.

  1. Do I have to separate the populations before performing any QC?
  2. If not, do I just have to remove the 80 Asian subjects, which is about half of the full Asian sample?

Thank you.

heterozygosity GWAS PLINK ImagingGenetics
3.0 years ago
  1. When performing a QC step that uses allele frequencies, yes, it might be useful to split by population. However, that isn't a fully general solution: what do you do about people with mixed ancestry? So it's reasonable to just use a looser QC cutoff instead.
  2. Your instincts are sound: it does not make sense to throw out half of your Asian subjects. If you aren't splitting by population, I'd go with a 6-10 SD cutoff; 3 SD is definitely too strict.
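To make the two options concrete, here is a rough sketch comparing a pooled cutoff with a per-population cutoff. It assumes you already have a heterozygosity rate and a single population label per subject (for example, self-reported); the function names are hypothetical, and it does nothing special for mixed ancestry, which is exactly the limitation mentioned in point 1.

```python
import statistics
from collections import defaultdict


def flag_outliers(rates, n_sd):
    """IDs whose rate lies more than n_sd sample SDs from the mean of `rates`."""
    if len(rates) < 2:  # an SD needs at least two values
        return []
    mean = statistics.mean(rates.values())
    sd = statistics.stdev(rates.values())
    return [iid for iid, r in rates.items() if abs(r - mean) > n_sd * sd]


def flag_outliers_by_group(rates, groups, n_sd):
    """Apply the same SD cutoff separately within each population label.

    `groups` maps each ID to one label (an assumption of this sketch).
    """
    by_group = defaultdict(dict)
    for iid, rate in rates.items():
        by_group[groups[iid]][iid] = rate
    flagged = []
    for members in by_group.values():
        flagged.extend(flag_outliers(members, n_sd))
    return flagged
```

Calling `flag_outliers(rates, 3.0)` on everyone versus `flag_outliers_by_group(rates, groups, 3.0)` shows how many exclusions come only from a small group sitting away from the pooled mean; raising `n_sd` in the pooled call is the looser-cutoff alternative.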

Thanks a lot! I decided to go with a less strict cutoff, and found that applying a 4 SD cutoff instead lets me keep 100 of those subjects (including the 80 Asian subjects). This sounds reasonable.
