Combination of genomic datasets genotyped using different arrays
7 weeks ago
desicasares ▴ 10

Hi everybody! I am conducting a GWAS in which one cohort consists of cases genotyped with the Global Screening Array v3 (GSA), while the controls are public (dbGaP) and were genotyped with HumanOmni1-Quad_v1-0_B.

Since the overlap between the two arrays is small (about 100k SNPs), I decided to impute cases and controls separately and merge them afterwards. Both sets consist of individuals of European ancestry, and this control set has previously been combined with other GSA data in the same way without problems.

Once the data are imputed (TOPMed), I clean them and merge the controls with the cases. In the PCA, no population structure is visible and all individuals are homogeneously distributed. However, in the logistic regression, hundreds of signals exceed the genome-wide significance threshold (p-value < 5e-08) and the QQ-plot indicates genomic inflation.
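To put a number on the inflation seen in the QQ-plot, one common summary is the genomic inflation factor (lambda GC): the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.455). Below is a minimal sketch, assuming the logistic regression p-values come from 1-df tests and have already been loaded into a list. The function name `genomic_inflation` is only illustrative and does not come from any GWAS package.

```python
from statistics import NormalDist, median


def genomic_inflation(p_values):
    """Estimate lambda GC from a list of two-sided, 1-df association p-values.

    Hypothetical helper for illustration; not part of any GWAS toolkit.
    """
    norm = NormalDist()
    # Convert each p-value to a 1-df chi-square statistic: the squared
    # normal quantile at p/2. Values outside (0, 1] are skipped.
    chi2 = [norm.inv_cdf(p / 2.0) ** 2 for p in p_values if 0.0 < p <= 1.0]
    # Expected median of a 1-df chi-square under the null (about 0.455).
    expected_median = norm.inv_cdf(0.75) ** 2
    return median(chi2) / expected_median
```

A value close to 1 suggests well-calibrated test statistics, while values clearly above 1 are consistent with the inflation described above.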

I would like to know whether anyone has run into this before when combining cases and controls genotyped on different arrays. Thank you in advance.

genotyping GWAS imputation inflation

