Question: Imputation on two genotyping datasets: should I impute them separately, or merge the two datasets first?
2.6 years ago, Tao wrote:

Hi guys,

I'm doing an eQTL analysis. The genotyping data come from two sequencing centers that used the same type of SNP chip, but one center achieved a better SNP call rate than the other: ~100,000 more SNPs were called. I ran QC on the two datasets separately. QC also introduces differences between the SNP sets of the two datasets, which means some SNPs are removed from one dataset but not from the other.
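To put a number on how far the two SNP sets have drifted apart after QC, something like the following rough sketch could help. It assumes the post-QC genotypes are in PLINK format, where the second column of a `.bim` file holds the variant ID; the function names and the toy records are made up for illustration and are not from any real dataset.

```python
import io


def read_bim_ids(handle):
    """Collect variant IDs (column 2) from PLINK .bim-formatted lines."""
    ids = set()
    for line in handle:
        fields = line.split()
        if len(fields) >= 2:  # skip blank or malformed lines
            ids.add(fields[1])
    return ids


def compare_snp_sets(ids_a, ids_b):
    """Split two sets of variant IDs into shared and dataset-specific parts."""
    return {
        "shared": ids_a & ids_b,
        "only_a": ids_a - ids_b,
        "only_b": ids_b - ids_a,
    }


# Toy stand-ins for the two centers' post-QC .bim files (hypothetical data).
center_a = io.StringIO(
    "1\trs1\t0\t1000\tA\tG\n"
    "1\trs2\t0\t2000\tC\tT\n"
    "1\trs3\t0\t3000\tG\tA\n"
)
center_b = io.StringIO(
    "1\trs2\t0\t2000\tC\tT\n"
    "1\trs3\t0\t3000\tG\tA\n"
    "1\trs4\t0\t4000\tT\tC\n"
)

summary = compare_snp_sets(read_bim_ids(center_a), read_bim_ids(center_b))
for label, ids in summary.items():
    print(label, len(ids))
```

With real data, `open("center_a.bim")` (a hypothetical path) would replace the `io.StringIO` objects. Matching on IDs alone assumes both centers name variants the same way; matching on chromosome and position would be an alternative.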

Now I am stuck on the imputation step. Should I impute the two datasets separately and then combine the imputed genotypes for the eQTL analysis? Or should I first combine the two QCed genotyping datasets and impute them together? I don't know much about the principles of genotype imputation, so I hope someone can help me with this. Thanks!


genotyping eqtl imputation snps

In case someone else runs into a similar situation, I'd like to answer this question myself. The GTEx (v6p) protocol used two different genotyping arrays: OMNI 5M for the pilot phase and OMNI 2.5M for the mid-phase. They first downsized the 5M data to the 2.5M subset of variants, and then did QC and imputation. But I think the other way is also feasible when only a small portion of variants is shared between the datasets, for example because of different array platforms or manufacturers. That's the approach I adopted: I did QC on each genotype batch separately, imputed each batch, and then merged them after imputation.
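As a rough illustration of the merge-after-imputation route, here is a minimal sketch. It assumes each imputed batch has already been loaded into memory as a dictionary mapping variant ID to per-sample dosages, and that both batches were imputed against the same reference panel so that variant IDs are comparable. The function name, data layout, and toy values are all hypothetical; real pipelines would normally do this on VCF files with dedicated tools.

```python
def merge_imputed_batches(batch_a, batch_b):
    """Merge two imputed batches, keeping only variants present in both.

    Each batch maps variant ID -> {sample ID: dosage}. This layout is an
    assumption made for the sketch, not a standard file format.
    """
    shared_variants = sorted(set(batch_a) & set(batch_b))
    merged = {}
    for variant in shared_variants:
        # A sample appearing in both batches would be silently overwritten,
        # so treat it as an error instead.
        duplicated = set(batch_a[variant]) & set(batch_b[variant])
        if duplicated:
            raise ValueError(f"samples present in both batches: {sorted(duplicated)}")
        merged[variant] = {**batch_a[variant], **batch_b[variant]}
    return merged


# Toy dosages for two separately imputed batches (hypothetical values).
batch_a = {"rs1": {"A1": 0.0, "A2": 1.0}, "rs2": {"A1": 2.0, "A2": 1.1}}
batch_b = {"rs2": {"B1": 0.9}, "rs3": {"B1": 1.8}}

merged = merge_imputed_batches(batch_a, batch_b)
print(merged)
```

Only variants imputed in both batches survive the merge, so every sample has a value at every retained variant. The sketch does not check that alleles and strand agree between batches, which would need to be verified separately.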

written 19 months ago by Tao