TOPMed imputation
19 months ago
haasroni • 0

Hi everyone,

I am imputing SNPs from genotypes produced on Illumina's Global Screening Array, using the TOPMed Imputation Server. The server uses Eagle for phasing and minimac4 for imputation. I did not perform any pre-imputation quality control (e.g. Hardy-Weinberg tests or minor allele frequency cutoffs).
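For reference, the two pre-imputation checks mentioned above can be sketched as follows. This is a minimal illustration, not the server's own QC: the function names and the thresholds (`min_maf=0.01`, `min_hwe_p=1e-6`) are assumptions chosen for the example, and real pipelines usually run these checks with dedicated tools.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    return min(p, 1 - p)


def hwe_chi2(n_aa, n_ab, n_bb):
    """Chi-square statistic (1 d.f.) for deviation from Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic sites).
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)


def hwe_p_value(n_aa, n_ab, n_bb):
    """Upper-tail p-value of the chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(hwe_chi2(n_aa, n_ab, n_bb) / 2))


def passes_qc(n_aa, n_ab, n_bb, min_maf=0.01, min_hwe_p=1e-6):
    """Apply illustrative (assumed) MAF and HWE thresholds to one variant."""
    return (minor_allele_frequency(n_aa, n_ab, n_bb) >= min_maf
            and hwe_p_value(n_aa, n_ab, n_bb) >= min_hwe_p)
```

For example, `passes_qc(25, 50, 25)` is true for a variant in perfect Hardy-Weinberg proportions, while `passes_qc(50, 0, 50)` is false because no heterozygotes are observed.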

The goal is to impute a specific set of SNPs that belong to an existing polygenic risk score (PRS), not to run a GWAS. All of these SNPs are confirmed to be present in the 1000 Genomes Project data. So there are ~300 SNPs that I need to impute with high quality; the rest matter much less to me.

This is my first imputation experiment and I'm not sure how to interpret the quality of the result, and whether my results are typical or unusual. I am primarily looking at R-squared statistics reported by minimac4.

Here are summaries of two statistics in the imputation results:

[Figure: summary of empirical R-squared]

[Figure: summary of R-squared for imputed genotypes only]

I have two lines of inquiry:

  1. Are these quality distributions typical?

  2. Out of the 236 untyped PRS SNPs that needed to be imputed, only 39 appeared in the imputation results, and only 5 had an imputation quality over 0.5. What could be a reason for most PRS SNPs not appearing in imputation results, and could there be a reason for why the imputed PRS SNPs have such low imputation quality?

Does this indicate that the quality of the input data is low? Or maybe there are some other causes that can be controlled? Any helpful tips are appreciated!
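To make counts like "39 of 236 found, 5 above 0.5" reproducible, here is a rough sketch of how the PRS SNPs could be matched against the imputed VCF. It assumes that the output VCF carries the imputation quality in an INFO key named `R2`, and that the PRS variants are available as (chromosome, position, ref, alt) tuples on the same genome build as the output. The function names are made up for this example. Matching is by coordinate and alleles, not by rsID, and swapped ref/alt alleles are not handled.

```python
def parse_info(info):
    """Turn a VCF INFO string into a dict; flag entries map to True."""
    out = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def normalize_chrom(chrom):
    """Strip a leading 'chr' so that 'chr1' and '1' compare equal."""
    return chrom[3:] if chrom.lower().startswith("chr") else chrom


def summarize_prs_coverage(vcf_lines, prs_variants, r2_key="R2", min_r2=0.5):
    """Report which PRS variants appear in the VCF and which reach min_r2."""
    wanted = {(normalize_chrom(c), int(p), r, a) for c, p, r, a in prs_variants}
    found = {}
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t", 8)
        if len(fields) < 8:
            continue  # not a complete VCF record
        chrom, pos, _vid, ref, alts = fields[:5]
        info = parse_info(fields[7])
        for alt in alts.split(","):
            key = (normalize_chrom(chrom), int(pos), ref, alt)
            if key not in wanted:
                continue
            try:
                found[key] = float(info[r2_key])
            except (KeyError, TypeError, ValueError):
                found[key] = None  # quality missing or not numeric
    well_imputed = sorted(k for k, v in found.items()
                          if v is not None and v >= min_r2)
    return {
        "requested": len(wanted),
        "found": len(found),
        "missing": sorted(wanted - found.keys()),
        "well_imputed": well_imputed,
    }
```

With a real file this could be called as `summarize_prs_coverage(gzip.open(path, "rt"), prs_variants)`, where `path` and `prs_variants` are placeholders. A long `missing` list is worth checking for build or allele-coding mismatches before concluding that the variants were not imputed.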

imputation TOPMed PRS • 638 views