Low SNP imputation concordance - except chromosome 1
5.7 years ago
muraved ▴ 10

Hi,

I'm trying to do SNP imputation using IMPUTE2. My genotype data (bim/bed/fam) is on hg18, has been through the usual quality control, and is of mostly European ancestry. The reference panel is on hg19.

My workflow:

  1. Determine flipped and ambiguous SNPs relative to hg18, using snpflip (https://github.com/biocore-ntnu/snpflip ).
  2. Use PLINK to create map/ped files, with flipped SNPs flipped and ambiguous removed.
  3. Lift over to hg19 using liftOverPlink.py (https://github.com/sritchie73/liftOverPlink ).
  4. Use GTOOL to convert ped/map to gen/sample. Final SNP counts per chromosome in the .gen file:

    63234   1
    64938   2
    53168   3
    49266   4
    49529   5
    49570   6
    41084   7
    42120   8
    35963   9
    42686   10
    39260   11
    37593   12
    30342   13
    24798   14
    22896   15
    23652   16
    18379   17
    23539   18
    10569   19
    20286   20
    11094   21
    10292   22
    
  5. Run IMPUTE2 with the hg19 reference panel (https://mathgen.stats.ox.ac.uk/impute/1000GP_Phase3.html ), with a chunk size of 5000000 bp, -Ne 20000 and -filt_rules_l 'EUR==0'.
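To make the chunking in step 5 concrete, here is a rough sketch of how 5000000 bp intervals might be generated. It is only an illustration: the function name `chunk_intervals` and the 12 Mb end position are made up for the example, and the only thing taken from IMPUTE2 is that `-int` expects a lower and an upper position.

```shell
#!/bin/sh
# Hypothetical helper (not part of IMPUTE2): print "<lower> <upper>" for
# consecutive chunks covering positions 1..last_pos, each at most
# chunk_size bp wide. chunk_size defaults to 5000000 as in step 5.
chunk_intervals() {
    last_pos=$1
    chunk_size=${2:-5000000}
    lower=1
    while [ "$lower" -le "$last_pos" ]; do
        upper=$((lower + chunk_size - 1))
        # Clip the final chunk to the end of the region.
        if [ "$upper" -gt "$last_pos" ]; then
            upper=$last_pos
        fi
        echo "$lower $upper"
        lower=$((upper + 1))
    done
}

# Example with a made-up 12 Mb region; prints one interval per line.
chunk_intervals 12000000
```

Each printed pair would then be passed to one IMPUTE2 run as `-int <lower> <upper>`.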

I'm parsing the top-right entry in the concordance table (see https://mathgen.stats.ox.ac.uk/impute/impute_v2.html#concordance_tables ), using:

    grep -A1 "Concordance" "$summaryfile" | grep -v "Concordance" | tr -s ' ' | cut -d ' ' -f 8,9 | grep -v -- -- | tr -d '\n'; echo
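Since the `cut -f 8,9` step depends on the exact spacing of the table row, a less position-sensitive variant might look like the sketch below. It is an assumption-laden illustration, not a drop-in replacement: the function name `top_right_concordance` is made up, and it assumes that the right-most column of the first row under the header is the %Concordance value.

```shell
#!/bin/sh
# Hypothetical helper: for every header line containing "Concordance",
# print the last whitespace-separated field of the line that follows it.
# Assumption: that last field is the top-right %Concordance entry.
top_right_concordance() {
    awk '
        /Concordance/ { want = 1; next }        # header row: flag the next line
        want          { print $NF; want = 0 }   # first data row: last column
    ' "$1"
}
```

It would be called as `top_right_concordance "$summaryfile"`. Comparing its output with the original pipeline on a few summary files from chromosome 1 and from another chromosome would show whether both read the same table cell.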

When plotting these values by chromosome, it turns out that chr1 has good concordance as expected (~95%), whereas all others are pretty bad:

[Figure: concordance by chromosome, plot titled "test CEU"]

The same happens when not filtering for EUR.

I'm at a loss here, any idea what could be causing this?

SNP IMPUTE2 PLINK imputation GWAS • 1.7k views