Question: PLINK2 association study using covariate regression model
5.8 years ago by tracylee7001 (New Zealand):

Hello all,

I'm using PLINK2 to run an association study on a case-control trait. I have 4 covariates.

Here is my command:

>plink2 --vcf hdplus.new1.filter.chr1.vcf.gz --const-fid 0 --no-parents --no-sex --allow-no-sex --pheno pheno.file --all-pheno --covar covar.cov --logistic --out hdplus.chr1.filter.cov

PLINK2 ran without any error message, but it reported NA for every TEST row (ADD plus the rows for all 4 covariates).

"199 variants loaded from .bim file.
9105 people (0 males, 0 females, 9105 ambiguous) loaded from .fam.
Ambiguous sex IDs written to hdplus.chr1.filter.cov.nosex .
9105 phenotype values present after --pheno.
Using 1 thread (no multithreaded calculations invoked).
--covar: 1 out of 4 covariates loaded.
Before main variant filters, 9105 founders and 0 nonfounders present.
Calculating allele frequencies... done.
199 variants and 9105 people pass filters and QC.
Among remaining phenotypes, 1802 are cases and 7303 are controls.
9105 phenotype values present after --pheno.
johnes has 1802 cases and 7303 controls.
Writing logistic model association results to
hdplus.chr1.filter.cov.johnes.assoc.logistic ... done."


If I remove the option "--covar covar.cov", it gives results for the ADD test. Does anybody know how to fix this? Thank you!



Can you send me a small test dataset I can use to reproduce this issue?

written 5.8 years ago by chrchang523

Phenotype file:

0 13842839 1
0 13605801 1
0 13135844 2
0 13216263 2

covar file:
FID IID yob prop_jer prop_hf prop_os
0 13842839 1996 0.5000 0.5000 0.1562
0 13605801 1996 0.0000 1.0000 0.6875
0 13135844 1995 0.5000 0.5000 0.0000
0 13216263 1995 0.5000 0.5000 0.0625


VCF file:

##fileDate=2014-12-31 01:03:25
##source="b4.jar (r1196)"
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated Allele Frequencies">
##INFO=<ID=AR2,Number=1,Type=Float,Description="Allelic R-Squared: estimated correlation between most probable ALT dose and true ALT dose">
##INFO=<ID=DR2,Number=1,Type=Float,Description="Dosage R-Squared: estimated correlation between estimated ALT dose [P(RA) + 2*P(AA)] and true ALT dose">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Minor allele frequency according to genotype probabilities">
##INFO=<ID=MAFc,Number=1,Type=Float,Description="Minor allele frequency according to calls">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="estimated ALT dose [P(RA) + P(AA)]">
##FORMAT=<ID=GP,Number=G,Type=Float,Description="Estimated Genotype Probability">
#CHROM  POS     ID      REF     ALT     QUAL    FILTER  INFO    FORMAT  13135844        13216263        13605801        13842839
Chr1    45002577        rs41695900      T       C       .       PASS    AR2=0.710;DR2=0.773;AF=0.796;MAF=0.204;MAFc=0.191       GT:DS:GP        1|1:1.81:0,0.189,0.81   0|1:1.411:0.001,0.588,0.412     1|1:1.999:0,0.001,0.999 1|0:0.853:0.161,0.825,0.014



plink2 --vcf test.vcf.gz --const-fid 0 --no-parents --no-sex --allow-no-sex --pheno test.pheno  --all-pheno --covar test.cov --covar-name yob --linear --out test

If I remove --covar test.cov --covar-name yob, it gives an association result for the one SNP.

Thank you!

written 5.8 years ago by tracylee7001

In this test dataset, the problem is caused by multicollinearity: pheno and yob are perfectly negatively correlated. In this situation, yob cannot be included as a covariate.

written 5.8 years ago by chrchang523
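
For anyone hitting the same NA output, a quick way to check for this kind of collinearity before running PLINK is to correlate each covariate with the phenotype. The snippet below is only a minimal sketch: it hard-codes the four test samples posted above, and the `pearson` helper and the 0.999 cutoff are illustrative choices of mine, not anything PLINK itself uses.

```python
from math import sqrt

# Test data copied from the thread, keyed by IID.
pheno = {"13842839": 1, "13605801": 1, "13135844": 2, "13216263": 2}
covars = {
    "yob":      {"13842839": 1996,   "13605801": 1996,   "13135844": 1995,   "13216263": 1995},
    "prop_jer": {"13842839": 0.5000, "13605801": 0.0000, "13135844": 0.5000, "13216263": 0.5000},
    "prop_hf":  {"13842839": 0.5000, "13605801": 1.0000, "13135844": 0.5000, "13216263": 0.5000},
    "prop_os":  {"13842839": 0.1562, "13605801": 0.6875, "13135844": 0.0000, "13216263": 0.0625},
}

def pearson(xs, ys):
    """Pearson correlation; returns None if either input has zero variance."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)

# Align samples by IID so phenotype and covariate values line up.
ids = sorted(pheno)
y = [pheno[i] for i in ids]
results = {name: pearson([col[i] for i in ids], y) for name, col in covars.items()}

THRESHOLD = 0.999  # arbitrary cutoff for "(near-)perfect" correlation (assumption)
flagged = [name for name, r in results.items()
           if r is not None and abs(r) > THRESHOLD]

for name, r in results.items():
    print(name, "constant" if r is None else f"{r:+.3f}")
print("flagged:", flagged)
```

On the four posted samples this flags only yob, with a correlation of -1. On a real dataset you would read pheno.file and covar.cov instead of hard-coding values, and you would also want to check the covariates against each other, which this sketch does not do.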