Hello all,
I'm using PLINK2 to run an association study on a case-control trait, with 4 covariates.
Here is my command:
plink2 --vcf hdplus.new1.filter.chr1.vcf.gz --const-fid 0 --no-parents --no-sex --allow-no-sex --pheno pheno.file --all-pheno --covar covar.cov --logistic --out hdplus.chr1.filter.cov
PLINK2 ran without any error messages, but every TEST row in the output is NA (ADD plus the rows for all 4 covariates).
"199 variants loaded from .bim file.
9105 people (0 males, 0 females, 9105 ambiguous) loaded from .fam.
Ambiguous sex IDs written to hdplus.chr1.filter.cov.nosex .
9105 phenotype values present after --pheno.
Using 1 thread (no multithreaded calculations invoked).
--covar: 1 out of 4 covariates loaded.
Before main variant filters, 9105 founders and 0 nonfounders present.
Calculating allele frequencies... done.
199 variants and 9105 people pass filters and QC.
Among remaining phenotypes, 1802 are cases and 7303 are controls.
9105 phenotype values present after --pheno.
johnes has 1802 cases and 7303 controls.
Writing logistic model association results to
hdplus.chr1.filter.cov.johnes.assoc.logistic ... done."
If I remove the option "--covar covar.cov", I get results for the ADD test. Does anybody know how to fix this? Thank you!
Hi,
Can you send me a small test dataset I can use to reproduce this issue?
Phenotype file:
FID IID Pheno
0 13842839 1
0 13605801 1
0 13135844 2
0 13216263 2
covar file:
FID IID yob prop_jer prop_hf prop_os
0 13842839 1996 0.5000 0.5000 0.1562
0 13605801 1996 0.0000 1.0000 0.6875
0 13135844 1995 0.5000 0.5000 0.0000
0 13216263 1995 0.5000 0.5000 0.0625
VCF file:
##fileformat=VCFv4.1
##fileDate=2014-12-31 01:03:25
##source="b4.jar (r1196)"
##INFO=<ID=AF,Number=A,Type=Float,Description="Estimated Allele Frequencies">
##INFO=<ID=AR2,Number=1,Type=Float,Description="Allelic R-Squared: estimated correlation between most probable ALT dose and true ALT dose">
##INFO=<ID=DR2,Number=1,Type=Float,Description="Dosage R-Squared: estimated correlation between estimated ALT dose [P(RA) + 2*P(AA)] and true ALT dose">
##INFO=<ID=MAF,Number=1,Type=Float,Description="Minor allele frequency according to genotype probabilities">
##INFO=<ID=MAFc,Number=1,Type=Float,Description="Minor allele frequency according to calls">
##FORMAT=<ID=DS,Number=1,Type=Float,Description="estimated ALT dose [P(RA) + P(AA)]">
##FORMAT=<ID=GP,Number=G,Type=Float,Description="Estimated Genotype Probability">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT 13135844 13216263 13605801 13842839
Chr1 45002577 rs41695900 T C . PASS AR2=0.710;DR2=0.773;AF=0.796;MAF=0.204;MAFc=0.191 GT:DS:GP 1|1:1.81:0,0.189,0.81 0|1:1.411:0.001,0.588,0.412 1|1:1.999:0,0.001,0.999 1|0:0.853:0.161,0.825,0.014
Command:
plink2 --vcf test.vcf.gz --const-fid 0 --no-parents --no-sex --allow-no-sex --pheno test.pheno --all-pheno --covar test.cov --covar-name yob --linear --out test
If I remove --covar test.cov --covar-name yob, it gives an association result for the one SNP.
Thank you!
In this test dataset, the problem is caused by multicollinearity: Pheno and yob are perfectly negatively correlated (both samples with Pheno=2 have yob=1995, and both with Pheno=1 have yob=1996). You need to leave yob out of the covariates in this situation.
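If you want to check this yourself before rerunning, here is a minimal Python sketch (not part of PLINK; the `pearson` helper and the dictionary names are just illustrative) that computes the Pearson correlation between Pheno and yob for the four test samples posted above, matched by IID:

```python
import math

# Values copied from the test phenotype and covariate files in this thread,
# keyed by IID so the two columns are matched per sample.
pheno = {"13842839": 1, "13605801": 1, "13135844": 2, "13216263": 2}
yob = {"13842839": 1996, "13605801": 1996, "13135844": 1995, "13216263": 1995}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


ids = sorted(pheno)  # same sample order for both columns
r = pearson([pheno[i] for i in ids], [yob[i] for i in ids])
print(f"r(Pheno, yob) = {r:.3f}")
```

A correlation of exactly -1 or +1 between the phenotype and a covariate means that covariate alone determines the phenotype in this sample. On a real dataset you would read the columns from the phenotype and covariate files instead of typing them in.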