Hi everybody,
I have just run a GWAS (linear regression with covariates) in PLINK (v1.90b3.38). I found a locus with several significantly associated SNPs, and I am now running a conditional model (using PLINK's --condition-list option) to look for secondary association signals.
Specifically, I am conditioning on my top-associated SNP (saved in a text file) and including only the variants in the identified locus (selected with PLINK's --extract option).
This is the command I am running:
plink --bfile myplink --extract mysnsp.txt --pheno mypheno.txt --covar mycovar.txt \
  --no-parents --allow-no-sex --missing-phenotype -9 --maf 0.05 --geno 0.05 \
  --hwe 1e-09 --linear --ci 0.95 --condition-list best.txt --out conditional_model
However, my dataset includes some variants with duplicated IDs (different alleles at the same position), such as:
19 19:12161188 0 12161188 TT T
19 19:12161188 0 12161188 T TTA
and the conditional analysis terminates with the following error:
...
Phenotype data is quantitative.
Error: Duplicate ID '19:12161188'.
Let me stress that a standard association analysis (that is, one run without the --condition-list option) works fine.
For now, I am manually adding the allelic dosage of the top-associated SNP to the set of covariates, but I wonder whether there is a more elegant solution, and also why duplicate variant IDs are a problem for conditional analyses but not for standard association analyses.
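In case it helps, here is a minimal sketch of that manual workaround. It rests on several assumptions: that the dosage of the top SNP was exported to a hypothetical file top.raw with something like `plink --bfile myplink --snp <top SNP ID> --recode A --out top`, that mycovar.txt starts with a header line whose first two columns are FID and IID, and that the helper name add_dosage_covar is just an illustrative name of my own.

```shell
# Sketch only: append the top SNP's allele count to the covariate file.
# Assumes top.raw has the usual .raw layout
#   FID IID PAT MAT SEX PHENOTYPE <SNP>_<allele>
# so the dosage is in column 7, and that both files carry a header line
# beginning with "FID IID" (the header row is matched like any sample).
add_dosage_covar() {
    raw="$1"      # .raw file holding the top SNP only (hypothetical name)
    covar="$2"    # original covariate file
    awk '
        # First file: remember the dosage for each FID/IID pair.
        NR == FNR { dosage[$1 FS $2] = $7; next }
        # Second file: keep samples present in both, appending the dosage.
        ($1 FS $2) in dosage { print $0, dosage[$1 FS $2] }
    ' "$raw" "$covar"
}
```

It would be used as `add_dosage_covar top.raw mycovar.txt > mycovar_plus_top.txt` (the output name is again hypothetical), and the result passed to --covar in the same command with --condition-list removed. Missing genotypes show up as NA in the .raw file and are copied through unchanged, so it is worth checking how they are treated in the covariate file.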
Thank you very much!