(Plink) Sex as a covariate: What's the coding for male/female?
4 months ago
Armaboo111 ▴ 10

What is the coding for male and female in plink covariate files? Is it (male = 1, female = 0), or ('1' = male, '2' = female, '0' = unknown)? Is the sex coding different in covariate files and .fam files?

I'm asking because the plink 1.9 manual seems to make inconsistent statements, as shown below. Thank you in advance.

Where the plink 1.9 manual discusses the male/female coding for covariates, it says: "By default, when at least one male and one female is present, sex (male = 1, female = 0) is automatically added as a covariate on X chromosome SNPs, and nowhere else."

Where the plink 1.9 manual discusses the male/female coding for .fam files, it says: "Sex code ('1' = male, '2' = female, '0' = unknown)."

sex plink covariate • 399 views
4 months ago

The expected coding in the .fam file is male='1', female='2'; this is then coded by plink 1.x as male=1, female=0 during the --linear/--logistic regression. In other words, you would get the same results (outside of chrX) with --linear sex as you would with --linear combined with a covariate file with a male=1, female=0 sex covariate added.
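For anyone who wants to build that covariate file by hand, here is a minimal Python sketch of the recoding. It assumes a standard six-column .fam layout (FID, IID, father, mother, sex, phenotype), a header line of `FID IID SEX`, and `-9` as the missing-covariate value. The function name, the sample IDs, and the `SEX` column name are made up for illustration, so check them against your own files and the --covar documentation.

```python
def fam_to_sex_covar(fam_lines, missing="-9"):
    """Build covariate rows (FID IID SEX) from .fam lines.

    .fam sex codes: '1' = male, '2' = female, '0' = unknown.
    Output coding:  male = 1, female = 0, anything else = `missing`
    (the missing value is an assumption; adjust to your setup).
    """
    recode = {"1": "1", "2": "0"}
    rows = ["FID IID SEX"]  # header line (column name is hypothetical)
    for line in fam_lines:
        fields = line.split()
        if len(fields) < 6:
            continue  # skip blank or malformed lines
        fid, iid, sex = fields[0], fields[1], fields[4]
        rows.append(f"{fid} {iid} {recode.get(sex, missing)}")
    return rows


# Made-up example input, not real data.
example_fam = [
    "FAM1 IND1 0 0 1 2",
    "FAM1 IND2 0 0 2 1",
    "FAM2 IND3 0 0 0 -9",
]
covar_rows = fam_to_sex_covar(example_fam)
print("\n".join(covar_rows))
```

The printed rows can be written to a text file and passed to --covar.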

Yes, this is counterintuitive, so I got rid of this discrepancy in plink 2.0. From its --glm documentation: "Note that PLINK 2.0 encodes the .fam/.psam sex covariate as male = 1, female = 2, to match the actual numbers in the input file. This is a minor change from PLINK 1.x." So with PLINK 2.0, if you use male=1, female=2 coding in both file types, you don't have to worry about the sign of the sex beta coefficient changing on you with .fam vs. --covar.

(With that said, even with PLINK 1.x, you don't have to worry about any p-values being affected by the 1/2 vs. 1/0 coding.)
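To see why only the sign is affected, here is a small self-contained Python sketch (not plink's actual code) that fits a one-predictor least-squares regression twice, once with male=1/female=2 and once with male=1/female=0. The phenotype values are invented for illustration, and a single predictor is used only to keep the arithmetic short. Since the second coding is just 2 minus the first, the slope flips sign while the t-statistic keeps its magnitude, so the p-value is unchanged.

```python
import math


def slope_and_t(x, y):
    """Ordinary least squares for y = a + b*x; returns (b, t-statistic of b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Invented toy data: .fam-style sex codes and a quantitative phenotype.
fam_sex = [1, 2, 1, 2, 2, 1, 1, 2]
phenotype = [5.1, 6.3, 4.8, 6.9, 6.0, 5.5, 4.9, 6.4]

sex_12 = fam_sex                                # male = 1, female = 2
sex_10 = [1 if s == 1 else 0 for s in fam_sex]  # male = 1, female = 0

beta_12, t_12 = slope_and_t(sex_12, phenotype)
beta_10, t_10 = slope_and_t(sex_10, phenotype)

print("1/2 coding: beta =", beta_12, " t =", t_12)
print("1/0 coding: beta =", beta_10, " t =", t_10)
```

The two betas come out equal in size and opposite in sign, and the two t-statistics have the same absolute value.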


Thank you so much, chrchang523!