I have been using plink2 to run a sex check on several of my datasets, and it always flags more problem samples than I expect. For some datasets I also have RNA-seq data, so I checked again using genes located only on chromosome X or Y. That check turned up only a few problem samples. I therefore doubt the reliability of the PLINK sex check, or perhaps some of the steps or parameters in my commands are wrong. The strangest thing is that plink2 does not scan any variants on chromosome Y, even though a number of them (7375) do exist. In addition, there are 45911 variants on chromosome X, but the log reports only 40431 scanned. An example log file is below. Any advice or comments will be appreciated.
PLINK v1.90b4 64-bit (20 Mar 2017)           www.cog-genomics.org/plink/1.9/
(C) 2005-2017 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to WGS-GATK-bcftools-Plink-update3.qc.log.
Options in effect:
  --bfile WGS-GATK-bcftools-Plink-update3
  --check-sex 0.35 0.65
  --noweb
  --out WGS-GATK-bcftools-Plink-update3.qc

Note: --noweb has no effect since no web check is implemented yet.
16384 MB RAM detected; reserving 8192 MB for main workspace.
2120644 variants loaded from .bim file.
166 people (93 males, 73 females) loaded from .fam.
158 phenotype values loaded from .fam.
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 166 founders and 0 nonfounders present.
Calculating allele frequencies... done.
Warning: 904336 het. haploid genotypes present (see WGS-GATK-bcftools-Plink-update3.qc.hh ); many commands treat these as missing.
Warning: Nonmissing nonmale Y chromosome genotype(s) present; many commands treat these as missing.
Total genotyping rate is 0.928209.
2120644 variants and 166 people pass filters and QC.
Among remaining phenotypes, 142 are cases and 16 are controls. (8 phenotypes are missing.)
--check-sex: 40431 Xchr and 0 Ychr variant(s) scanned, 86 problems detected.
Report written to WGS-GATK-bcftools-Plink-update3.qc.sexcheck .
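
The variant counts quoted above can be cross-checked against the files directly. The following is only a minimal Python sketch, not part of PLINK: the function names `count_by_chromosome` and `tabulate_sexcheck` are made up for illustration. It assumes the .bim file has the chromosome code in its first whitespace-separated column (PLINK 1.9 writes 23, 24, 25 and 26 for X, Y, XY and MT in human data), and that the .sexcheck report starts with a header line containing the PEDSEX, SNPSEX and STATUS columns.

```python
from collections import Counter

# PLINK 1.9 numeric chromosome codes for human data.
PLINK_CODES = {"23": "X", "24": "Y", "25": "XY", "26": "MT"}


def count_by_chromosome(bim_lines):
    """Count variants per chromosome from the lines of a .bim file."""
    counts = Counter()
    for line in bim_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        chrom = fields[0]
        if chrom.lower().startswith("chr"):
            chrom = chrom[3:]  # tolerate a "chr" prefix
        # Map numeric codes to names so that "23" and "X" are pooled.
        chrom = PLINK_CODES.get(chrom, chrom.upper())
        counts[chrom] += 1
    return dict(counts)


def tabulate_sexcheck(sexcheck_lines):
    """Count samples per (PEDSEX, SNPSEX, STATUS) in a .sexcheck report."""
    rows = iter(sexcheck_lines)
    header = next(rows).split()
    col = {name: i for i, name in enumerate(header)}
    table = Counter()
    for line in rows:
        fields = line.split()
        if not fields:
            continue
        key = (fields[col["PEDSEX"]], fields[col["SNPSEX"]], fields[col["STATUS"]])
        table[key] += 1
    return dict(table)


# Hypothetical usage with the file prefix from the log above:
#   with open("WGS-GATK-bcftools-Plink-update3.bim") as bim:
#       print(count_by_chromosome(bim))
#   with open("WGS-GATK-bcftools-Plink-update3.qc.sexcheck") as report:
#       print(tabulate_sexcheck(report))
```

Comparing the X, Y and XY counts from the .bim with the "Xchr" and "Ychr" numbers in the log shows where the two diverge, and the .sexcheck table shows whether the flagged samples are mostly ambiguous calls (SNPSEX 0) or outright mismatches with the recorded sex.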