Question: PLINK Quality Control: --impute-sex returns PEDSEX of 0 for one individual. Is Impute-sex necessary?
yy237 (Canada) wrote, 12 months ago:

Hi,

I am running quality control (a sex-discrepancy check) on my binary PLINK files. I am new to GWAS and PLINK in general, so I am not sure how to handle the following:

I used the command plink --bfile file_name --check-sex and found one individual with

PEDSEX: 2, SNPSEX: 0, status: PROBLEM, and F: 0.2303. I was following a GWAS quality control manual I found online, and removed that individual from my binary files using --remove.

I then used --impute-sex and realized I brought back the individual to my dataset, now with

PEDSEX: 0, SNPSEX: 0, status: PROBLEM, and F: 0.2303.

(1) Is the --impute-sex step necessary? If the status of every other individual was 'OK', would --impute-sex change anything for those individuals? (2) Should I proceed to the next QC step right after removing that one individual, and not use --impute-sex at all?

Thank you very much!
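
For context on the numbers above: with PLINK 1.9's documented defaults, --check-sex calls a sample female when its X-chromosome F estimate is below 0.2 and male when it is above 0.8; anything in between gets SNPSEX 0. The minimal Python sketch below mirrors that rule to show why F = 0.2303 with PEDSEX 2 is reported as PROBLEM. The function names are illustrative, not part of PLINK, and the cutoffs are assumed to be the unmodified defaults.

```python
# Sketch of how --check-sex turns an X-chromosome F estimate into SNPSEX.
# Assumption: PLINK 1.9 default cutoffs (both can be overridden by the user).
FEMALE_MAX_F = 0.2  # F below this -> called female (2)
MALE_MIN_F = 0.8    # F above this -> called male (1)


def call_snpsex(f, female_max=FEMALE_MAX_F, male_min=MALE_MIN_F):
    """Return 1 (male), 2 (female), or 0 (ambiguous) for an F estimate."""
    if f < female_max:
        return 2
    if f > male_min:
        return 1
    return 0


def status(pedsex, snpsex):
    """OK only when the reported sex is known and matches the call."""
    return "OK" if pedsex != 0 and pedsex == snpsex else "PROBLEM"


snpsex = call_snpsex(0.2303)      # F of the individual in the question
print(snpsex, status(2, snpsex))  # prints: 0 PROBLEM
```

Because 0.2303 sits just above the female cutoff, the flag may say more about where the default threshold falls than about the sample itself.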

chrchang523 (United States) wrote, 12 months ago:

The manual you're following is way out of date. Stop what you're doing and look for a better one before proceeding.

(The distribution of F statistics depends on your dataset; for example, 1000 Genomes has a female with F-statistic > 0.66 which passed quality control, and a few others with F-statistics almost as high. As long as there's a clean separation between the female and male values, the sex check should not cause you to remove samples.)
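
One way to check for the clean separation described above is to compare the F values of reported females and males in the .sexcheck output. The Python sketch below assumes the PLINK 1.9 column layout (FID IID PEDSEX SNPSEX STATUS F); the sample rows are made up for illustration, apart from the F of 0.2303 taken from the question, and the helper names are hypothetical.

```python
import io

# Made-up .sexcheck content for illustration only (not real data).
SAMPLE = """FID IID PEDSEX SNPSEX STATUS F
F1 S1 2 2 OK 0.0312
F2 S2 2 0 PROBLEM 0.2303
F3 S3 1 1 OK 0.9871
F4 S4 1 1 OK 0.9645
"""


def read_sexcheck(handle):
    """Parse whitespace-delimited .sexcheck rows into (iid, pedsex, f)."""
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}
    rows = []
    for line in handle:
        parts = line.split()
        if not parts or parts[col["F"]].lower() == "nan":
            continue  # skip blank lines and samples without an F estimate
        rows.append(
            (parts[col["IID"]], int(parts[col["PEDSEX"]]), float(parts[col["F"]]))
        )
    return rows


def separation_gap(rows):
    """Return (max female F, min male F); a clean split has max < min."""
    female_f = [f for _, sex, f in rows if sex == 2]
    male_f = [f for _, sex, f in rows if sex == 1]
    if not female_f or not male_f:
        raise ValueError("need at least one reported female and one reported male")
    return max(female_f), min(male_f)


rows = read_sexcheck(io.StringIO(SAMPLE))
max_female, min_male = separation_gap(rows)
print(max_female, min_male, max_female < min_male)
```

For a real run, the io.StringIO(SAMPLE) handle would be replaced with an open file handle on the .sexcheck file. If the highest female F is well below the lowest male F, the two groups are separated even when some values fall outside the default cutoffs; plotting the full distribution is a more thorough check than this two-number summary.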


yy237 replied, 12 months ago:

Hello, thank you very much for your reply. I have been looking for more recent protocols for the past two weeks, but I have found a wide variety of approaches and thresholds. Many GWAS papers refer back to Anderson et al., 2010 (Data quality control in genetic case-control association studies), but I'm worried that it is not the most up to date either. I then found this PDF (https://cran.r-project.org/web/packages/plinkQC/vignettes/plinkQC.pdf) from the R website, written in March 2019, which I believe should be more applicable. Do you have any recommendations for introductory articles I could read for more background and recent updates on GWAS QC? Thank you once again! I really appreciate it.