Question: How to perform quality control steps on UK Biobank data?
15 months ago, anamaria wrote:


I downloaded the imputed .bgen and .sample files from UK Biobank and now plan to run a GWAS on them using PLINK 2.

Can you please tell me which QC steps I would have to perform?

I was thinking of doing these:

- remove related individuals
- remove non-EUR individuals
- remove SNPs with minor allele frequency (MAF) < 0.001
- model using ancestry information

Is there a standard pipeline for doing this in PLINK 2, or are there related files from UK Biobank?

Thanks Ana


That depends entirely on the study. I would recommend thinking about what you are trying to accomplish and exploring the literature for filtering approaches that fit that goal.

written 15 months ago by Brice Sarver


Yes, I agree, and the four QC steps I listed above are the ones I plan to do. My question is more about how to do this in PLINK 2, or in some other software.

For example, to apply the MAF filter I would run:

plink2 --bgen ukb_imp_chr17_v3.bgen ref-first --sample ukb44316_imp_chr17_v3_s487317.sample --maf 0.001 --make-bpgen --out chr17

But I don't know how to handle the other three QC steps. I should also mention that this is imputed data from UK Biobank.

Thanks Ana

written 15 months ago by anamaria

Hi! Did you manage to solve this?

written 8 months ago by catarinaglmg
8 months ago, Sam (New York) wrote:

I have a rough Nextflow pipeline for this. You can find the scripts here

You can read the help message to see which files you need, and you can read the script to see what it actually does.
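As a rough illustration of the filtering steps from the question (this is not the Nextflow pipeline itself), the sketch below builds a keep list of unrelated EUR samples with awk and passes it to plink2 together with the MAF filter, then runs an association test with principal components as covariates. The sample table `samples_demo.txt`, its column layout, and the `pheno.txt` and `covar.txt` files (with covariates named PC1 to PC10) are hypothetical. The plink2 calls only run if plink2 and the input files are present.

```shell
#!/bin/sh
# Hypothetical sample table: FID, IID, ancestry label, related flag (1 = related).
# The file name and column layout are assumptions made for this sketch.
cat > samples_demo.txt <<'EOF'
FID IID ancestry related
1001 1001 EUR 0
1002 1002 EUR 1
1003 1003 AFR 0
1004 1004 EUR 0
EOF

# Keep EUR samples that are not flagged as related; write FID and IID per line.
awk 'NR > 1 && $3 == "EUR" && $4 == 0 { print $1, $2 }' samples_demo.txt > keep_demo.txt

# Sample and variant filtering, run only when plink2 and the .bgen file exist.
if command -v plink2 >/dev/null 2>&1 && [ -f ukb_imp_chr17_v3.bgen ]; then
  plink2 --bgen ukb_imp_chr17_v3.bgen ref-first \
         --sample ukb44316_imp_chr17_v3_s487317.sample \
         --keep keep_demo.txt \
         --maf 0.001 \
         --make-bpgen --out chr17
fi

# Association test adjusted for ancestry PCs (pheno.txt and covar.txt are hypothetical).
if command -v plink2 >/dev/null 2>&1 && [ -f chr17.pgen ] \
   && [ -f pheno.txt ] && [ -f covar.txt ]; then
  plink2 --bpfile chr17 \
         --pheno pheno.txt \
         --covar covar.txt --covar-name PC1-PC10 \
         --glm hide-covar \
         --out chr17_gwas
fi
```

plink2 reads the two columns of the keep file as FID and IID. How the EUR label and the related flag are derived depends on the study and on the sample QC files you have access to.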

You will also need the GreedyRelated program I wrote to run the script, which can be found here
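GreedyRelated itself is not reproduced here, and its actual algorithm may differ. As a generic illustration of greedy pruning of related samples, the sketch below repeatedly drops the sample that has the most remaining relatives until no related pair is left. The pair file `kinship_demo.txt`, its three-column layout, and the kinship cutoff of 0.0442 are assumptions made for this example.

```shell
#!/bin/sh
# Hypothetical pair file: two sample IDs and a kinship coefficient per line.
cat > kinship_demo.txt <<'EOF'
ID1 ID2 Kinship
S1 S2 0.25
S1 S3 0.12
S4 S5 0.06
S6 S7 0.01
EOF

# Greedy pruning: while related pairs remain, remove the sample with the most
# remaining relatives (ties are broken by the smaller ID in string order).
awk -v thr=0.0442 '
NR > 1 && $3 >= thr { n++; a[n] = $1; b[n] = $2 }   # store pairs above the cutoff
END {
  while (1) {
    split("", deg)                                   # reset the relative counts
    for (i = 1; i <= n; i++)
      if (!(a[i] in rm) && !(b[i] in rm)) { deg[a[i]]++; deg[b[i]]++ }
    max = 0; best = ""
    for (k in deg)
      if (deg[k] > max || (deg[k] == max && k < best)) { max = deg[k]; best = k }
    if (max == 0) break                              # no related pair is left
    rm[best] = 1
    print best, best                                 # FID and IID for plink2 --remove
  }
}' kinship_demo.txt > remove_demo.txt

cat remove_demo.txt
```

The resulting file can be passed to plink2 with `--remove`. In the demo, S1 is dropped first because it has two relatives, which lets both S2 and S3 stay in the data.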
