How to perform quality control on UK Biobank data?
4.8 years ago
anamaria ▴ 220

Hello,

I downloaded imputed .bgen and .sample files from UK Biobank and am now planning to run a GWAS on them. I plan to use Plink2.

Can you please tell me which QC steps I should perform?

I was thinking of doing these:

- remove related individuals
- remove non-EUR individuals
- remove SNPs with minor allele frequency (MAF) < 0.001
- account for ancestry in the association model

Is there a standard pipeline for doing this in Plink2, or are there related files from UK Biobank that help with it?

Thanks, Ana

ukbiobank • 2.3k views

Entirely depends on the study. I would recommend thinking about what you're trying to accomplish and exploring the literature for meaningful filtering approaches depending on what you want to do.


Hi,

Yes, I agree, and the four QC steps I listed above are the ones I plan to do. My question is more about how to do this in Plink2, or in some other software.

For example, to apply the MAF filter I would run:

plink2 --bgen ukb_imp_chr17_v3.bgen ref-first --sample ukb44316_imp_chr17_v3_s487317.sample --maf 0.001 --make-bpgen --out chr17

But I don't know how to handle the other three QC steps. I should also mention that this is imputed data from UK Biobank.

Thanks, Ana


Hi! Did you manage to solve this?

4.2 years ago
Sam ★ 4.7k

I have a rough Nextflow pipeline for this. You can find the scripts here.

You can read the help message to see which files you need, and read the script to see what it actually does.

You will also need the GreedyRelated program I wrote to run the script, which can be found here.
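For anyone who only wants the rough shape of the sample-level steps from the question, here is a minimal shell sketch. It is not the pipeline above and not the GreedyRelated algorithm: it just drops the second member of every related pair, which is cruder than a proper maximal-unrelated-set selection. The file names (sample_qc.txt, rel.txt), their column layouts, the "EUR" label, and the 0.0884 kinship cutoff are all assumptions made for illustration on toy data; substitute whatever your own UK Biobank extract provides.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-in for a per-sample QC table (hypothetical layout: FID IID ancestry).
cat > sample_qc.txt <<'EOF'
FID IID ancestry
1 1 EUR
2 2 EUR
3 3 AFR
4 4 EUR
5 5 EUR
EOF

# Toy stand-in for a pairwise kinship table (hypothetical layout: ID1 ID2 Kinship).
cat > rel.txt <<'EOF'
ID1 ID2 Kinship
1 2 0.25
4 5 0.03
EOF

# Step 1: keep only samples labelled EUR (skip the header line).
awk 'NR > 1 && $3 == "EUR" {print $1, $2}' sample_qc.txt > eur_ids.txt

# Step 2: for every pair at or above the kinship cutoff, mark the second
# member for removal. The cutoff is an assumption; choose your own.
awk -v thr=0.0884 'NR > 1 && $3 >= thr {print $2}' rel.txt > drop_ids.txt

# Step 3: EUR samples that are not on the drop list form the keep list.
# FILENAME is compared to ARGV[1] so an empty drop list is handled correctly.
awk 'FILENAME == ARGV[1] {drop[$1]; next} !($2 in drop)' \
    drop_ids.txt eur_ids.txt > keep.txt

cat keep.txt

# With the real files in place, the keep list and the MAF filter could be
# applied together, for example:
#   plink2 --bgen ukb_imp_chr17_v3.bgen ref-first \
#          --sample ukb44316_imp_chr17_v3_s487317.sample \
#          --keep keep.txt --maf 0.001 \
#          --make-pgen --out chr17_qc
# Ancestry can then be modelled by passing principal components as
# covariates to the association test, e.g. --glm hide-covar --covar <PC file>.
```

The keep list has two columns without a header, which Plink2 reads as FID and IID. Whether a hard EUR filter plus principal components is enough depends on the study, as noted in the first comment.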
