Using vcftools to filter SNPs by a linkage disequilibrium (r2) threshold
6.2 years ago
shyamie ▴ 30

I'm using 1000 Genomes VCFs, and I'm trying to thin out SNPs in moderate linkage disequilibrium (r2) using vcftools. In plink, I would do this with the --indep-pairwise parameter and then exclude the SNPs it lists in the .prune.out file:

plink --bfile DATA --indep-pairwise 50 10 0.8 --out OUTPUT --noweb
plink --bfile DATA --exclude OUTPUT.prune.out --noweb --make-bed --out DATA_FILTERED

Does anyone know if there is an equivalent one or two step solution to do this using vcftools? I would like to avoid having to convert to plink format and back to vcf, if possible.

Thanks!

6.2 years ago

Yes, this is usually done using something like:

vcftools --vcf MyVariants.vcf --hap-r2 --ld-window-bp 10000 --out MyVariants.LD.10Kbp

By the way, if you have issues importing 1000 Genomes data into PLINK, then I cover that in my tutorial (including pruning based on LD): Produce PCA bi-plot for 1000 Genomes Phase III in VCF format

Kevin


Hi Kevin, Thanks for this response, it is helpful. However, from what I understand, this command will just output a file containing the r2, D, and D’ statistics. Is there a way to actually filter based on r2 after we have this file?


Hi everyone! I have the same question: how can you actually prune for LD using VCFtools, rather than just identify the SNPs that are in LD? I'm wondering whether you could use the --hap-r2-positions <positions list file> option to create a list of positions that are out of LD, and then use --exclude-positions to prune out the SNPs that are in or out of LD. I'm going to give this a go, but any other suggestions would be greatly appreciated!
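One possible way to turn the pairwise r2 table into an exclusion list is sketched below. It assumes the .hap.ld file has the usual vcftools columns (CHR, POS1, POS2, N_CHR, R^2, D, Dprime) and applies a simple greedy rule: for each pair above the threshold, drop the second SNP unless one of the two has already been dropped. The function names ld_prune_list and prune_vcf_by_ld are made up for illustration, and this is not equivalent to plink's sliding-window --indep-pairwise, so the retained SNP set will differ.

```shell
# Hypothetical helper: print CHR<TAB>POS for SNPs to exclude.
# $1 = vcftools .hap.ld file, $2 = r2 threshold
ld_prune_list() {
    awk -v thr="$2" 'BEGIN { FS = OFS = "\t" }
        NR == 1 { next }                          # skip the header line
        $5 == "nan" || $5 == "-nan" { next }      # skip undefined r2 values
        ($5 + 0) > thr {
            a = $1 FS $2; b = $1 FS $3
            # greedy rule: keep the first SNP of the pair, drop the second
            if (!(a in drop) && !(b in drop)) { drop[b] = 1; print $1, $3 }
        }' "$1"
}

# Hypothetical wrapper (defined here, not run): compute LD, build the
# exclusion list, then write a pruned VCF to PREFIX.pruned.recode.vcf.
# $1 = input VCF, $2 = output prefix, $3 = r2 threshold (default 0.8)
prune_vcf_by_ld() {
    vcf=$1; prefix=$2; thr=${3:-0.8}
    vcftools --vcf "$vcf" --hap-r2 --ld-window-bp 10000 --min-r2 "$thr" --out "$prefix" &&
    ld_prune_list "$prefix.hap.ld" "$thr" > "$prefix.exclude.txt" &&
    vcftools --vcf "$vcf" --exclude-positions "$prefix.exclude.txt" \
             --recode --recode-INFO-all --out "$prefix.pruned"
}
```

For example, prune_vcf_by_ld MyVariants.vcf MyVariants.LD 0.8 would leave the pruned calls in MyVariants.LD.pruned.recode.vcf. Note that --hap-r2 expects phased genotypes; for unphased data, --geno-r2 writes a .geno.ld file with a different column layout, so the awk column index would need adjusting.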


VCFtools has long been superseded by BCFtools. Please use that. If you have other questions, you may open your own question.
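For anyone following this suggestion, BCFtools ships a prune plugin that removes sites by LD directly from a VCF/BCF. The sketch below is a rough analogue of the plink command in the question, not an exact equivalent: the window here is counted in sites with no step size, and the option names (-m for the maximum r2, -w for the window) are as found in recent releases, while older ones may differ, so check bcftools +prune --help for your version. The wrapper name prune_with_bcftools and its defaults are made up for illustration.

```shell
# Hypothetical wrapper around the bcftools prune plugin.
# $1 = input VCF/BCF, $2 = output .vcf.gz,
# $3 = max r2 (default 0.8), $4 = window in sites (default 50)
prune_with_bcftools() {
    in_vcf=$1; out_vcf=$2; max_r2=${3:-0.8}; window=${4:-50}
    # fail early with a distinct exit code if the input is not readable
    [ -r "$in_vcf" ] || { echo "cannot read input: $in_vcf" >&2; return 2; }
    # discard sites whose r2 with another site in the window exceeds max_r2
    bcftools +prune -m "$max_r2" -w "$window" "$in_vcf" -Oz -o "$out_vcf"
}
```

Usage would look like prune_with_bcftools DATA.vcf.gz DATA_FILTERED.vcf.gz 0.8 50, which avoids the round trip through plink format.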


Kia ora (thank you) Kevin! I just saw your other post here. It was very helpful!

VCFtools version for LD calculations specifying bin size


Kia ora bro / dudette!
