Question: Using vcftools to filter SNPs by a linkage disequilibrium (r2) threshold
11 months ago by
shyamie10
shyamie10 wrote:

I'm using 1000 Genomes VCFs, and I'm trying to thin out SNPs in moderate linkage disequilibrium (r2) using vcftools. In plink, I would do this with the --indep-pairwise parameter and then exclude the SNPs it lists in its output:

plink --bfile DATA --indep-pairwise 50 10 0.8 --out OUTPUT --noweb
plink --bfile DATA --exclude OUTPUT.prune.out --noweb --make-bed --out DATA_FILTERED

Does anyone know if there is an equivalent one- or two-step solution for doing this with vcftools? I would like to avoid converting to plink format and back to VCF, if possible.

Thanks!

vcftools • 1.8k views
modified 11 months ago by Kevin Blighe35k • written 11 months ago by shyamie10
11 months ago by
Kevin Blighe35k
Republic of Ireland
Kevin Blighe35k wrote:

Yes, this is usually done using something like:

vcftools --vcf MyVariants.vcf --hap-r2 --ld-window-bp 10000 --out MyVariants.LD.10Kbp

By the way, if you have issues importing 1000 Genomes data into PLINK, then I cover that in my tutorial (including pruning based on LD): Produce PCA bi-plot for 1000 Genomes Phase III in VCF format

Kevin

modified 11 months ago • written 11 months ago by Kevin Blighe35k

Hi Kevin, Thanks for this response, it is helpful. However, from what I understand, this command will just output a file containing the r2, D, and D’ statistics. Is there a way to actually filter based on r2 after we have this file?

written 7 months ago by danselechnik10

Hi everyone! I've got the same question and am wondering how you can actually prune for LD using VCFtools (not just identify the SNPs that are in LD). I'm wondering if you could use the command --hap-r2-positions <positions list file> to create a list of positions that are out of LD, and then use --exclude-positions to prune out the SNPs that are in or out of LD. I'm going to give this a go, but if there are any other suggestions, that would be greatly appreciated!
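
One way this idea might be sketched, assuming the pairwise table written by --hap-r2 has the columns CHR, POS1, POS2, N_CHR, R^2, D, Dprime: walk the pairs in file order and, whenever a pair exceeds the r2 threshold and neither SNP has already been dropped, drop the second one. The helper name ld_exclude_list and all file names are hypothetical, and this greedy pass is only an approximation of plink's --indep-pairwise, which works over windows of SNP counts with a step.

```shell
# Hypothetical helper: print "CHR<TAB>POS" for SNPs to drop, given a vcftools
# pairwise LD table (columns: CHR POS1 POS2 N_CHR R^2 ...) and an r2 threshold.
ld_exclude_list() {
    ld_file=$1
    threshold=$2
    awk -v thr="$threshold" '
        NR == 1 { next }                      # skip the header line
        $5 !~ /^[0-9.eE+-]+$/ { next }        # skip non-numeric r2 values (e.g. nan)
        ($5 + 0) > (thr + 0) {
            first  = $1 "\t" $2
            second = $1 "\t" $3
            # Drop the second SNP only if neither member of the pair is already gone
            if (!(first in dropped) && !(second in dropped)) {
                dropped[second] = 1
                print second
            }
        }
    ' "$ld_file"
}

# Intended use (assumes vcftools is installed; file names are placeholders):
#   vcftools --vcf MyVariants.vcf --hap-r2 --ld-window-bp 10000 --out MyVariants.LD.10Kbp
#   ld_exclude_list MyVariants.LD.10Kbp.hap.ld 0.8 > exclude.positions
#   vcftools --vcf MyVariants.vcf --exclude-positions exclude.positions \
#            --recode --recode-INFO-all --out MyVariants.pruned
```

The same helper should also work on the output of --geno-r2 for unphased data, since r2 is in the fifth column there too. Because only pairs inside the --ld-window-bp window are reported, the window chosen in the first step also bounds the pruning.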

written 3 months ago by sgalla3230

VCFtools has long been superseded by BCFtools. Please use that. If you have other questions, you may open your own question.
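
For anyone following this pointer, a minimal sketch of LD pruning with the BCFtools prune plugin might look like the following. It assumes a recent BCFtools in which the plugin accepts -m (maximum r2) and -w (window size); option names have differed between releases, so check the output of bcftools +prune --help first. The wrapper ld_prune_vcf, the DRY_RUN switch and the file names are hypothetical.

```shell
# Hypothetical wrapper around the bcftools "prune" plugin.
# Usage: ld_prune_vcf INPUT OUTPUT MAX_R2 WINDOW
# With DRY_RUN=1 the command is printed instead of executed.
ld_prune_vcf() {
    in_vcf=$1
    out_vcf=$2
    max_r2=$3     # sites with r2 above this value within the window are removed
    window=$4     # e.g. 50kb for base pairs, or a plain integer for a number of sites
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo bcftools +prune -m "$max_r2" -w "$window" "$in_vcf" -Oz -o "$out_vcf"
    else
        bcftools +prune -m "$max_r2" -w "$window" "$in_vcf" -Oz -o "$out_vcf"
    fi
}
```

For example, ld_prune_vcf MyVariants.vcf.gz MyVariants.pruned.vcf.gz 0.8 50kb would write a pruned, compressed VCF in one step without going through plink format. The window here has no step parameter, so the result should not be expected to match plink's --indep-pairwise 50 10 0.8 exactly.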

written 3 months ago by Kevin Blighe35k

Kia ora (thank you) Kevin! I just saw your other post here. It was very helpful!

VCFtools version for LD calculations specifying bin size

written 3 months ago by sgalla3230

Kia ora bro / dudette!

written 3 months ago by Kevin Blighe35k