Question: Using vcftools to filter SNPs by a linkage disequilibrium (r2) threshold
9 months ago, shyamie wrote:

I'm using 1000 Genomes VCFs, and I'm trying to thin out SNPs in moderate linkage disequilibrium (r2) using vcftools. In plink, I would do this with the --indep-pairwise parameter and then exclude the SNPs it outputs:

plink --bfile DATA --indep-pairwise 50 10 0.8 --out OUTPUT --noweb
plink --bfile DATA --exclude OUTPUT.prune.out --noweb --make-bed --out DATA_FILTERED

Does anyone know if there is an equivalent one or two step solution to do this using vcftools? I would like to avoid having to convert to plink format and back to vcf, if possible.

Thanks!

9 months ago, Kevin Blighe (Republic of Ireland) wrote:

Yes, this is usually done using something like:

vcftools --vcf MyVariants.vcf --hap-r2 --ld-window-bp 10000 --out MyVariants.LD.10Kbp

By the way, if you have issues importing 1000 Genomes data into PLINK, then I cover that in my tutorial (including pruning based on LD): Produce PCA bi-plot for 1000 Genomes Phase III in VCF format

Kevin


Hi Kevin, Thanks for this response, it is helpful. However, from what I understand, this command will just output a file containing the r2, D, and D’ statistics. Is there a way to actually filter based on r2 after we have this file?

written 5 months ago by danselechnik10

Hi everyone! I've got the same question and am wondering how you can actually prune for LD using VCFtools (not just identify the SNPs that are in LD). I'm wondering if you could use the option --hap-r2-positions <positions list file> to create a list of positions that are out of LD, and then use --exclude-positions to prune out the SNPs that are in or out of LD. I'm going to give this a go, but any other suggestions would be greatly appreciated!

written 29 days ago by sgalla3230
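
One way to go from the pairwise LD file to an actual exclusion list can be sketched roughly as follows. This assumes the file written by --hap-r2 has the usual columns (CHR, POS1, POS2, N_CHR, R^2, D, Dprime), with r2 in column 5. The helper name ld_prune_list and the file names in the comments are hypothetical. For each pair above the threshold it greedily drops the second SNP unless one of the two has already been dropped, so it approximates PLINK's --indep-pairwise but does not reproduce it.

```shell
# Print "CHR<TAB>POS" for SNPs to drop, given a vcftools LD file and an r2 threshold.
# $1 = pairwise LD file (e.g. MyVariants.LD.10Kbp.hap.ld), $2 = r2 threshold
ld_prune_list() {
    awk -v thr="$2" '
        NR == 1 { next }                        # skip the header line
        $5 == "nan" || $5 == "-nan" { next }    # skip pairs with undefined r2
        ($5 + 0) > (thr + 0) {
            a = $1 "\t" $2                      # first SNP of the pair
            b = $1 "\t" $3                      # second SNP of the pair
            # Drop the second SNP only if neither member is already dropped
            if (!(a in drop) && !(b in drop)) { drop[b] = 1; print b }
        }
    ' "$1"
}

# Intended use (not run here; file names are placeholders):
#   vcftools --vcf MyVariants.vcf --hap-r2 --ld-window-bp 10000 --min-r2 0.8 --out MyVariants.LD.10Kbp
#   ld_prune_list MyVariants.LD.10Kbp.hap.ld 0.8 > prune.out.positions
#   vcftools --vcf MyVariants.vcf --exclude-positions prune.out.positions --recode --recode-INFO-all --out MyVariants.pruned
```

The list is tab-separated chromosome and position, which is the layout --exclude-positions reads. Because pairs are taken in file order, which SNP of a correlated pair survives is arbitrary; PLINK instead makes that choice using allele frequency.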

VCFtools has long been superseded by BCFtools. Please use that. If you have other questions, you may open your own question.

written 29 days ago by Kevin Blighe
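
For the BCFtools route, a minimal sketch using the prune plugin might look like the following. It assumes a BCFtools build that ships the plugin and accepts -m/--max for the r2 cutoff; as far as I know some older releases used -l/--max-LD instead, so check bcftools +prune -h first. The wrapper name bcftools_ld_prune and the file names are hypothetical.

```shell
# Write an LD-pruned, bgzipped VCF in one step.
# $1 = input VCF/BCF, $2 = output .vcf.gz,
# $3 = maximum r2 allowed within the window,
# $4 = window size (a number of sites such as 50, or a distance such as 10kb)
bcftools_ld_prune() {
    bcftools +prune -m "$3" -w "$4" "$1" -Oz -o "$2"
}

# Intended use (not run here; file names are placeholders):
#   bcftools_ld_prune DATA.vcf.gz DATA_FILTERED.vcf.gz 0.8 50
```

This stays in VCF throughout, so no round trip through PLINK format is needed. The window logic is not the same as PLINK's window and step parameters, so the retained SNP set will generally differ.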

Kia ora (thank you) Kevin! I just saw your other post here. It was very helpful!

VCFtools version for LD calculations specifying bin size

written 28 days ago by sgalla3230

Kia ora bro / dudette!

written 28 days ago by Kevin Blighe