plink --indep LD calculation and filtering to VCF file in one command
3.1 years ago
rjzotti • 0

I want to use plink's linkage disequilibrium feature to filter my VCF file. I'm new to genomics, but after reading plink's documentation, I assumed I could do this in one command:

plink \
--bcf /input/${CHROMOSOME_ID}.vcf.gz \
--indep $LD_WINDOW_SIZE_KB $LD_STEP_SIZE $VIF_THRESHOLD \
--recode vcf \
--out /output/ch${CHROMOSOME_ID} \
--allow-extra-chr

I then use the output file, e.g., ch6.vcf, for downstream analysis. I never bothered touching the .in and .out files because according to the plink data docs:

--recode creates a new text fileset, after applying sample/variant filters and other operations.

so I assumed plink's --recode would treat the --indep pruning (controlled by my $VIF_THRESHOLD) as a variant filter. However, older Biostars posts say you have to do the filtering with the .in or .out file in a separate command. Is my original command incorrect?

3.1 years ago
Sam ★ 4.7k

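Based on my understanding, --indep doesn't perform any filtering itself; it only generates files listing the pruned SNPs (.prune.in for the variants to keep, .prune.out for the variants to drop). So you need a second plink run with --extract or --exclude to actually filter the SNPs before downstream analysis. The plink documentation on LD-based pruning describes this.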
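To make the two-step approach concrete, here is a minimal sketch for plink 1.9. The function name, its positional arguments, and the `_ld` suffix on the first-pass prefix are my own choices for illustration, not anything plink requires. I use --vcf because the input path in the question ends in .vcf.gz; swap in --bcf if the file really is BCF. It also assumes the variants have unique IDs, since --extract matches on variant ID.

```shell
#!/usr/bin/env bash
# Two-pass LD pruning sketch (hypothetical helper, not part of plink).
# Usage: ld_prune_to_vcf <input_vcf> <out_prefix> <window> <step> <vif>

ld_prune_to_vcf() {
    local input_vcf="$1" out_prefix="$2" window="$3" step="$4" vif="$5"

    # Pass 1: compute LD only. Writes <out_prefix>_ld.prune.in (variants to
    # keep) and <out_prefix>_ld.prune.out (variants to drop).
    plink --vcf "$input_vcf" \
          --indep "$window" "$step" "$vif" \
          --allow-extra-chr \
          --out "${out_prefix}_ld" || return 1

    # Pass 2: keep only the variants listed in .prune.in and write a VCF.
    plink --vcf "$input_vcf" \
          --extract "${out_prefix}_ld.prune.in" \
          --recode vcf \
          --allow-extra-chr \
          --out "$out_prefix"
}
```

With the variables from the question, a call would look like `ld_prune_to_vcf "/input/${CHROMOSOME_ID}.vcf.gz" "/output/ch${CHROMOSOME_ID}" "$LD_WINDOW_SIZE_KB" "$LD_STEP_SIZE" "$VIF_THRESHOLD"`. Using `--exclude` with the .prune.out file in the second pass should give the same result. If your VCF has missing IDs ("."), you would probably need to assign IDs first (plink 1.9 has --set-missing-var-ids for this) and use the same IDs in both passes.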
