Correct command line syntax for plink2
10 months ago
dec986 ▴ 300


I'm running PLINK v2.00a2.3LM 64-bit Intel (24 Jan 2020) and I cannot get the command line syntax correct for LD pruning.

I've been reading the documentation but am confused by what I've read there. The manual pages contain no examples, which isn't very helpful.

First, I convert the VCF files like this:

vcftools --gzvcf $vcf --plink --out $dir/$id

I've tried

~/Scripts/plink2 --noweb --file plink/MDMNFYMQ --indep 50 5 2 --out MDMNFYMQ.indep

which I got from the tutorial, but this gives the error:

Error: Unrecognized flag ('--file').

That is strange, so I tried

~/Scripts/plink2 --noweb plink/MDMNFYMQ --indep-pairwise 50 5 2 --out MDMNFYMQ.indep

but then I get

Error: Invalid --indep-pairwise r^2 threshold '2'. For more info, try "plink2 --help <flag name>" or "plink2 --help | more".

I'm totally stuck.

How can I run LD pruning with plink2?

plink2 • 640 views
10 months ago
  1. Use plink 1.9 for this. --file and --indep are not implemented in plink 2.0 yet.
  2. With both plink 1.9 and plink 2.0, VCF files should be imported directly with --vcf. vcftools --plink is very inefficient and lossy in comparison.
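To make the two points above concrete, here is a minimal sketch rather than a definitive recipe. The file names `input.vcf.gz` and `pruned` are hypothetical, the plink 1.9 binary is assumed to be on the PATH as `plink`, and the r^2 threshold of 0.2 is only an illustrative value. The original error arises because the third argument of `--indep` is a VIF threshold (where 2 is a sensible value), whereas the third argument of `--indep-pairwise` is an r^2 threshold. The `valid_r2` helper is my own conservative pre-check, not part of plink.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the value is strictly between 0 and 1.
# This is a conservative sanity check, not plink's exact validation rule.
valid_r2() {
    awk -v x="$1" 'BEGIN { exit !(x > 0 && x < 1) }'
}

vcf="input.vcf.gz"   # hypothetical input path
out="pruned"         # hypothetical output prefix
window=50            # window size, in variants
step=5               # step size, in variants
r2=0.2               # illustrative r^2 threshold (NOT 2, which is a VIF value)

if ! valid_r2 "$r2"; then
    echo "r^2 threshold '$r2' is not between 0 and 1" >&2
elif command -v plink >/dev/null 2>&1 && [ -f "$vcf" ]; then
    # Import the VCF directly; no vcftools --plink conversion step is needed.
    # Writes $out.prune.in (variants kept) and $out.prune.out (variants dropped).
    plink --vcf "$vcf" --indep-pairwise "$window" "$step" "$r2" --out "$out"
else
    echo "plink 1.9 or $vcf not found; skipping" >&2
fi
```

Passing `2` to the helper fails the check, which mirrors the error reported in the question.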

I'm having a really hard time getting all of the plink input files ready.

~/Scripts/plink1.9/plink --vcf genetic-data/ZZFNMDMF.vcf.gz --out ZZFNMDMF --make-founders --make-bed

generates .nosex, .log, .bed, .fam, and .bim files. However, I cannot generate the .ped file that is also required:

703404669@bioitutil2:~/covid_study2065$ ~/Scripts/plink1.9/plink --vcf genetic-data/ZZFNMDMF.vcf.gz --out ZZFNMDMF --recode compound-genotypes
PLINK v1.90b6.18 64-bit (16 Jun 2020)
(C) 2005-2020 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to ZZFNMDMF.log.
Options in effect:
  --out ZZFNMDMF
  --recode compound-genotypes
  --vcf genetic-data/ZZFNMDMF.vcf.gz

128738 MB RAM detected; reserving 64369 MB for main workspace.
--vcf: ZZFNMDMF-temporary.bed + ZZFNMDMF-temporary.bim + ZZFNMDMF-temporary.fam
661125 variants loaded from .bim file.
1 person (0 males, 0 females, 1 ambiguous) loaded from .fam.
Ambiguous sex ID written to ZZFNMDMF.nosex .
Using 1 thread (no multithreaded calculations invoked).
Before main variant filters, 1 founder and 0 nonfounders present.
Calculating allele frequencies... done.
Total genotyping rate is 0.925025.
661125 variants and 1 person pass filters and QC.
Note: No phenotypes present.
Error: --recode compound-genotypes cannot be used with multi-character allele
names.

Is there an alternative to Plink that doesn't require so much error-prone preparation?

  1. No .ped file is required (and .ped files should almost always be avoided in 2020 since they're horribly inefficient; this is why --file has been deliberately excluded from all plink 2.0 alpha builds). Use --bfile instead of --file to load the .bed + .bim + .fam fileset.
  2. With that said, LD pruning (and several other standard QC operations) requires a dataset with at least ~50 samples. It is not applicable to a single-person dataset.
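As a rough sketch of both points, the snippet below loads the binary fileset with `--bfile` and only attempts pruning when the `.fam` file lists enough samples. It assumes the `ZZFNMDMF` fileset from the `--make-bed` run earlier in the thread, a plink 1.9 binary on the PATH as `plink`, and an illustrative r^2 threshold of 0.2. The `count_samples` helper and the cutoff variable are hypothetical names introduced here.

```shell
#!/bin/sh
# Hypothetical helper: a .fam file has one line per sample, so count its lines.
count_samples() {
    wc -l < "$1" | tr -d ' '
}

prefix="ZZFNMDMF"   # prefix of the .bed/.bim/.fam fileset
min_samples=50      # rough lower bound suggested in the answer above

if [ -f "$prefix.fam" ] && command -v plink >/dev/null 2>&1; then
    n=$(count_samples "$prefix.fam")
    if [ "$n" -ge "$min_samples" ]; then
        # --bfile reads the binary fileset; no .ped file is involved.
        # 0.2 is an illustrative r^2 threshold; it must lie between 0 and 1.
        plink --bfile "$prefix" --indep-pairwise 50 5 0.2 --out "$prefix"
    else
        echo "Only $n sample(s) in $prefix.fam; skipping LD pruning" >&2
    fi
else
    echo "$prefix.fam or plink not found; skipping" >&2
fi
```

With the single-sample dataset from the log above, this would report one sample and skip the pruning step.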
