Question: Using PLINK to filter VCF files
4.1 years ago, Diego.Morales wrote:

Hello,

I am trying to filter a VCF file based on the R2 value that each SNP has. I would like to keep only the entries that have an R2 >= 0.3.

Are these the right commands/procedure?

plink --vcf dataset.dose.vcf --recode --out dataset.dose #(vcf to map/ped)

plink --file dataset.dose --indep-pairwise 1345835 5 0.3 --out dataset.dose #(to get a list of SNPs R2 >= 0.3)

plink --file dataset.dose --extract plink.prune.in --make-bed --out pruneddata #(to perform the extraction of the SNPs and convert map/ped files to bed/bim/fam)

Diego

Tags: plink, gwas, vcf
17 months ago, Kevin Blighe wrote:

Just be careful because PLINK assumes that you want to exclude the variants with r-squared >0.3, i.e., the variants that are in linkage disequilibrium. As your intention is to include these variants, you should be extracting variants from the plink.prune.out file.

So, there are two possible ways to get what you want:

  • --extract plink.prune.out
  • --exclude plink.prune.in
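
As a rough sketch, those two options could be wrapped as below. The function names and the ld_variants output prefix are made up for illustration. One assumption worth flagging: the pruning step in the question was run with --out dataset.dose, so PLINK 1.9 should name the lists dataset.dose.prune.in and dataset.dose.prune.out rather than plink.prune.*; the helpers therefore build the list names from the prefix.

```shell
# Hypothetical helper: keep only the variants that LD pruning removed.
# $1 = prefix of the .ped/.map pair, also used as --out when pruning
# $2 = prefix for the new .bed/.bim/.fam files
keep_pruned_out() {
    plink --file "$1" --extract "$1.prune.out" --make-bed --out "$2"
}

# Hypothetical helper: the same selection, expressed as an exclusion.
drop_pruned_in() {
    plink --file "$1" --exclude "$1.prune.in" --make-bed --out "$2"
}

# Example call (needs PLINK 1.9 and the files from the question):
# keep_pruned_out dataset.dose ld_variants
```

Either helper should give the same variant set, so it is enough to run one of them.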

From the manual:

Variant pruning

--indep <window size>['kb'] <step size (variant ct)> <VIF threshold>

--indep-pairwise <window size>['kb'] <step size (variant ct)> <r^2 threshold>

--indep-pairphase <window size>['kb'] <step size (variant ct)> <r^2 threshold>

These commands produce a pruned subset of markers that are in approximate linkage equilibrium with each other, writing the IDs to plink.prune.in (and the IDs of all excluded variants to plink.prune.out).

[source: https://www.cog-genomics.org/plink/1.9/ld]
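
To make the relationship between the two lists concrete, here is a toy illustration with made-up variant IDs. No PLINK is involved and the file names are hypothetical. Assuming no other filters are applied, whatever is not in the .prune.in list is what ends up in .prune.out.

```shell
# Toy variant IDs standing in for the ID column of a .map/.bim file (made up).
printf '%s\n' rs1 rs2 rs3 rs4 rs5 > all_ids.txt

# Toy stand-in for a *.prune.in list: the variants that pruning kept (made up).
printf '%s\n' rs1 rs4 > toy.prune.in

# Every ID that is NOT an exact match for a line in toy.prune.in.
# -v inverts the match, -x matches whole lines, -F treats patterns as fixed strings.
grep -vxFf toy.prune.in all_ids.txt > toy.prune.out

cat toy.prune.out   # prints rs2, rs3 and rs5, one per line
```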

Also, your window size of 1345835 seems arbitrary.
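
For comparison, a sketch of the pruning step with an explicit, smaller window. Per the syntax quoted above, a window given without the 'kb' modifier is counted in variants. The values 50 and 5 are only a commonly used starting point, not a recommendation for this dataset; 0.3 is the threshold from the question. The helper name is hypothetical.

```shell
# Hypothetical helper around the pruning step.
# $1 = .ped/.map prefix (reused as the --out prefix)
# $2 = window size in variants
# $3 = step size in variants
# $4 = r^2 threshold
ld_prune() {
    plink --file "$1" --indep-pairwise "$2" "$3" "$4" --out "$1"
}

# Example call (needs PLINK 1.9 and the files from the question):
# ld_prune dataset.dose 50 5 0.3
```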

Kevin
