Question: Duplicate ID in bed file
6 months ago, janhuang.cn wrote:

I am using PLINK v1.90b3s 64-bit (17 Jun 2015) to generate an LD matrix from a 1000G VCF file for a long list of SNPs.

I use this command to convert the VCF to a bed file:

plink --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --make-bed --out binary_fileset

I then use this command to generate the LD statistics report:

plink --r2 --bfile binary_fileset --ld-snp-list snp_chr22_sample.txt --ld-window-r2 0.8

But it returns the following error:

Error: Duplicate ID 'rs10656307'.

The SNP list file does not contain this SNP, so I think the bed file contains a duplicated record for rs10656307. Is there a way to remove duplicated SNPs from the bed file?

6 months ago, Floris Brenk wrote:

Using --list-duplicate-vars you can identify the duplicates in the data (see the PLINK website section on identifying duplicates).

And using --exclude you can remove those SNPs (see the PLINK website section on removing SNPs).


I used --list-duplicate-vars to generate a list of duplicated SNPs, but it does not contain the one reported as a duplicate in the analysis.

(reply written 6 months ago by janhuang.cn)
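
A possible explanation, offered as an assumption rather than something confirmed in this thread: --list-duplicate-vars groups variants by position and allele codes, not by ID, so two records that share the ID rs10656307 but differ in position or alleles would not show up in its report. A minimal sketch for listing repeated IDs directly from the .bim file (the variant ID is column 2) is given below. The function name list_dup_ids and the file names dup_ids.txt and binary_fileset_nodup are hypothetical, chosen only for illustration.

```shell
#!/usr/bin/env bash
# Print every variant ID that occurs more than once in a PLINK .bim file.
# Column 2 of a .bim file holds the variant ID.
list_dup_ids() {
    awk '{ print $2 }' "$1" | sort | uniq -d
}

# Example use, run only when the fileset from the question and plink are present.
if [ -f binary_fileset.bim ] && command -v plink >/dev/null 2>&1; then
    list_dup_ids binary_fileset.bim > dup_ids.txt
    # --exclude drops every variant whose ID is listed,
    # i.e. all copies of each duplicated ID, not just the extra ones.
    plink --bfile binary_fileset --exclude dup_ids.txt --make-bed --out binary_fileset_nodup
fi
```

The --r2 command from the question could then be pointed at binary_fileset_nodup instead of binary_fileset. Note that this approach removes every record carrying a repeated ID, and that a placeholder ID such as '.' would also be listed if it occurs more than once, so it is worth inspecting dup_ids.txt before excluding anything.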