Question: Duplicate ID in bed file
janhuang.cn wrote, 2.8 years ago:

I am using PLINK v1.90b3s 64-bit (17 Jun 2015) to generate an LD matrix from a 1000 Genomes (1000G) VCF file for a long list of SNPs.

I use this command to convert the VCF to a PLINK binary fileset (bed/bim/fam):

plink --vcf ALL.chr22.phase3_shapeit2_mvncall_integrated_v5a.20130502.genotypes.vcf.gz --make-bed --out binary_fileset

I then use this command to generate the LD statistic reports:

plink --r2 --bfile binary_fileset --ld-snp-list snp_chr22_sample.txt --ld-window-r2 0.8

But it returns the following error:

Error: Duplicate ID 'rs10656307'.

The SNP list file does not contain this SNP, so I think the binary fileset itself contains duplicated records of rs10656307. Is there a way to remove duplicated SNPs from the bed file?

Sergei Slavskii (Skoltech) wrote, 22 months ago:

I had the same problem. Solved in two steps:

1) Got all the duplicated ids from the bim file:

cut -f 2 ALL.chr1.bim | sort | uniq -d > 1.dups

2) Excluded these ids from the bfile:

plink --bfile ALL.chr1 --exclude 1.dups --make-bed --out ALL.filt.chr1

With these new filtered files there were no errors while generating the LD reports.
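For anyone who wants to see what step 1 puts in the exclusion list, here is a small sketch on a toy .bim file. The file name demo.bim and the variant IDs in it are made up for illustration; a real .bim has the same six tab-separated columns (chromosome, variant ID, cM position, bp position, allele 1, allele 2). Keep in mind that --exclude drops every variant whose ID is in the list, so all copies of a duplicated ID are removed, not only the extra ones.

```shell
# Toy .bim (hypothetical data): rsB appears twice, at two different positions.
workdir=$(mktemp -d)
printf '22\trsA\t0\t100\tA\tG\n22\trsB\t0\t200\tC\tT\n22\trsB\t0\t250\tCT\tC\n22\trsC\t0\t300\tG\tA\n' > "$workdir/demo.bim"

# Column 2 holds the variant ID; uniq -d prints each repeated ID once.
cut -f 2 "$workdir/demo.bim" | sort | uniq -d > "$workdir/demo.dups"
cat "$workdir/demo.dups"

# Count the .bim rows whose ID is in the list, i.e. the variants that
# an ID-based exclusion would drop (every copy of each listed ID).
n_dropped=$(awk -F '\t' '
  FILENAME == ARGV[1] { dup[$1] = 1; next }   # first file: the duplicate IDs
  ($2 in dup)         { n++ }                 # second file: matching .bim rows
  END                 { print n + 0 }
' "$workdir/demo.dups" "$workdir/demo.bim")
echo "variants that would be excluded: $n_dropped"
```

In this toy case the list contains only rsB, and both rsB rows would be dropped. If losing every copy matters for your analysis, check the count before running the real --exclude step.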


Thanks, this worked for me

Reply written 7 weeks ago by morlacdestructo
Floris Brenk (USA) wrote, 2.8 years ago:

Using --list-duplicate-vars you can identify the duplicates in the data (see the PLINK website section on identifying duplicates).

And using --exclude you can remove your SNPs (see the PLINK website section on removing SNPs).


I used --list-duplicate-vars to generate a list of duplicated SNPs, but it does not contain the one reported as a duplicate in the analysis.

Reply written 2.8 years ago by janhuang.cn

Right, I think --list-duplicate-vars only finds SNPs duplicated by position.

Reply written 16 months ago by Phoenix Mu
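If, as suggested above, the duplicate report is position-based, a repeated ID that sits at two different positions would not show up in it. One way to check this directly is to summarise the .bim by ID. The sketch below uses a made-up demo.bim (hypothetical IDs and positions) and prints, for each repeated ID, how many rows carry it and how many distinct bp positions those rows cover. It only inspects the .bim and does not reproduce what --list-duplicate-vars does internally.

```shell
# Toy .bim (hypothetical data):
#   rsB: same ID at two different positions
#   rsD: same ID twice at the same position
workdir=$(mktemp -d)
printf '22\trsA\t0\t100\tA\tG\n22\trsB\t0\t200\tC\tT\n22\trsB\t0\t250\tCT\tC\n22\trsD\t0\t400\tG\tA\n22\trsD\t0\t400\tG\tA\n' > "$workdir/demo.bim"

# Output columns: ID, number of rows with that ID, number of distinct positions.
awk -F '\t' '
  {
    n[$2]++                          # rows per ID (column 2)
    key = $2 SUBSEP $4               # ID + bp position (column 4)
    if (!(key in seen)) { seen[key] = 1; npos[$2]++ }
  }
  END {
    for (id in n)
      if (n[id] > 1) print id "\t" n[id] "\t" npos[id]
  }
' "$workdir/demo.bim" | sort > "$workdir/dup_summary.txt"

cat "$workdir/dup_summary.txt"
```

An ID whose last column is greater than 1 (rsB here) is repeated across different positions, which would be consistent with it triggering the "Duplicate ID" error while being absent from a position-based duplicate list.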