Question: merging GWAS data sets in Plink, error message: merge-equal-pos failure. Variants 'rs___a' and 'rs__b' have the same position, but do not share the same alleles
3.8 years ago, laurenleesc wrote:

Hi,

I am trying to merge two GWAS data sets in Plink.  Apparently there are multiple variant ids that share the same position but have different alleles.  The merge command then errors out.  I wish it would cycle through the remaining file and generate a whole list of these so that I could exclude them all at once.  Does anyone know how to code this?

Thanks!

-------------

C:\Python27\Scripts>plink --bfile AABC_Ziv_Shanghai2 --bmerge CIDR_chr1_cleaned --make-bed --out CIDR_chr1_AABC_Ziv_shanghai --merge-equal-pos
PLINK v1.90b3.27 64-bit (13 Dec 2015)      https://www.cog-genomics.org/plink2
(C) 2005-2015 Shaun Purcell, Christopher Chang   GNU General Public License v3
Logging to CIDR_chr1_AABC_Ziv_shanghai.log.
Options in effect:
  --bfile AABC_Ziv_Shanghai2
  --bmerge CIDR_chr1_cleaned
  --make-bed
  --merge-equal-pos
  --out CIDR_chr1_AABC_Ziv_shanghai

7944 MB RAM detected; reserving 3972 MB for main workspace.
6320 people loaded from AABC_Ziv_Shanghai2.fam.
4001 people to be merged from CIDR_chr1_cleaned.fam.
Of these, 4001 are new, while 0 are present in the base dataset.
1449016 markers loaded from AABC_Ziv_Shanghai2.bim.
176885 markers to be merged from CIDR_chr1_cleaned.bim.
Of these, 143496 are new, while 33389 are present in the base dataset.
Warning: Variants 'rs3094315' and '1:752566' have the same position.
Warning: Variants 'rs4040617' and 'kgp5225889' have the same position.
Warning: Variants 'rs28609852' and 'kgp3324955' have the same position.
Error: --merge-equal-pos failure.  Variants 'rs17026104' and 'kgp4275897' have
the same position, but do not share the same alleles.


Hi, could you find the solution and the reason for this problem? My first data set is the 1000G reference panel (as controls) and the second one is the cases. When I try to merge them with Plink, I get this error even after cleaning the data.

Could you tell me what the solution is? Thank you in advance.

written 3.0 years ago by fatima10
6 weeks ago, h.d.green wrote:

I have a very rough solution to this.

awk '{print $1, $4}' data.bim > bps

This gives a file with the chromosome and base-pair position of every variant.

sort bps | uniq -c | awk '$1 > 1 {print $3}' > dupbps

This gives the base-pair positions that appear more than once.

grep -wf dupbps data.bim | awk '{print $2}' > excbp

This outputs a list of SNPs to exclude.

It's not perfect, but it's the best I can get. I've been smashing my head against this lately.
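One reason the steps above are rough is that the grep step matches on base-pair position alone, so a variant on a different chromosome with the same position number is also excluded. A minimal sketch of a chromosome-aware variant follows. It compares the two .bim files directly and reports only pairs whose IDs differ and whose allele pairs differ, which is the case the error message describes. The function name is hypothetical, and the sketch assumes each file has at most one variant per position and does not account for strand flips.

```shell
#!/bin/sh
# Hypothetical helper (not part of PLINK): list_equal_pos_conflicts A.bim B.bim
# .bim columns: chromosome, variant ID, cM, bp position, allele 1, allele 2
list_equal_pos_conflicts() {
    awk '
    {
        # Key on chromosome AND position, not position alone.
        key = $1 ":" $4
        # Order-independent allele pair, so A/G and G/A compare equal.
        # Appending "" forces a string comparison.
        al = ($5 "" < $6 "") ? $5 "/" $6 : $6 "/" $5
    }
    NR == FNR {
        # First file: remember the ID and allele pair seen at each position.
        id[key] = $2
        alleles[key] = al
        next
    }
    (key in id) && id[key] != $2 && alleles[key] != al {
        # Second file: same position, different ID, different alleles.
        print id[key]
        print $2
    }
    ' "$1" "$2"
}
```

For the data sets in the question, this could be called as `list_equal_pos_conflicts AABC_Ziv_Shanghai2.bim CIDR_chr1_cleaned.bim > excbp`, and the resulting list passed to PLINK's `--exclude` on each data set before merging.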


Have you tried using plink 2.0's --set-all-var-ids flag on all datasets beforehand, instead of --merge-equal-pos?

written 6 weeks ago by chrchang523
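The `--set-all-var-ids` suggestion could look roughly like the sketch below, run once per data set before merging. The wrapper name and the `_ids` output suffix are assumptions made for illustration. The template `@:#:$r:$a` names each variant by chromosome, position, and alleles, so variants at the same position with different alleles get different IDs. This assumes plink2 is on the PATH and that both data sets designate REF/ALT consistently; if the alleles are swapped between data sets, the new IDs will not match.

```shell
#!/bin/sh
# Hypothetical wrapper: set_positional_ids <bfile-prefix>
# Writes <bfile-prefix>_ids.bed/.bim/.fam (the "_ids" suffix is an assumption).
set_positional_ids() {
    # In the template, '@' is the chromosome code, '#' the base-pair position,
    # and $r / $a the REF / ALT alleles. Single quotes keep the shell from
    # expanding $r and $a itself.
    plink2 --bfile "$1" \
           --set-all-var-ids '@:#:$r:$a' \
           --make-bed \
           --out "${1}_ids"
}
```

After running `set_positional_ids AABC_Ziv_Shanghai2` and `set_positional_ids CIDR_chr1_cleaned`, the renamed filesets could be merged with the original `--bmerge` command, leaving out `--merge-equal-pos`.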