Merging multiple PLINK files
11 months ago
Nejla ▴ 20

Hello, I've been trying to merge multiple PLINK filesets (.bed/.bim/.fam). I have 6 datasets that were genotyped on different genotyping arrays. When I try to merge them using the command

plink --bfile dataset1 --merge-list all_lists.txt --make-bed --out merged_data

I get multiple warnings and an error:

`Warning: Multiple positions seen for variant 'rs6687776'.
Warning: Multiple positions seen for variant 'rs2887286'.
Warning: Multiple positions seen for variant 'rs3813199'.
Warning: Multiple chromosomes seen for variant 'rs10128688'.
Warning: Multiple chromosomes seen for variant 'rs10106770'.
Warning: Multiple chromosomes seen for variant 'rs2097173'.
Warning: Multiple chromosomes seen for variant 'rs10064939'.
Warning: Multiple chromosomes seen for variant 'rs10059910'.
Warning: Multiple chromosomes seen for variant 'rs11857958'.
Warning: Multiple chromosomes seen for variant 'rs11757628'.
Warning: Multiple chromosomes seen for variant 'rs11162247'.
Warning: Multiple chromosomes seen for variant 'rs13074336'.
Warning: Multiple chromosomes seen for variant 'rs2371122'.
Warning: Multiple chromosomes seen for variant 'rs41431048'.
Warning: Multiple chromosomes seen for variant 'rs13092372'.
Warning: Multiple chromosomes seen for variant 'rs13151824'.
Warning: Multiple chromosomes seen for variant 'rs2187291'.
Warning: Multiple chromosomes seen for variant 'rs13413435'.
Warning: Multiple chromosomes seen for variant 'rs11025370'.
Warning: Multiple chromosomes seen for variant 'rs9798668'.
Warning: Multiple chromosomes seen for variant 'rs2569201'.
Warning: Multiple chromosomes seen for variant 'rs12043679'.
937794 more multiple-position warnings: see log file.
Error: 126705 variants with 3+ alleles present.
* If you believe this is due to strand inconsistency, try --flip with
  test_merge-merge.missnp.
  (Warning: if this seems to work, strand errors involving SNPs with A/T or C/G
  alleles probably remain in your data.  If LD between nearby SNPs is high,
  --flip-scan should detect them.)
* If you are dealing with genuine multiallelic variants, we recommend exporting
  that subset of the data to VCF (via e.g. '--recode vcf'), merging with
  another tool/script, and then importing the result; PLINK is not yet suited
  to handling them.
See https://www.cog-genomics.org/plink/1.9/data#merge3 for more discussion.`

So I flipped all the datasets, then excluded all the SNPs causing the errors and tried the merge again. It worked; however, I ended up with a genotyping rate of 0.2, which is very low. Does anyone know how I can merge all the datasets while keeping the maximum number of SNPs and a high genotyping rate? Thank you!
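A possible reason for the low genotyping rate: a PLINK 1.9 merge keeps the union of variants by default, so a SNP present on only one array is missing for every sample from the other arrays. One option is to restrict each dataset to the variants shared by all of them before merging. The following is a minimal sketch under that assumption, not a tested pipeline; the function name `shared_variants` and the file names in the comments are hypothetical, and it matches variants by ID only (column 2 of each .bim file).

```shell
#!/bin/sh
# shared_variants: print the variant IDs (column 2 of a .bim file) that
# occur in every .bim file passed as an argument.
shared_variants() {
    awk -v n="$#" '
        # Count each ID at most once per file, so a duplicate ID inside
        # one file cannot inflate the count.
        !seen[FILENAME, $2]++ { count[$2]++ }
        # Keep only IDs that were seen in all n files.
        END { for (id in count) if (count[id] == n) print id }
    ' "$@" | sort
}

# Hypothetical usage (file names are placeholders, not from the post):
#   shared_variants dataset1.bim dataset2.bim > shared_snps.txt
#   plink --bfile dataset1 --extract shared_snps.txt --make-bed --out dataset1_shared
```

Each dataset would be filtered with `--extract` in the same way and the merge then run on the filtered filesets. This trades SNP count for genotyping rate; if the arrays overlap only slightly, the shared set may be small.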

plink merging • 792 views
24 days ago

Hello Nejla! I am having the same problem. Did you solve it? If you did, could you tell me how? That would be great!
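For anyone investigating the "Multiple positions seen" and "Multiple chromosomes seen" warnings, it can help to list the variant IDs whose coordinates disagree between the .bim files before deciding what to do with them. The sketch below is one way to do that with awk; the function name `conflicting_variants` and the file names in the comments are hypothetical. It compares chromosome codes as plain strings, so "X" and "23" would be reported as a conflict even though PLINK treats them as the same chromosome.

```shell
#!/bin/sh
# conflicting_variants: print variant IDs whose chromosome or base-pair
# position differs between the given .bim files.
# .bim columns: chromosome, variant ID, cM position, bp position, A1, A2.
conflicting_variants() {
    awk '
        {
            coord = $1 ":" $4
            if (!($2 in first)) first[$2] = coord      # first coordinates seen for this ID
            else if (first[$2] != coord) bad[$2] = 1   # a later record disagrees
        }
        END { for (id in bad) print id }
    ' "$@" | sort
}

# Hypothetical usage (file names are placeholders, not from the post):
#   conflicting_variants dataset1.bim dataset2.bim > conflicts.txt
#   plink --bfile dataset1 --exclude conflicts.txt --make-bed --out dataset1_clean
```

A very large number of conflicts, as in the log above, may indicate that the datasets use different genome builds or different naming schemes for the same rsID. In that case, bringing all datasets onto one build (for example with `--update-chr` and `--update-map`) is likely to retain more SNPs than excluding every conflicting variant.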
