Question: Merging inconsistencies in PLINK
2.5 years ago, GabrielMontenegro (United Kingdom) wrote:

I am trying to merge two different PLINK files genotyped on the same platform. However, I found some inconsistencies that could not be resolved with the --flip option. I checked the SNPs that still could not be merged after flipping and found this type of problem:

File 1:

4    rs10000432    68.93    47511781    T    C

File 2:

4    rs10000432    68.93    47511781    A    C

One of the alleles is different, while the other allele is the same in both files.

What should I do in this case? I wasn't expecting to find this type of issue in these datasets, since both were genotyped on the same platform.


Just an initial comment: for that particular SNP, the ancestral 'reference' allele is C (https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=10000432).

When you first attempted to merge, PLINK may have output a file with the extension *.missnp, which would list these multi-allelic sites. You can remove them from both datasets with the following commands:

plink --noweb --bfile MyData1 --exclude MyData.missnp --make-bed --out MyData1.Pruned ;
plink --noweb --bfile MyData2 --exclude MyData.missnp --make-bed --out MyData2.Pruned ;

Then attempt to merge again.
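
Before re-merging, it may be worth confirming that none of the listed variants survived the exclusion. The following is only a sketch: it assumes the .missnp file holds one variant ID per line and that the pruned .bim has the ID in column 2. The toy files and the IDs rs0000001 and rs0000002 are made up for illustration; point the awk command at your own files instead.

```shell
# Work in a scratch directory so no real data is overwritten.
tmp=$(mktemp -d)

# Toy stand-ins for MyData.missnp and MyData1.Pruned.bim (hypothetical content).
printf 'rs10000432\n' > "$tmp/MyData.missnp"
printf '4\trs0000001\t0\t1000\tA\tG\n4\trs0000002\t0\t2000\tT\tC\n' > "$tmp/MyData1.Pruned.bim"

# First file: remember every ID to exclude.
# Second file: count .bim rows whose ID (column 2) is still on that list.
remaining=$(awk 'NR==FNR { bad[$1]; next }
                 ($2 in bad) { n++ }
                 END { print n + 0 }' "$tmp/MyData.missnp" "$tmp/MyData1.Pruned.bim")

echo "variants still to exclude: $remaining"
```

A count of 0 for both pruned datasets suggests the exclusion step did what was intended.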

If you would rather not remove these, then you may have to do more rigorous data preparation. In which format was your data initially: VCF? It would be useful to check every genotype against a reference genome and ensure that the ref>alt order is maintained.
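
As a first step in that preparation, the two .bim files can be compared directly to see which mismatches a strand flip could explain and which it could not. This is a rough sketch rather than a PLINK feature: the function name classify and the file names file1.bim and file2.bim are hypothetical, and the two input lines are the ones quoted in the question.

```shell
# Work in a scratch directory so no real data is overwritten.
tmp=$(mktemp -d)

# The two .bim lines from the question (tab-separated).
printf '4\trs10000432\t68.93\t47511781\tT\tC\n' > "$tmp/file1.bim"
printf '4\trs10000432\t68.93\t47511781\tA\tC\n' > "$tmp/file2.bim"

# classify FILE1 FILE2: for every variant ID present in both .bim files,
# report whether the allele pairs match, match after a strand flip, or neither.
classify() {
  awk '
    # Complement of a single base; anything else (e.g. 0) is returned unchanged.
    function comp(b) {
      if (b == "A") return "T"
      if (b == "T") return "A"
      if (b == "C") return "G"
      if (b == "G") return "C"
      return b
    }
    # True if the pair p/q equals the pair r/s in either order.
    function same(p, q, r, s) {
      return (p == r && q == s) || (p == s && q == r)
    }
    # First file: store both alleles keyed by variant ID (column 2).
    NR == FNR { al1[$2] = $5; al2[$2] = $6; next }
    # Second file: compare against the stored alleles.
    ($2 in al1) {
      x = al1[$2]; y = al2[$2]
      if (same(x, y, $5, $6))                   status = "match"
      else if (same(comp(x), comp(y), $5, $6))  status = "strand_flip"
      else                                      status = "unresolved"
      print $2, x "/" y, $5 "/" $6, status
    }' "$1" "$2"
}

result=$(classify "$tmp/file1.bim" "$tmp/file2.bim")
echo "$result"
```

For rs10000432 this reports "unresolved": flipping T/C gives A/G, which still does not match A/C, so --flip alone cannot fix it. Note that A/T and C/G SNPs look identical after flipping, so a check like this cannot determine their strand from the alleles alone.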

written 2.5 years ago by Kevin Blighe