Question: Merging inconsistencies in PLINK
Asked 19 months ago by GabrielMontenegro (United Kingdom):

I am trying to merge two PLINK filesets that were genotyped on the same platform. However, the merge reports inconsistencies that the --flip option does not resolve. I checked the SNPs that still failed to merge after flipping and found this type of problem:

File 1:

4    rs10000432    68.93    47511781    T    C

File 2:

4    rs10000432    68.93    47511781    A    C

One of the alleles differs, while the other allele is the same in both files.

What should I do in this case? I wasn't expecting to find this type of issue in these datasets, since both were genotyped on the same platform.

Tags: snp, plink

Just an initial comment: for that particular SNP, the ancestral 'reference' allele is C (https://www.ncbi.nlm.nih.gov/projects/SNP/snp_ref.cgi?rs=10000432).

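Flipping the strand of T/C gives A/G, which still does not match A/C, so --flip cannot resolve this SNP: taken together, the two files show three alleles at the site. When you attempted to merge the first time, PLINK may have written a file with the extension *.missnp, which lists the variants it could not merge, including multi-allelic sites like this one. You can remove these variants from both datasets with the following commands: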

plink --noweb --bfile MyData1 --exclude MyData.missnp --make-bed --out MyData1.Pruned
plink --noweb --bfile MyData2 --exclude MyData.missnp --make-bed --out MyData2.Pruned

Then attempt to merge again.
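
For completeness, the prune-then-merge sequence could be wrapped up roughly as below. This is only a sketch: the function name merge_after_exclusion and the output prefix are hypothetical, and the three-file form of --bmerge (.bed, .bim, .fam) is used on the assumption that it is accepted by your PLINK version.

```shell
# Sketch only: the function name and file prefixes are illustrative, not part of PLINK.
PLINK="${PLINK:-plink}"   # set PLINK beforehand to use a specific binary

merge_after_exclusion() {
    # $1, $2: input fileset prefixes
    # $3:     SNP list written by the failed merge (the *.missnp file)
    # $4:     output prefix for the merged fileset
    # Each step runs only if the previous one succeeded.
    "$PLINK" --noweb --bfile "$1" --exclude "$3" --make-bed --out "$1.Pruned" &&
    "$PLINK" --noweb --bfile "$2" --exclude "$3" --make-bed --out "$2.Pruned" &&
    "$PLINK" --noweb --bfile "$1.Pruned" \
        --bmerge "$2.Pruned.bed" "$2.Pruned.bim" "$2.Pruned.fam" \
        --make-bed --out "$4"
}
```

With the file names from above, this would be called as: merge_after_exclusion MyData1 MyData2 MyData.missnp MyData.Merged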

If you would rather not remove these SNPs, then you may have to do more rigorous data preparation. In which format was your data initially: VCF? It would be useful to check every variant against a reference genome and ensure that the ref>alt allele order is consistent in both datasets.
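
As a first check before going back to a reference genome, the allele columns of the two .bim files can be compared directly. The following is a minimal sketch, not a PLINK feature: compare_bim_alleles is a hypothetical helper that assumes the standard six-column .bim layout (chromosome, SNP ID, cM, bp, allele 1, allele 2) and matches SNPs by ID only. It does not treat strand-ambiguous A/T and C/G SNPs or missing alleles coded as 0 specially.

```shell
# Sketch only: compare_bim_alleles is an illustrative helper, not part of PLINK.
compare_bim_alleles() {
    # $1, $2: .bim files to compare
    # Prints, for each SNP ID present in both files:
    #   ID  alleles-in-file-1  alleles-in-file-2  status
    # where status is same, flip (fixed by a strand flip) or mismatch.
    awk '
        function comp(b) {                      # complementary base
            if (b == "A") return "T"
            if (b == "T") return "A"
            if (b == "C") return "G"
            if (b == "G") return "C"
            return b                            # leave anything else untouched
        }
        function key(x, y) {                    # order-independent allele pair
            return (x < y) ? (x "/" y) : (y "/" x)
        }
        NR == FNR { first[$2] = key($5, $6); next }   # first file: store alleles per SNP ID
        ($2 in first) {                               # second file: compare shared SNPs
            here    = key($5, $6)
            flipped = key(comp($5), comp($6))
            if (first[$2] == here)         status = "same"
            else if (first[$2] == flipped) status = "flip"
            else                           status = "mismatch"
            print $2, first[$2], here, status
        }
    ' "$1" "$2"
}
```

For the SNP in the question (T/C in one file, A/C in the other) this reports mismatch, because neither the alleles as given nor their complements agree. Piping the output through awk '$4 == "mismatch" {print $1}' would give a list of SNP IDs that could be passed to --exclude.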

Written 19 months ago by Kevin Blighe (modified 19 months ago)