SNPs in high LD after LD pruning
5.7 years ago
Ana ▴ 200

Hello all,

I have a question; apologies in advance if it is a simple one, but I am slightly puzzled. I am trying to use an LD-pruned set of SNPs for my population genomics analysis. I am using SNPRelate and reading the VCF file directly into the package. In the species I am working with, LD decays quite rapidly, with a big drop-off within the first kb and a plateau nearly reached after 10 kb. When I first ran SNPRelate to generate an LD-pruned set of SNPs, I set the parameters like this:

snpset_pruned <- snpgdsLDpruning(genofile, ld.threshold = 0.2, win.size = 50000, maf = 0.03, autosome.only = FALSE)

When I got the LD-pruned set and estimated LD, I still found many SNPs with high correlation values. I do not understand why this is happening, considering that LD decays rapidly. I contacted the SNPRelate developer, and he suggested that I increase the size of the sliding window, so I changed my code to this:

snpset_pruned <- snpgdsLDpruning(genofile, ld.threshold = 0.2, win.size = 2000000, maf = 0.03, autosome.only = FALSE)

I am even more puzzled now because, although the window is quite large, I still find some SNPs with high correlation values.

Could someone tell me why this is happening (long-range LD?), or whether I am misunderstanding something here? Thanks

LD SNPrelate • 2.2k views
5.7 years ago

You should consider setting the value of the method argument, which can be one of:

  • composite
  • r
  • dprime
  • corr

You will then have greater control over the other parameters, namely ld.threshold=0.2, which is applied to whichever LD measure you select. Try modifying it to see how it affects the results.
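As a rough sketch of that suggestion, you could set the method explicitly and then check the retained SNPs with the same LD measure. This assumes a recent SNPRelate and uses its bundled example data in place of your own GDS file. The window argument name slide.max.bp is taken from the current SNPRelate documentation and may differ in your installed version. snpset[[1]] simply picks the first chromosome to keep the LD matrix small.

```r
library(SNPRelate)

# Bundled example data, used here only so the sketch runs on its own;
# substitute your own GDS file
genofile <- snpgdsOpen(snpgdsExampleFileName())

# Prune with an explicitly chosen LD measure.
# slide.max.bp (window size in base pairs) is assumed from the current docs.
set.seed(1000)  # the starting SNP is chosen at random by default
snpset <- snpgdsLDpruning(genofile, method = "corr", ld.threshold = 0.2,
                          slide.max.bp = 500000, maf = 0.03,
                          autosome.only = FALSE)

# Re-estimate LD among the retained SNPs of one chromosome,
# using the same measure that was used for pruning
ids <- snpset[[1]]
ld <- snpgdsLDMat(genofile, snp.id = ids, method = "corr", slide = -1)$LD

# Largest absolute correlation among retained pairs, at any distance
max(abs(ld[upper.tri(ld)]), na.rm = TRUE)

snpgdsClose(genofile)
```

Pruning only compares SNPs that fall inside the same sliding window, whereas this check covers every pair on the chromosome. If you estimate LD afterwards with a different measure (for example r squared from another tool), the values are also not directly comparable to the threshold used for pruning.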

Also note the settings that I use for LD pruning here in Step #7: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2


