Question: Window-size in PLINK's indep-pairwise LD pruning
2.5 years ago, solion wrote:

I am pruning datasets of varying SNP density using PLINK --indep-pairwise and comparing different r2 cut-offs. The density ranges from that of the 1000 Genomes phase 3 data (e.g. >6 million SNPs on chr 2) to that of SNP-array data (20,000 SNPs on chr 2). While doing so, I want to keep the other parameters (window size and step size) constant.

My current parameter choices are: --indep-pairwise 10000 1000 [r2-cut-off, which varies in a range from 0.5-0.95]

Is there a downside to choosing a large window-size like 10000 on less dense data or a high r2-cut-off other than run-time?

Tags: pruning, snp, plink, genome
2.5 years ago, Kevin Blighe wrote:

I would first filter both datasets down to the variants they share and then do the pruning. Otherwise, my feeling is that your results would be biased because the SNP densities are different. This is possible in PLINK by first outputting the variant IDs for one dataset as a list, and then using this list to filter the other (and vice versa).
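As a rough sketch of that filtering step, the Python below intersects the variant IDs of two PLINK .bim files and builds the `--extract` command line for each dataset. The file names (`arrayData`, `kgChr2`, `shared.txt`) and the helper functions are hypothetical, and it assumes both datasets use the same variant IDs (e.g. rsIDs); if they do not, matching on chromosome and position would be needed instead.

```python
def read_bim_ids(bim_path):
    """Return the set of variant IDs (second column) in a PLINK .bim file."""
    ids = set()
    with open(bim_path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2:
                ids.add(fields[1])
    return ids


def write_shared_ids(bim_a, bim_b, out_path):
    """Write the IDs present in both .bim files, one per line, and return them."""
    shared = sorted(read_bim_ids(bim_a) & read_bim_ids(bim_b))
    with open(out_path, "w") as out:
        for variant_id in shared:
            out.write(variant_id + "\n")
    return shared


def plink_extract_cmd(bfile, id_list, out_prefix):
    """Build (but do not run) a PLINK command keeping only the listed variants."""
    return ["plink", "--bfile", bfile, "--extract", id_list,
            "--make-bed", "--out", out_prefix]


# Hypothetical usage: print the two commands to run once shared.txt exists.
for prefix in ("arrayData", "kgChr2"):
    print(" ".join(plink_extract_cmd(prefix, "shared.txt", prefix + ".shared")))
```

Each reduced dataset can then be pruned with the same --indep-pairwise settings, so that both are compared at the same SNP density.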

Such a large window size is not necessary. The window size is specified as a number of SNPs. Typically, a window size of just 50 (i.e., 50 SNPs) is chosen. Choosing 10 000 could well crash your system (?). With a window of 10 000, LD is calculated on a pairwise basis between all 10 000 SNPs, which is roughly 50 000 000 unique pairs (10 000 × 9 999 / 2), and this is repeated thousands of times as the algorithm moves across each chromosome's SNPs.
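To put numbers on the window-size argument: the count of unique pairs among n SNPs is n(n-1)/2, and the window is moved forward by the step size until the chromosome is covered. The sketch below works this out for the settings in the question. It is a naive count that assumes every pair in every window is evaluated from scratch; PLINK's actual implementation may avoid some of that work, so treat it as an illustration of scale rather than a run-time prediction. The function names are made up for this example.

```python
from math import comb


def pairs_per_window(window_size):
    """Number of unique SNP pairs inside one window of `window_size` SNPs."""
    return comb(window_size, 2)


def approx_windows(n_variants, window_size, step):
    """Approximate number of window positions needed to cover n_variants."""
    if n_variants <= window_size:
        return 1
    remaining = n_variants - window_size
    return 1 + -(-remaining // step)  # ceiling division


# Settings from the question versus the commonly used window of 50 SNPs.
print(pairs_per_window(10000))   # 49995000
print(pairs_per_window(50))      # 1225
# ~6 million SNPs on chr 2, window 10000, step 1000.
print(approx_windows(6_000_000, 10000, 1000))  # 5991
```

On the sparser array data (about 20 000 SNPs on chr 2), the same window of 10 000 covers half the chromosome's SNPs at once, so only a handful of window positions are needed.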

Take a look at my tutorial, in which I merge a sample dataset with the 1000 Genomes Phase III data: "Produce PCA bi-plot for 1000 Genomes Phase III in VCF format"

Kevin
