Question

SNPs in LD-independent loci

1

9.4 years ago

jpsangio ▴ 10

I would like to obtain an accurate estimate of the number of SNPs in my SNP set that occupy LD-independent loci.

I am using a SNP set culled from a large GWA data set (gene-resident variants were selected based on an a priori hypothesis) to examine SNP-trait associations. We feel that the 5E-8 alpha level is too stringent in this case, as we are running association tests on ~13,000 SNPs. We'd like to get an idea of how many, and which, of these are in nearly complete LD (r2 >= 0.80 in CEU).

THE CHALLENGE

I have access to a list of SNPs from a public source with values for marker ID & p-value.
I DO NOT have access to genotype values for individuals (cannot compute linkage structure).

Essentially we want to select SNPs representative of LD-independent loci and compute Q-values on the SNP-trait tests. Is there a way to take my SNP set marker ID values and obtain information on LD structure in my sample using a proxy sample? The CEU or CEU+TSI would be appropriate as a reference.
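For the Q-value step, here is a minimal sketch, assuming the LD-independent SNPs have already been selected and that Benjamini-Hochberg adjusted p-values are an acceptable stand-in for Q-values (they match Storey's q-values when the proportion of true nulls is taken to be 1, so they are on the conservative side). The function name `bh_adjust` and the example p-values are made up for illustration.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the tests, sorted from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values for four retained SNPs (not real data).
example_p = [0.01, 0.04, 0.03, 0.5]
example_q = bh_adjust(example_p)
```

Note that `m` here is the number of retained SNPs, not the original ~13,000, which is where the pruning reduces the multiple-testing burden.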

SNP GWAS LD • 5.1k views

updated 2.1 years ago by Ram 43k • written 9.4 years ago by jpsangio ▴ 10

0

I would suggest you look into haplotype analysis. LD structure is only present among SNPs that are relatively close to one another. You could phase your samples and then derive haplotypes from phased reference data such as 1000genomes. Then you will be able to assign haplotypes to your samples and test for association between those haplotypes and your trait of interest. This approach would account for LD structure. Depending on the data, the number of haplotypes can be a lot smaller than the number of SNPs you are testing, which would decrease your multiple testing penalty.

updated 2.1 years ago by Ram 43k • written 9.3 years ago by lkmklsmn ▴ 970

Ram · Answer 1 · 2014-12-15

0

9.3 years ago

Sean Davis 26k

You can "prune" your SNPs based on measures of LD. PLINK supports LD-based pruning (for example, its --indep-pairwise option).

I have also done pruning in R using readVcf for importing 1000genomes data for your SNPs of interest, snpMatrix for LD calculations, and a short script to recursively prune SNPs based on LD thresholds.
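The pruning idea can be sketched in a few lines. This is not the R script mentioned above, only a simple greedy variant written in plain Python, assuming genotypes for the SNPs of interest have already been extracted from a reference panel (e.g. 1000 Genomes CEU) as 0/1/2 allele counts per individual. The names `r_squared` and `prune`, the rs IDs, and the genotype values are all made up for illustration.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two vectors of 0/1/2 allele counts."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        # Monomorphic in the reference panel: r2 is undefined, treat as no LD.
        return 0.0
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def prune(genotypes, order, threshold=0.8):
    """Greedily keep each SNP unless its r2 with an already kept SNP reaches the threshold."""
    kept = []
    for snp in order:
        if all(r_squared(genotypes[snp], genotypes[k]) < threshold for k in kept):
            kept.append(snp)
    return kept


# Toy reference panel: six individuals, three hypothetical SNPs (not real data).
reference = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],  # identical to rs1, so r2 = 1
    "rs3": [2, 0, 1, 1, 2, 0],  # r2 with rs1 is 0.25
}
print(prune(reference, ["rs1", "rs2", "rs3"]))  # prints ['rs1', 'rs3']
```

The sketch compares every candidate against every kept SNP, which is fine for a toy example; for ~13,000 SNPs you would probably restrict comparisons to SNPs on the same chromosome within some distance window. The order of `order` decides which SNP represents each LD block.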

updated 2.1 years ago by Ram 43k • written 9.3 years ago by Sean Davis 26k