How do I compute ld blocks from the hapmap ld_data?
Entering edit mode
18 months ago
endrebak ▴ 870

When people ask for ld blocks on this forum, this link is often posted:

However, those files merely contain the correlations between snps. How do I use these to compute ld-blocks?

Would be nice if you could post an example script or invocation of a tool, not merely post a link :)

Thank you!

ld snps • 700 views
Entering edit mode
18 months ago

The contents of those files are explained in the README; however, I'm not sure how well supported is that format specification (if it is even defined as a format specification).

I have an imputed 1000 Genomes Phase III dataset on my computer that I utilise for various projects —including LD analyses— when needed. If you follow this tutorial, you'll also have this imputed dataset: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2

Once you have all of those files, generating a LD heatmap for any region is as simple as this, taken from my notes:

1, extract variants overlapping the region of interest +/- 1Kbp from the 1000 Genomes Phase III data

bcftools view chr3.1kg.phase3.v5.vcf.gz -R Gene_hg19.bed | \
  bcftools annotate -Ob -I +'%CHROM:%POS:%REF:%ALT' > Region.bcf ;
bcftools index Region.bcf ;

Here, the co-ordinates of the region of interest are stored in the BED file. I also set the VCF ID field to CHROM:POS:REF:ALT if it's not already set

2, extract different populations

bcftools view -Ob --samples-file EAS.list Region.bcf > EAS.bcf ;
bcftools index EAS.bcf ;

Here, EAS.list comprises a list of sample IDs representing the East Asian (EAS) super population. I identified these from 20130606_g1k.ped

3, convert the BCF file to PLINK format (requires PLINK >= 1.9)

plink --noweb --bcf EAS.bcf \
  --keep-allele-order \
  --vcf-idspace-to _ \
  --const-fid \
  --allow-extra-chr 0 \
  --split-x b37 no-fail \
  --make-bed \
  --out EAS ;

4, export data to HaploView format

plink --noweb --bfile EAS \
  --chr 3 --from-bp 135364584 --to-bp 135399507 \
  --snps-only no-DI \
  --recodeHV \
  --out EAS_Haploview ;

5, determine haplotype blocks in HaploView

I have written the following notes:

Start HaploView. In the ‘Welcome to HaploView’ window, select the ‘LINKAGE Format’ tab. Then, click on the ‘browse’ button and select your PED file as ‘Data File’ - this is the PED file that you generated from PLINK.

Click ‘OK’.

Select the ‘LD Plot’ tab to generate a LD heatmap. To save the plot as a Portable Network Graphics (PNG) file, go to File > Export current tab to PNG

6, LD heatmap in R

Then, you can read that HaploView-exported file back into R to mess around with it:


par(mar=c(0,0,0,0), mfrow=c(4,1))

haploblocks <- readPNG("EAS_Haploblocks.png")
plot(1:2, type='n', main="", xlab="", ylab="", xaxt="n", yaxt="n", axes=FALSE)
lim <- par()
rasterImage(haploblocks, lim$usr[1], lim$usr[3], lim$usr[2], lim$usr[4]-0.2)

points <- cbind(x=c(1.925, 1.96, 1.96, 1.925), y=c(1.5, 1.5, 1.1, 1.1))
legend.gradient(points, cols=brewer.pal(n=9, name="Greys"), title=bquote(R^2), limits=c(0,1))


NB - the gene track I added also via R, and also via readPNG(). I previously downloaded it from UCSC and saved it as a separate PNG


Entering edit mode

I thank you for your reply. I should have been more explicit; I wanted bed files with the ld-blocks to use in my analyses.

Entering edit mode

Then, first, you have to identify the LD blocks, which are calculated at the SNP level and not the base-pair level, per se

Entering edit mode

Ah, I had no idea. That makes sense :)

Entering edit mode

Great - thank you.


Login before adding your answer.

Traffic: 2600 users visited in the last hour
Help About
Access RSS

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6