How do I compute ld blocks from the hapmap ld_data?
2
0
Entering edit mode
18 months ago
endrebak ▴ 870

When people ask for ld blocks on this forum, this link is often posted:

ftp://ftp.ncbi.nlm.nih.gov/hapmap/ld_data/latest

However, those files merely contain the correlations between snps. How do I use these to compute ld-blocks?

Would be nice if you could post an example script or invocation of a tool, not merely post a link :)

Thank you!

ld snps • 700 views
ADD COMMENT
5
Entering edit mode
18 months ago

The contents of those files are explained in the README; however, I'm not sure how well supported is that format specification (if it is even defined as a format specification).

I have an imputed 1000 Genomes Phase III dataset on my computer that I utilise for various projects —including LD analyses— when needed. If you follow this tutorial, you'll also have this imputed dataset: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2

Once you have all of those files, generating a LD heatmap for any region is as simple as this, taken from my notes:

1, extract variants overlapping the region of interest +/- 1Kbp from the 1000 Genomes Phase III data

bcftools view chr3.1kg.phase3.v5.vcf.gz -R Gene_hg19.bed | \
  bcftools annotate -Ob -I +'%CHROM:%POS:%REF:%ALT' > Region.bcf ;
bcftools index Region.bcf ;

Here, the co-ordinates of the region of interest are stored in the BED file. I also set the VCF ID field to CHROM:POS:REF:ALT if it's not already set

2, extract different populations

bcftools view -Ob --samples-file EAS.list Region.bcf > EAS.bcf ;
bcftools index EAS.bcf ;

Here, EAS.list comprises a list of sample IDs representing the East Asian (EAS) super population. I identified these from 20130606_g1k.ped

3, convert the BCF file to PLINK format (requires PLINK >= 1.9)

plink --noweb --bcf EAS.bcf \
  --keep-allele-order \
  --vcf-idspace-to _ \
  --const-fid \
  --allow-extra-chr 0 \
  --split-x b37 no-fail \
  --make-bed \
  --out EAS ;

4, export data to HaploView format

plink --noweb --bfile EAS \
  --chr 3 --from-bp 135364584 --to-bp 135399507 \
  --snps-only no-DI \
  --recodeHV \
  --out EAS_Haploview ;

5, determine haplotype blocks in HaploView

I have written the following notes:

Start HaploView. In the ‘Welcome to HaploView’ window, select the ‘LINKAGE Format’ tab. Then, click on the ‘browse’ button and select your PED file as ‘Data File’ - this is the PED file that you generated from PLINK.

Click ‘OK’.

Select the ‘LD Plot’ tab to generate a LD heatmap. To save the plot as a Portable Network Graphics (PNG) file, go to File > Export current tab to PNG

6, LD heatmap in R

Then, you can read that HaploView-exported file back into R to mess around with it:

require(png)

par(mar=c(0,0,0,0), mfrow=c(4,1))

haploblocks <- readPNG("EAS_Haploblocks.png")
plot(1:2, type='n', main="", xlab="", ylab="", xaxt="n", yaxt="n", axes=FALSE)
lim <- par()
rasterImage(haploblocks, lim$usr[1], lim$usr[3], lim$usr[2], lim$usr[4]-0.2)

require(SDMTools)
require(RColorBrewer)
points <- cbind(x=c(1.925, 1.96, 1.96, 1.925), y=c(1.5, 1.5, 1.1, 1.1))
legend.gradient(points, cols=brewer.pal(n=9, name="Greys"), title=bquote(R^2), limits=c(0,1))

dddd

NB - the gene track I added also via R, and also via readPNG(). I previously downloaded it from UCSC and saved it as a separate PNG

Kevin

ADD COMMENT
0
Entering edit mode

I thank you for your reply. I should have been more explicit; I wanted bed files with the ld-blocks to use in my analyses.

ADD REPLY
0
Entering edit mode

Then, first, you have to identify the LD blocks, which are calculated at the SNP level and not the base-pair level, per se

ADD REPLY
0
Entering edit mode

Ah, I had no idea. That makes sense :)

ADD REPLY
2
0
Entering edit mode

Great - thank you.

ADD REPLY

Login before adding your answer.

Traffic: 2600 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6