Question

Compare replicates to check their correlation

0

7.9 years ago

RandManP ▴ 10

I have 9 ChIP-seq datasets (3 samples, each with 3 biological replicates), and I want to check their correlation. If I bin the genome at some size (e.g. 1000 bp) and count the number of reads per bin, should I normalize before computing the correlation, or is normalization unnecessary?

ChIP-Seq • 5.6k views

updated 7.9 years ago by ivivek_ngs ★ 5.2k • written 7.9 years ago by RandManP ▴ 10

score 1 · Answer 1 · 2016-06-06

1

7.9 years ago

ivivek_ngs ★ 5.2k

deepTools does exactly what you are intending to do: it can compute the correlation between multiple BAM files across samples and replicates and plot it as a correlation heatmap. Check the link here, under "Correlation between BAM files (multiBamSummary and plotCorrelation)".

7.9 years ago by ivivek_ngs ★ 5.2k

0

I have checked it, but I did not understand whether it normalizes first or not.

7.9 years ago by RandManP ▴ 10

0

The deepTools modules bamCompare and bamCoverage not only allow for simple conversion of BAM to bigWig (or bedGraph for that matter), but also for normalization, such that different samples can be compared despite differences in their sequencing depth.
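To make the idea of depth normalization concrete, here is a minimal sketch of one common scheme, counts per million (CPM), applied to per-bin read counts. It is not a description of what deepTools does internally; the function name `counts_per_million` and the read counts are hypothetical, chosen so that one replicate has exactly twice the depth of the other.

```python
def counts_per_million(bin_counts):
    """Scale raw per-bin read counts so that the sample sums to one million."""
    total = sum(bin_counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return [c * 1_000_000 / total for c in bin_counts]


# Hypothetical read counts in five 1000 bp bins for two replicates.
# rep_b was "sequenced" at exactly twice the depth of rep_a.
rep_a = [10, 0, 250, 40, 5]
rep_b = [20, 0, 500, 80, 10]

cpm_a = counts_per_million(rep_a)
cpm_b = counts_per_million(rep_b)

# After scaling, the two profiles are directly comparable bin by bin.
for a, b in zip(cpm_a, cpm_b):
    print(f"{a:12.2f} {b:12.2f}")
```

Note that multiplying a whole sample by a positive constant does not by itself change a Pearson or Spearman coefficient, so this kind of scaling matters most when the per-bin values themselves are compared or plotted.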

7.9 years ago by ivivek_ngs ★ 5.2k

2

As vchris states, you can't normalize a BAM file itself (because you normalize signal, not sequencing), and it's inefficient to normalize the signal from a BAM file for just a couple of correlation plots, so it's better to write that signal out to a bigWig and then use the bigWig for the correlation plot.

Make sure you use --corMethod spearman for the plot though. Using Pearson's for this would be a crime against statistics, since the signal is not even close to being either normally distributed or linear. To be honest, Spearman's isn't great either, since a big peak almost disappearing would contribute roughly the same variance as a blip of noise in a gene desert. You're probably best off using deepTools to do the normalization and bigWig creation (and getting the Spearman rho while you're there), but then also using the bigWig to do a standard 2-factor distribution-of-variance plot with relative and absolute signal difference. Such a heatmap wouldn't give you a nice single correlation value though. You'd have to look at a lot of these heatmaps to get an idea of which samples look similar and which look different, since every ChIP assay would produce a different kind of plot.
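A small sketch of why the choice of coefficient matters for peak-heavy signal, using only the Python standard library. The helper names (`pearson`, `average_ranks`, `spearman`) and the seven-bin profiles are hypothetical: the background bins of the two replicates are deliberately anti-correlated, and both share a single large peak.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical binned signal: low background plus one shared large peak.
rep_a = [1, 3, 2, 5, 4, 6, 900]
rep_b = [6, 4, 5, 2, 3, 1, 950]

print("Pearson :", round(pearson(rep_a, rep_b), 4))
print("Spearman:", round(spearman(rep_a, rep_b), 4))
```

In this toy case the single peak pushes Pearson above 0.99 even though the other six bins disagree completely, while Spearman comes out at -0.25 because every bin counts equally once ranked. The same equal weighting is the weakness described above: a change in rank at a real peak counts no more than one in background noise.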

7.9 years ago by John 13k

1

Haha, spot on. The Pearson method really does screw up the entire hypothesis at times. But yes, if you want a plot of the normalized profile, you can create bigWig tracks normalized by size and proceed as John states. However, if you are only interested in comparing the BAM files genome-wide, or even promoter-wide, then just bin them with a larger bin size and see how they correlate with deepTools.
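As a rough illustration of what a larger bin size does, the sketch below sums adjacent small bins into coarser ones. The function `merge_bins` and the counts are hypothetical; in deepTools itself the bin size is simply an option of the tool (for example `--binSize` in `multiBamSummary bins`), so nothing like this needs to be written by hand.

```python
def merge_bins(bin_counts, factor):
    """Sum each run of `factor` adjacent bins into one larger bin.

    The last bin may cover fewer than `factor` small bins if the
    input length is not a multiple of `factor`.
    """
    return [
        sum(bin_counts[i:i + factor])
        for i in range(0, len(bin_counts), factor)
    ]


# Hypothetical read counts in ten consecutive 1 kb bins.
counts_1kb = [3, 0, 7, 1, 12, 4, 0, 2, 5, 9]

# Merge into 5 kb bins: noisy per-bin counts are pooled.
counts_5kb = merge_bins(counts_1kb, 5)
print(counts_5kb)
```

Larger bins smooth out sampling noise in low-coverage regions, at the cost of blurring narrow peaks into their surroundings.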

7.9 years ago by ivivek_ngs ★ 5.2k