Question: ChIP-seq normalization between conditions
9 months ago, srhic wrote:

Hello,

I have some ChIP-seq data for three different conditions, for which I have plotted RPKM-normalized counts around features of interest using deepTools. The plot clearly shows differences in ChIP signal between the conditions, but I am concerned about different background levels between conditions. The blue sample in the plot seems to have lower signal than the other two samples no matter which regions I plot.

Any ideas on how I can normalize the samples so they have the same basal signal? I assume some sort of z-score normalization may work, but I am not sure how to do it with my bigWig files.

Thanks

[Image: deepTools plot of RPKM-normalized ChIP signal around the features of interest for the three conditions]


What are these samples? I personally like to check how well the normalization worked with MA-plots. For a properly normalized pair of samples, the majority of data points should be centered around y = 0, or at least be distributed roughly symmetrically around y = 0, depending on how dramatic the changes between the samples are. Given a count table of normalized counts (not log2-transformed), compute for each pairwise comparison:

FoldChange = log2(sample1 / sample2)
AverageCounts = 0.5*log2(sample1 * sample2)

smoothScatter(AverageCounts, FoldChange)
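
For illustration, here is a minimal runnable sketch of the same calculation in base R. The counts are simulated, not taken from the thread, and the helper names `ma_values` and `plot_ma` are hypothetical. One assumption beyond the formulas above: a pseudocount of 1 is added before taking logs, so that regions with zero counts do not produce infinite or undefined values.

```r
# Sketch: MA values for one pairwise comparison, using simulated counts.
set.seed(1)

# Hypothetical normalized counts for 1000 regions in two samples.
sample1 <- rpois(1000, lambda = 50)
sample2 <- rpois(1000, lambda = 50)

# Assumption: shift counts by a pseudocount so zeros do not give -Inf/NaN.
ma_values <- function(x, y, pseudo = 1) {
  lx <- log2(x + pseudo)
  ly <- log2(y + pseudo)
  data.frame(
    AverageCounts = 0.5 * (lx + ly),  # equals 0.5 * log2(x * y) on shifted counts
    FoldChange    = lx - ly           # equals log2(x / y) on shifted counts
  )
}

ma <- ma_values(sample1, sample2)

# Draw the MA-plot with a reference line at y = 0 and the observed median.
plot_ma <- function(ma) {
  smoothScatter(ma$AverageCounts, ma$FoldChange,
                xlab = "AverageCounts", ylab = "FoldChange")
  abline(h = 0, lty = 2)                          # where the bulk should sit
  abline(h = median(ma$FoldChange), col = "red")  # where the bulk actually sits
}

# Only plot when run interactively; the numbers above do not depend on it.
if (interactive()) plot_ma(ma)

# A median far from 0 suggests a global shift that scaling has not removed.
print(median(ma$FoldChange))
```

With real data, `sample1` and `sample2` would be two columns of the count table. If the red median line sits clearly above or below the dashed line, the pair is probably not on a comparable scale yet.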

Without knowing the details, I can already predict that naive per-million scaling messes things up and that you need a more elaborate normalization strategy, but let's see how the plots look. Is one of these samples an input sample?

By the way, you have to paste the full link of the image into the image field. For the above image that would be https://i.ibb.co/6Zyhb1b/chip.png, i.e. including the suffix.

written 9 months ago by ATpoint (modified 9 months ago)

Thanks, the samples are histone marks under three different treatment conditions. I only have the bigWig files output by deepTools. I will try to import them into R and make a count table. Will try and get back.

written 9 months ago by srhic

Try to make a count matrix based on the merged peaks directly from the BAM files, e.g. using featureCounts. For normalization, also see "A: ATAC-seq sample normalization (quantil normalization)". It applies to ChIP-seq as well.
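
As a rough sketch of the "merged peaks" step: the base R code below collapses overlapping peaks from several samples into one union set and formats it as a SAF table (columns GeneID, Chr, Start, End, Strand), which is the simple annotation format featureCounts accepts. The peak coordinates, the helper names `merge_peaks` and `to_saf`, and the BAM file names in the commented-out call are all made up for illustration. The commented call assumes the Rsubread package, which is one way to run featureCounts from R; the command-line tool would work as well.

```r
# Sketch: union ("merged") peak set across samples, formatted as SAF.
# All coordinates below are invented for illustration.
peaks <- data.frame(
  Chr   = c("chr1", "chr1", "chr1", "chr2", "chr2"),
  Start = c(100, 150, 500, 10, 400),
  End   = c(200, 300, 600, 50, 450),
  stringsAsFactors = FALSE
)

# Merge overlapping intervals per chromosome.
merge_peaks <- function(p) {
  p <- p[order(p$Chr, p$Start), ]
  merged <- lapply(split(p, p$Chr), function(d) {
    # A new group starts when a peak begins after the running maximum end.
    running_end <- cummax(d$End)
    new_group   <- c(TRUE, d$Start[-1] > running_end[-nrow(d)])
    grp         <- cumsum(new_group)
    data.frame(Chr   = d$Chr[1],
               Start = as.vector(tapply(d$Start, grp, min)),
               End   = as.vector(tapply(d$End, grp, max)),
               stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, merged)
  rownames(out) <- NULL
  out
}

# Add the identifier and strand columns that the SAF format expects.
to_saf <- function(m) {
  data.frame(GeneID = paste0("peak_", seq_len(nrow(m))),
             Chr = m$Chr, Start = m$Start, End = m$End,
             Strand = "+", stringsAsFactors = FALSE)
}

saf <- to_saf(merge_peaks(peaks))
print(saf)

# Assumption: with Rsubread installed and real BAM files at hand, counting
# could look roughly like this (file names are placeholders):
# fc <- Rsubread::featureCounts(files = c("cond1.bam", "cond2.bam", "cond3.bam"),
#                               annot.ext = saf, isPairedEnd = FALSE)
# counts <- fc$counts   # regions x samples matrix of raw counts
```

Counting all samples over the same union regions gives one raw count matrix, which is what count-based normalization methods expect as input, rather than per-sample scaled bigWig tracks.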

written 9 months ago by ATpoint

I am trying out quantile normalization the way you described it for ATAC-seq. I was also able to get some good results using HOMER. I divided the genome into windows with bedtools and then extracted counts for those windows using HOMER, which has an option that normalizes the counts with the rlog function of DESeq2. I am not sure I completely understand the rlog normalization, or whether it is the correct method to use, but it made the profiles look much more similar. I will also see how the edgeR approach you described works. Thanks!
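
For readers unfamiliar with the idea, here is a minimal base R sketch of generic quantile normalization on a toy matrix. It is not necessarily the exact procedure from the linked ATAC-seq answer, and it is not what rlog does. The matrix values and the function name `quantile_normalize` are made up, and ties are broken by order of appearance, which is a simplification.

```r
# Toy matrix of hypothetical log2 counts: rows are regions, columns are samples.
mat <- cbind(a = c(5, 2, 3, 4),
             b = c(4, 1, 3, 2),
             c = c(3, 4, 6, 8))

# Generic quantile normalization (simple variant, no NA handling):
# every sample is forced onto the same reference distribution.
quantile_normalize <- function(mat) {
  ranks  <- apply(mat, 2, rank, ties.method = "first")  # rank of each value within its sample
  sorted <- apply(mat, 2, sort)                         # each sample sorted ascending
  ref    <- rowMeans(sorted)                            # mean value at each rank across samples
  out    <- apply(ranks, 2, function(r) ref[r])         # replace each value by the reference at its rank
  dimnames(out) <- dimnames(mat)
  out
}

qn <- quantile_normalize(mat)
print(qn)

# After normalization all samples share the same sorted values.
print(apply(qn, 2, sort))
```

Because every sample ends up with an identical distribution, this removes global differences such as a lower background in one sample. It also removes genuine global changes in a histone mark, so whether it is appropriate depends on whether such changes are expected between the treatments.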

written 9 months ago by srhic