Question: Normalize ChIP-seq data
2.1 years ago by
Picoskia0
Picoskia0 wrote:

Hi everyone,

I have to analyze three ChIP-seq datasets. I'm fine with the analysis procedure itself but have a question about normalization. At which step(s) should I normalize my data?

  • Before the alignment by sequencing depth?
  • At the peak calling step with MACS?
  • At the conversion step from BAM to BigWig?

How do you usually normalize your ChIP-seq data?

Thanks a lot for your help.

chip-seq • 1.2k views
ADD COMMENTlink modified 2.0 years ago by jared.andrews073.1k • written 2.1 years ago by Picoskia0

Are your three ChIP-seq datasets for different factors/histone marks, or for the same factor/histone mark under different conditions?

ADD REPLYlink written 2.1 years ago by Prakash1.5k

The three ChIP-seq datasets are for the same factor: one WT and two with mutations.

ADD REPLYlink written 2.0 years ago by Picoskia0

MACS includes some basic normalization if you provide a control file: the larger file is proportionally scaled towards the smaller one. If your goal is simply to call peaks [and this is what you should state right away when posting a question like this], this is typically sufficient. If you aim to perform differential analysis, have a look at the established tools, like MAnorm, csaw, DiffBind and many more.
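
To make the "larger file scaled towards the smaller one" idea concrete, here is a minimal Python sketch of that kind of linear depth scaling. It is only an illustration of the arithmetic, not MACS's actual code; the function name and the read counts are made up for the example.

```python
from __future__ import annotations


def scale_factors(treatment_reads: int, control_reads: int) -> tuple[float, float]:
    """Return (treatment_factor, control_factor) that scale the larger
    library down to the depth of the smaller one.

    Hypothetical helper for illustration only; not part of MACS.
    """
    if treatment_reads <= 0 or control_reads <= 0:
        raise ValueError("read counts must be positive")
    smaller = min(treatment_reads, control_reads)
    return smaller / treatment_reads, smaller / control_reads


# Assumed example depths: 20M ChIP reads, 35M input/control reads.
t_factor, c_factor = scale_factors(20_000_000, 35_000_000)
print(f"treatment x{t_factor:.3f}, control x{c_factor:.3f}")
```

Here the ChIP library is left as-is and the deeper control is scaled down, so pileups from the two files are compared at the same effective depth.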

ADD REPLYlink written 2.1 years ago by ATpoint22k
2.0 years ago by
jared.andrews073.1k
St. Louis, MO
jared.andrews073.1k wrote:

MACS normalizes fairly well when calling peaks, though its normalized bedGraphs frankly look terrible in a browser. I use deepTools to create read-depth-normalized bigWig files that look much more appropriate in UCSC. deepTools has a few different ways it can normalize, including subtracting input reads from samples, though I typically just use the RPKM option.
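
For reference, RPKM normalization of a coverage track boils down to dividing each bin's read count by the library size (in millions of mapped reads) and the bin length (in kb). The Python sketch below shows just that arithmetic on made-up bin counts; it does not call deepTools, and the function name is hypothetical. In recent deepTools versions the equivalent is, as far as I know, `bamCoverage --normalizeUsing RPKM`.

```python
def rpkm(bin_counts, total_mapped_reads, bin_size_bp):
    """Convert raw per-bin read counts to RPKM.

    RPKM = reads in bin / (mapped reads in millions * bin length in kb)
    """
    millions = total_mapped_reads / 1e6
    kb = bin_size_bp / 1e3
    return [count / (millions * kb) for count in bin_counts]


# Assumed toy data: the mutant library was sequenced twice as deep as WT,
# so its raw counts are doubled even though the underlying signal is the same.
wt = rpkm([50, 200, 10], total_mapped_reads=10_000_000, bin_size_bp=50)
mut = rpkm([100, 400, 20], total_mapped_reads=20_000_000, bin_size_bp=50)
print(wt)
print(mut)
```

After normalization the two tracks are on the same scale, which is why the bigWigs become comparable by eye in a browser.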

If you want to quantitatively compare signal at ChIP-seq peaks, my two favorite tools are DiffBind (R package) if you have biological replicates or MAnorm (Bash/R scripts) if you're trying to compare a single sample to another. They both take care of normalization and do a pretty good job of identifying unique peaks for a given condition/sample.

ADD COMMENTlink written 2.0 years ago by jared.andrews073.1k

Can you provide some links where I can read about both ChIP-seq and ATAC-seq data analysis? I did use HOMER, but I'm not yet confident about it.

ADD REPLYlink written 2.0 years ago by krushnach80580

I typically treat ATAC-seq much the same as ChIP-seq, but use a smaller extension size during peak calling for ATAC-seq, as our fragments are usually smaller. HOMER is also a perfectly good tool (with great documentation), though it can't quantitatively compare signal at peaks last I checked. I found this paper very helpful when trying to identify which tool is best for the job depending on your data type (sharp vs broad signal), if you have replicates, etc.

There are tons of other blogs/githubs/websites that go more deeply into analysis, including the BioStars handbook. This github also has links and some comments about pretty much every tool ever developed for ChIP-seq analysis along with tons of links to other resources, key papers, etc. It's a great resource.

ADD REPLYlink written 2.0 years ago by jared.andrews073.1k

ENCODE has pipelines and documentation for this:

https://github.com/mforde84/ATACseq-analysis-pipeline

ADD REPLYlink written 2.0 years ago by mforde841.2k
2.1 years ago by
rahul0
rahul0 wrote:

As suggested, MACS/MACS2 will normalize according to the total number of reads. Some of the bigWig creation packages also have the ability to scale by a specified normalization factor, which you will have to do to get a "normalized" bigWig file.

One last thing: if you are looking at a global increase or reduction of whatever you are ChIPping, total read normalization will not work. Something to keep in mind...
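
A small numerical sketch of why this happens, under two loudly stated assumptions: background (non-specific) reads stay the same between conditions, and both libraries end up scaled to the same total. The function name and the fractions are hypothetical, chosen only to illustrate the effect.

```python
def observed_ratio(signal_fraction_wt, true_fold_change):
    """Mutant/WT signal ratio you would see after total-read normalization.

    signal_fraction_wt: fraction of WT reads that come from true signal.
    true_fold_change:   real global change in signal (0.5 = halved everywhere).
    Assumes background is unchanged and libraries are scaled to equal totals.
    """
    background = 1.0 - signal_fraction_wt
    signal_mut = signal_fraction_wt * true_fold_change
    # After scaling to the same total, only the signal *fraction* is visible.
    fraction_mut = signal_mut / (signal_mut + background)
    return fraction_mut / signal_fraction_wt


# Signal is a small part of the library: a true 2-fold loss is mostly recovered.
print(round(observed_ratio(0.05, 0.5), 2))
# Signal dominates the library: the same 2-fold loss is badly underestimated.
print(round(observed_ratio(0.80, 0.5), 2))
```

With these toy numbers the observed ratios come out at roughly 0.51 and 0.83 against a true value of 0.5, so the more of the library the signal accounts for, the more total-read scaling hides a global change.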

ADD COMMENTlink written 2.1 years ago by rahul0

Thank you for your answers. So if I understand correctly, the normalization step is done after the alignment, when calling peaks with MACS?

ADD REPLYlink written 2.0 years ago by Picoskia0

"if you are looking at a global increase or reduction of whatever you are ChIPping, total read normalization will not work."

This depends on the nature of the ChIP. Transcription factor ChIP-seq experiments often have relatively few (<20K) enriched regions, which should not influence global scaling approaches too much. Broad histone marks covering large swathes of the genome (e.g. H3K27me3) can be a different story, though.

ADD REPLYlink written 17 months ago by Friederike5.0k