Question

Looking for advice on histone modification ChIP-seq analysis - 2 conditions

4.4 years ago

mariana1988 • 0

Dear all,

Before coming to this forum I went through quite a lot of tutorials online, including code examples, videos, manuals, etc., but I did not find what I was looking for. This is my last resort, so I would really appreciate it if someone could give me some hints for my analysis.

My problem: I performed ChIP-seq with H3K27me3 and H3 (control) antibodies, and I have 2 conditions (wild type and mutant). I also have RNA-seq data, in which I observed downregulation of my genes of interest in the mutant.

What I would like to see is whether the genes that are downregulated in the mutant (relative to WT) are also more enriched in H3K27me3 in the mutant than in WT.

Using deepTools, I observed that some of my genes of interest are indeed enriched in H3K27me3 in the mutant, but now I would like to get numbers.

So I want to 1) normalize each condition to H3, and 2) then compare mutant and WT (here I'm not looking for the presence or absence of a peak, but rather at how much enrichment there is).
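One way to read these two steps, sketched in Python under stated assumptions: each sample is first scaled to counts per million (CPM), the H3K27me3 signal is divided by the H3 signal for the same gene and condition, and the mutant and WT log2 ratios are then subtracted. The function names, the pseudocount, and all read counts below are hypothetical and only illustrate the arithmetic; they do not come from any particular tool.

```python
import math

def cpm(count, library_size):
    """Scale a raw read count to counts per million mapped reads."""
    return count * 1e6 / library_size

def log2_enrichment(chip_count, chip_lib, h3_count, h3_lib, pseudocount=1.0):
    """log2(H3K27me3 / H3) for one gene, after library-size scaling.

    The pseudocount (an assumed value) avoids division by zero and
    dampens ratios that are driven by very few reads.
    """
    chip = cpm(chip_count, chip_lib) + pseudocount
    h3 = cpm(h3_count, h3_lib) + pseudocount
    return math.log2(chip / h3)

# Made-up counts for a single gene in each condition.
wt = log2_enrichment(chip_count=200, chip_lib=20_000_000,
                     h3_count=300, h3_lib=30_000_000)
mut = log2_enrichment(chip_count=800, chip_lib=20_000_000,
                      h3_count=300, h3_lib=30_000_000)

# Step 2: difference of the H3-normalized log2 ratios (mutant minus WT).
delta = mut - wt
print(f"WT: {wt:.2f}  mutant: {mut:.2f}  difference: {delta:.2f}")
```

A positive difference means the gene carries more H3K27me3 per unit of H3 in the mutant. On its own this is only a descriptive number; it says nothing about whether the change is reproducible across replicates.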

What some people I know are doing is this: after mapping, QC, sorting, and extending reads, they use bedtools coverage to get a table of counts. From this table they calculate an RPKM value for each gene, compare conditions by setting an arbitrary log fold change threshold to decide whether there is enrichment or not, and then correlate those genes with the RNA-seq data. But they don't include normalization against the H3 control.
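The pipeline described above can be sketched roughly as follows. This is a minimal illustration, assuming a per-gene count table like the one bedtools coverage would produce; the gene names, lengths, counts, library sizes, pseudocount, and threshold are all hypothetical.

```python
import math

def rpkm(count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene per million mapped reads."""
    return count * 1e9 / (gene_length_bp * total_mapped_reads)

def log2_fold_change(rpkm_mut, rpkm_wt, pseudocount=0.1):
    """log2(mutant / WT) with a small assumed pseudocount to avoid log(0)."""
    return math.log2((rpkm_mut + pseudocount) / (rpkm_wt + pseudocount))

def genes_above_threshold(table, wt_total, mut_total, threshold=1.0):
    """Return the genes whose log2 fold change meets an arbitrary cutoff."""
    selected = []
    for name, row in table.items():
        wt_value = rpkm(row["wt"], row["length"], wt_total)
        mut_value = rpkm(row["mut"], row["length"], mut_total)
        if log2_fold_change(mut_value, wt_value) >= threshold:
            selected.append(name)
    return selected

# Made-up count table: gene length in bp and raw counts per condition.
table = {
    "geneA": {"length": 2000, "wt": 100, "mut": 400},
    "geneB": {"length": 1000, "wt": 2, "mut": 8},
    "geneC": {"length": 1500, "wt": 150, "mut": 160},
}

enriched = genes_above_threshold(table, wt_total=10_000_000, mut_total=10_000_000)
print(enriched)
```

In this toy table geneB passes the cutoff on the basis of 2 versus 8 reads, just as geneA does with 100 versus 400. A fixed fold-change threshold cannot tell those two situations apart.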

What is your opinion about this pipeline? Do you have any advice on how to proceed with my analysis?

Thank you very much and kind regards,

ChIP-Seq • 666 views

@ATpoint: thank you very much! I'm following your advice.

Mariana

4.4 years ago by mariana1988

Please use ADD COMMENT and not the answer field for comments. Thanks :)

4.4 years ago by ATpoint

score 1 · Answer 1 · 2019-11-16

What you want is a standard differential analysis. Look at tools such as csaw or DiffBind, which are essentially wrappers around edgeR and DESeq2. The manuals cover all the basic analysis steps. You input a count matrix and eventually get a list of regions with significant changes in H3K27me3 between conditions. From there, simply count overlaps between your genes and those significant regions. That is much more robust than any home-made strategy. Arbitrary thresholds are not informative, because high fold changes do not necessarily indicate significant changes; they can be artifacts of low counts or outlier samples. These tools have sophisticated normalization methods and statistical frameworks to deal with all of that. I suggest you go through the manuals thoroughly and then come back with specific questions. For visualization you could do profile plots, e.g. with plotProfile from deepTools.
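The overlap-counting step can be illustrated with a small Python sketch. It assumes the significant regions have already been exported from the differential analysis as BED-like tuples with half-open coordinates; the gene and region coordinates below are made up. On real data, bedtools intersect or the GenomicRanges package in R would be the usual choice.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def count_region_overlaps(genes, regions):
    """Count, for each gene, how many significant regions overlap it."""
    hits = {}
    for name, (chrom, start, end) in genes.items():
        hits[name] = sum(
            1
            for r_chrom, r_start, r_end in regions
            if r_chrom == chrom and overlaps(start, end, r_start, r_end)
        )
    return hits

# Hypothetical genes of interest: name -> (chromosome, start, end).
genes = {
    "geneA": ("chr1", 1000, 3000),
    "geneB": ("chr1", 10000, 12000),
    "geneC": ("chr2", 500, 1500),
}

# Hypothetical significant regions: (chromosome, start, end).
regions = [
    ("chr1", 2500, 3500),
    ("chr1", 2900, 2950),
    ("chr2", 1500, 2000),
]

hits = count_region_overlaps(genes, regions)
print(hits)
```

Here geneC gets no hit because the region on chr2 starts exactly where the gene ends, and half-open intervals that only touch do not overlap. Whether to extend genes by a promoter window before counting is a separate design decision that this sketch leaves out.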