Approach to Differential analysis of RNA, ATAC, and ChIP-Seq with known CNVs between groups
4.1 years ago
millerh1 ▴ 40

Hello,

I have two cell lines (clones derived from a patient with a mutation; the mutation was rescued in one cell line but not the other) with ChIP, ATAC, and RNA-Seq in each -- and I want to compare them. The differential analysis yielded lots of pathways like "Chr9p13" and "Amplicon Chr6q22 in Breast Cancer", which led us to consider that there might be CNVs between these cell lines. We did a high-coverage WGS run to check this, and we ended up finding large-scale copy number changes between them.

Now that I know what the copy number changes are, I am wondering if anyone is aware of tools or approaches for using that information when re-analyzing our ChIP, ATAC, and RNA-Seq data?

This was the approach I was considering at first:

(1a) Normalize the ChIP and ATAC-Seq by using the WGS as the input for peak calling, OR
(1b) Take the results from something like DiffBind and weight the fold change and p-value of differential peaks in CNV regions by simply dividing the -log10(pVal) and the fold change by the copy-number fold change at that location.
(2) Maybe don't normalize the RNA-Seq by CNV, because the CNVs lead to biologically meaningful transcriptomic differences.
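To make the fold-change weighting idea concrete, here is a minimal sketch in Python. It assumes the fold change is on the log2 scale, so dividing by the copy-number fold change becomes subtracting the log2 copy ratio. The function name `cnv_adjusted_log2fc` and the copy numbers are hypothetical, and the sketch covers only the fold change, not the p-value.

```python
import math

def cnv_adjusted_log2fc(log2fc, copies_a, copies_b):
    """Subtract the log2 copy-number ratio (A over B) from a peak's log2 fold change.

    Hypothetical helper for illustration; copies_a and copies_b are the
    WGS-derived copy numbers of the peak's region in each cell line.
    """
    if copies_a <= 0 or copies_b <= 0:
        raise ValueError("copy numbers must be positive")
    return log2fc - math.log2(copies_a / copies_b)

# A peak with a 2-fold signal gain (log2FC = 1.0) in a region present at
# 4 copies in line A and 2 copies in line B: the gain matches the copy ratio.
print(cnv_adjusted_log2fc(1.0, 4, 2))  # 0.0
```

A residual log2 fold change near zero would suggest that the peak difference tracks copy number rather than a change in binding or accessibility per copy.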

Thank you for your time, Henry Miller

ChIP-Seq genome RNA-Seq sequencing • 1.2k views

Maybe you can get some inspiration from the Bioc thread where Aaron Lun (author of csaw and other packages) comments on a similar scenario. In that case I asked how to handle trisomy of an entire chromosome during normalization. https://support.bioconductor.org/p/127168/ The solution with the offset matrix might help if these CNVs are very large, so that you have enough counts for them. That way you could specifically eliminate the effect of the CNVs on the counts of your ATAC/ChIP-seq experiment. Aaron is typically very responsive if you invest some effort into your questions, so if you have a specific strategy in mind and want expert opinions, you might post it over at Bioc and hope he has a look. He is (from what I know) not active here at Biostars though. Still, these CNVs could of course be biologically meaningful, as you say, and my comment only addresses how to reduce the effect of CNVs if you regard them as a source of bias.
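As a rough illustration of what an offset matrix for this purpose could look like, here is a minimal sketch in plain Python. It is not the recipe from the linked thread: it assumes each entry is the natural log of the sample's library size plus the natural log of the region's copy number relative to a diploid baseline. The function name `cnv_offsets` and all values are hypothetical.

```python
import math

def cnv_offsets(lib_sizes, copy_numbers, baseline_ploidy=2):
    """Build a regions x samples matrix of natural-log offsets.

    lib_sizes[j] is the library size of sample j; copy_numbers[i][j] is the
    copy number of region i in sample j. Both inputs are assumptions here.
    """
    offsets = []
    for region_cn in copy_numbers:
        if len(region_cn) != len(lib_sizes):
            raise ValueError("each region needs one copy number per sample")
        if any(cn <= 0 for cn in region_cn):
            raise ValueError("copy numbers must be positive")
        offsets.append([math.log(n) + math.log(cn / baseline_ploidy)
                        for n, cn in zip(lib_sizes, region_cn)])
    return offsets

# Two samples with equal depth; region 2 is amplified to 4 copies in sample 2.
offsets = cnv_offsets([1_000_000, 1_000_000], [[2, 2], [2, 4]])
print(offsets[1][1] - offsets[1][0])  # about log(2)
```

Whether a matrix like this can be passed to a given differential-binding tool, and in what form, should be checked against that tool's documentation. Homozygous deletions (copy number 0) cannot be expressed this way and would need to be excluded.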

Thank you for the response! I really like the offset matrix idea -- in my case it would require normalizing dozens of different regions separately, but it sounds like it still wouldn't violate the assumptions of csaw. I'm going to try that out and report the results here, and probably post on Bioc as well.

It might be worth investing some time to see if the CNV regions even contain candidate peaks that have a fair chance of being differential. If these CNVs only contain something like 10 peaks, or if they contain many peaks but with really low counts, it might not even be worth the hassle. What I want to say is: before investing a lot of effort, try to justify that it is worth the time. I have often overcomplicated things in the past.
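A quick check along these lines could look like the following sketch in plain Python. The tuple layouts, the function name `peaks_in_cnvs`, and the coordinates are hypothetical; intervals are assumed to be half-open (start inclusive, end exclusive), as in BED files.

```python
def peaks_in_cnvs(peaks, cnvs):
    """Return the peaks that overlap any CNV region.

    peaks: list of (chrom, start, end, read_count)
    cnvs:  list of (chrom, start, end)
    """
    def overlaps(peak, cnv):
        return peak[0] == cnv[0] and peak[1] < cnv[2] and cnv[1] < peak[2]
    return [p for p in peaks if any(overlaps(p, c) for c in cnvs)]

peaks = [("chr9", 100, 200, 50), ("chr9", 5000, 5100, 3), ("chr6", 100, 200, 40)]
cnvs = [("chr9", 0, 1000), ("chr6", 150, 400)]

hits = peaks_in_cnvs(peaks, cnvs)
# Number of affected peaks and their total read count
print(len(hits), sum(p[3] for p in hits))  # 2 90
```

If only a handful of peaks, or only low-count peaks, fall inside CNV regions, simply excluding them may be enough. For genome-scale peak sets, an interval-tree or sorted-sweep approach would be faster than this pairwise scan.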

I expect that this is a significant confounding factor in ChIP-seq analyses for cancer samples, particularly if you're doing anything with super enhancers. It's not published, but I can say that a not insignificant portion of "cancer" super enhancers are due to amplifications, which is somehow glossed over in most publications. Good on ya for thinking about this, though as mentioned, it is a challenge to deal with using standard tools.

Thank you so much for your perspective on this -- after digging further into the genes with genuine CNVs, the list was actually quite small (around 500 or so). When I simply removed these genes from all my analyses, the results really didn't change much. This saved me a ton of time -- thanks again!
