Hi,
I've done CUT&RUN for my target of interest in siSCR and siBRCA2 cells. I used SEACR for peak calling where the peaks were normalised to IgG using the stringent conditions and I got approx. 450 peaks for siSCR and 7500 peaks for siBRCA2 (after taking into consideration the peaks that overlap between 2 out of 3 replicates). I do expect my target of interest to be enriched in the siBRCA2 cells.
When I plot the metagene profile, I clearly see an enrichment of my signal at the TSS in the siBRCA2 cells compared to siSCR and this is also true when I visualise the bigwig files across several genes.
The issue arises when I plot the distribution of the peaks across the different genomic features (promoters, 5'UTR, 3'UTR, exon, intron, etc.). The majority of the peaks are at promoters in both conditions, but the percentage of peaks in the promoter region is slightly lower in the siBRCA2 cells than in the siSCR cells.
The question is:
- How do I assess that this is significant? It is 91.5% in siSCR and 88.9% in siBRCA2. Should I do this with the individual replicate files?
- Is this an accurate way to represent it? Or am I misinterpreting / missing out on something in my interpretation of the data?
I would highly appreciate your help.
Thank you.
If I understand correctly, you see increased signal at the TSS in siBRCA2 compared to control (taller peaks in the browser). However, 91.5% of the ~450 siSCR peaks and 88.9% of the ~7500 siBRCA2 peaks are at promoters.
I wouldn't focus on this slight difference and would instead just conclude that promoters are bound in both conditions, but the overall level of binding is greater in siBRCA2. You can consider a differential binding analysis. To get started, see "Using featureCounts and DESeq2 to look at differences between ATAC-seq conditions".
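If you still want a rough number for the promoter-fraction difference, a two-proportion z-test on the merged peak sets is one simple option. The sketch below is only illustrative: the promoter counts (412 of 450 and 6668 of 7500) are my assumption, back-calculated from the approximate totals and percentages in the question, so substitute your real counts. It also treats peaks as independent observations, which is a simplification.

```python
import math


def two_proportion_z_test(hits_a, total_a, hits_b, total_b):
    """Two-sided z-test for a difference between two proportions.

    Uses the pooled proportion under the null hypothesis that both
    conditions share the same promoter fraction.
    """
    p_a = hits_a / total_a
    p_b = hits_b / total_b
    pooled = (hits_a + hits_b) / (total_a + total_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Assumed counts, back-calculated from ~450 peaks at 91.5% (siSCR)
# and ~7500 peaks at 88.9% (siBRCA2); replace with the real numbers.
z, p_value = two_proportion_z_test(412, 450, 6668, 7500)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Because the two peak sets differ so much in size (and the siSCR set is small), I would treat the result as a sanity check rather than a headline finding. Running it per replicate, as you suggest, would at least show whether the direction is consistent.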
With such an increase, I might expect the protein's expression to be increased too, which could be cool to show.