10.1 years ago
aditi.qamra
Hi,
I have a fairly basic question about using DESeq downstream of CCAT.
After calling peaks with CCAT on my tumor and normal samples (I have 5 such pairs), DESeq requires read counts over a common list of regions across the conditions you want to test for differential peaks. However, CCAT identifies different regions in different samples. So how do I get read counts for, e.g., the peak regions identified in T1 from the CCAT output of N1?
Thanks !
The general idea is to combine the peak calls from all of your samples and then perform the counting based on that combined set. BTW, DESeq(2) will incorrectly perform library-size normalization for your use case, since its assumptions are unlikely to be true for ChIP-seq, so you'll need to either provide your own size factors or also count reads in the off-peak regions and normalize to that. I've never used CCAT or seen its output, so I can't provide any specific advice on the exact steps.
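As a rough illustration of the "count the off-peaks and normalize to that" option, here is a minimal Python sketch. It assumes you already have one total off-peak (background) read count per sample; the function name and the numbers are made up for illustration, and dividing by the geometric mean is just one simple convention for centring the factors around 1, not something prescribed in this thread.

```python
import math

def size_factors_from_background(background_counts):
    """Return one size factor per sample: the sample's off-peak read count
    divided by the geometric mean of the off-peak counts of all samples."""
    logs = [math.log(c) for c in background_counts]
    geo_mean = math.exp(sum(logs) / len(logs))
    return [c / geo_mean for c in background_counts]

# Hypothetical off-peak read totals for three samples (made-up numbers).
factors = size_factors_from_background([100, 200, 400])
print(factors)
```

Factors computed this way could then be assigned in R with `sizeFactors(dds) <- ...` instead of letting DESeq2 estimate them from the peak counts.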
Thanks for pointing out the normalisation bit. But the question is how to combine them. CCAT outputs a file with the following header: <chromosome> <position of the peak> <start of region> <end of region> <read counts in ChIP library> <read counts in control library> <fold-change score> <local FDR>
The regions are going to be slightly different in the output from each sample (whether tumor or normal). I could take a union of the peaks from all the tumor biological replicates, and likewise for normal, but the regions would still not necessarily be the same between tumor and normal. In that case, I'm struggling to understand how I would get read counts over common regions for each sample for DESeq.
You would take the union of regions from all samples, regardless of treatment group. BTW, I should amend my earlier mention of library-size normalization with the word "may", since whether this will be an issue or not will depend on the dataset.
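To make the union step concrete, here is a minimal Python sketch. It assumes each sample's peaks have already been reduced to `(chromosome, start, end)` tuples taken from the region start/end columns of the CCAT output, that coordinates are inclusive, and that reads are represented by a single position each. The function names and the example data are hypothetical.

```python
from collections import defaultdict

def merge_peaks(peak_sets):
    """Pool peaks from all samples (tumor and normal alike) and merge
    any overlapping regions into a single region per overlap cluster."""
    by_chrom = defaultdict(list)
    for peaks in peak_sets:
        for chrom, start, end in peaks:
            by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:            # overlaps the current region
                cur_end = max(cur_end, end)
            else:                           # gap: close the current region
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged

def count_reads(regions, reads):
    """Count reads, given as (chrom, position), falling in each region.
    Simple quadratic loop; fine for a sketch, slow for real data."""
    return [
        sum(1 for r_chrom, pos in reads if r_chrom == chrom and start <= pos <= end)
        for chrom, start, end in regions
    ]

# Made-up peaks for one tumor and one normal sample.
t1_peaks = [("chr1", 100, 200), ("chr1", 500, 600)]
n1_peaks = [("chr1", 150, 250), ("chr2", 10, 50)]
union = merge_peaks([t1_peaks, n1_peaks])

# Made-up read positions for the normal sample, counted over the shared regions.
n1_reads = [("chr1", 120), ("chr1", 240), ("chr1", 700), ("chr2", 30)]
n1_counts = count_reads(union, n1_reads)
```

Counting every sample's reads over the same `union` list gives one row per region and one column per sample, which is the count matrix shape DESeq expects. For real data, an interval tool or a read-counting program would replace the simple loop in `count_reads`.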
Oh got it. Thanks ! :)