I'm hoping someone can help me here. I carried out two CTCF ChIP-seq replicates and the data generated are noisy; as it stands, I can only see about 10% of the peaks using SISSR peak calling. I know this signal is bona fide because when I extract the sequences and run a motif analysis I can see a clear CTCF binding motif association. Does anyone have experience dealing with noisy data? The fact that I have two replicates should help me, but I'm relatively new to bioinformatics and don't really know where to go from here.
Any advice greatly appreciated!
Use CHANCE to check how strong the enrichment of your signal is, so you can judge whether the data quality is good enough to call peaks. Also, how deep is your data? The sequencing depth might not be enough to call the maximum number of peaks.
Assess the enrichment of these peaks and view them in a genome browser to see what kind of enrichment you actually get.
You can also use deepTools to assess the QC status.
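To give a feel for what an enrichment QC check looks at, here is a minimal Python sketch of a "fingerprint"-style summary: bins are ranked by coverage and the cumulative share of reads is computed. This is only an illustration of the idea, not how deepTools or CHANCE implement it; the function names and the toy bin counts are made up for the example.

```python
def fingerprint(bin_counts):
    """Cumulative fraction of reads across bins sorted from lowest to highest coverage."""
    counts = sorted(bin_counts)
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads in any bin")
    curve = []
    running = 0
    for c in counts:
        running += c
        curve.append(running / total)
    return curve


def top_fraction_share(bin_counts, top_fraction=0.1):
    """Share of all reads that fall in the most covered `top_fraction` of bins."""
    counts = sorted(bin_counts, reverse=True)
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads in any bin")
    n_top = max(1, int(len(counts) * top_fraction))
    return sum(counts[:n_top]) / total


# Toy data (hypothetical): read counts per genomic bin.
flat = [10] * 10            # input-like: reads spread evenly
enriched = [2] * 9 + [82]   # ChIP-like: most reads pile up in one bin

print(top_fraction_share(flat))      # even coverage: top 10% of bins hold 10% of reads
print(top_fraction_share(enriched))  # enriched: top 10% of bins hold most of the reads
```

If the curve for your ChIP sample looks almost the same as the one for your input, the enrichment is weak and peak callers will struggle regardless of the parameters.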
Finally, I would also suggest playing with the parameters to refine the peak calling, and trying other peak callers (RSEG, MACS2.1, PePr and many more) to see what their results are. Good luck!
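When comparing the output of two peak callers (or of your two replicates), a simple first check is how many peaks in one set overlap a peak in the other. Below is a rough Python sketch, assuming peaks are given as BED-style (chrom, start, end) tuples with half-open coordinates; the function names and the toy peaks are hypothetical, and for real data a dedicated interval tool will be much faster than this quadratic loop.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) half-open intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def shared_peaks(peaks_a, peaks_b):
    """Peaks from peaks_a that overlap at least one peak in peaks_b."""
    return [p for p in peaks_a if any(overlaps(p, q) for q in peaks_b)]


# Toy peak sets (hypothetical), e.g. from two callers or two replicates.
set1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
set2 = [("chr1", 150, 250), ("chr2", 300, 400)]

common = shared_peaks(set1, set2)
print(f"{len(common)} of {len(set1)} peaks in set1 are also found in set2")
```

Peaks that show up consistently across callers or replicates are the ones to trust first; the rest are worth inspecting in the browser before deciding.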
In addition to vchris_ngs's reply: you might want to take a look at the FastQC report for your raw data. When you mapped the reads, what was the overall alignment rate? High or low? Have you tried merging your two samples together?