I have been processing paired-end 150 bp ATAC-seq data, but I am failing to get peaks at known promoters; the data just looks like noise throughout.
I started with QC, which showed that reads had poly(G) towards their ends; this is known to happen on NovaSeq (two-color chemistry). I trimmed the reads with cutadapt to remove adapter sequences and poly(G), then aligned them with bowtie2. The alignment rate was ~98 to 99% for all samples, and the number of mapped reads varied from 70 to 150 million. Mitochondrial content ranged from 10 to 50%.
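For anyone who wants to check the mitochondrial numbers, here is a minimal sketch of how such a fraction can be computed. It assumes the per-contig counts come from `samtools idxstats` (tab-separated: contig name, length, mapped reads, unmapped reads). The function name `mito_fraction`, the list of mitochondrial contig names, and the example counts are made up for illustration and are not taken from my data.

```python
def mito_fraction(idxstats_text, mito_names=("chrM", "MT", "chrMT", "M")):
    """Return the fraction of mapped reads assigned to a mitochondrial contig.

    idxstats_text is assumed to be the text output of `samtools idxstats`.
    mito_names is an assumed list of common mitochondrial contig names.
    """
    mito = 0
    total = 0
    for line in idxstats_text.strip().splitlines():
        name, _length, mapped, _unmapped = line.split("\t")
        if name == "*":
            # The "*" row only holds reads without a reference contig.
            continue
        total += int(mapped)
        if name in mito_names:
            mito += int(mapped)
    return mito / total if total else 0.0


# Illustrative counts only, not real data.
example = (
    "chr1\t248956422\t600000\t0\n"
    "chr2\t242193529\t300000\t0\n"
    "chrM\t16569\t100000\t0\n"
    "*\t0\t0\t5000\n"
)
print(round(mito_fraction(example), 3))  # 0.1
```

Note that the denominator here is mapped reads only, so the value can differ slightly from a fraction computed over all sequenced reads.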
Below are the Bioanalyzer trace for one sample and the fragment size distribution for the same sample.
IGV view showing BAM coverage for different samples at the GAPDH promoter:
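To put numbers on the fragment size distribution, here is a rough sketch that bins fragment lengths into size classes. It assumes the lengths are the template lengths (TLEN, column 9 of the SAM records) of properly paired reads. The cutoffs used (below 100 bp for nucleosome-free, 180 to 247 bp for mononucleosomal) are assumed, commonly used conventions and can be changed; `fragment_classes` and the example lengths are hypothetical.

```python
from collections import Counter


def fragment_classes(lengths, nfr_max=100, mono_range=(180, 247)):
    """Return the fraction of fragments falling in each size class.

    lengths: iterable of template lengths (may be negative, as in SAM TLEN).
    nfr_max and mono_range are assumed cutoffs, not fixed standards.
    """
    counts = Counter()
    for n in lengths:
        n = abs(n)  # TLEN is negative for the downstream mate
        if n == 0:
            continue  # 0 means the template length is unavailable
        if n < nfr_max:
            counts["nfr"] += 1
        elif mono_range[0] <= n <= mono_range[1]:
            counts["mono"] += 1
        else:
            counts["other"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: counts[k] / total for k in ("nfr", "mono", "other")}


# Illustrative lengths only, not real data.
example_lengths = [50, 60, 75, 90, 200, 210, -195, 400, 0, 150]
print(fragment_classes(example_lengths))
```

A distribution with no clear enrichment of short fragments and no mononucleosome bump would be one more sign that the library lacks the expected ATAC-seq structure, although this summary alone is not conclusive.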
I performed peak calling and got only ~300 to 2000 peaks; they appear to be noise when I cross-check some of them against the bigwig tracks in a genome browser. DiffBind analysis gave no significantly different peaks between the two sample groups.
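One way to quantify "looks like noise" is the fraction of reads in peaks (FRiP). The sketch below is only an illustration of the idea under stated assumptions: reads are reduced to a single position each, peaks are half-open `[start, end)` intervals that do not overlap one another, and the function name `frip` and the example coordinates are hypothetical. On real data a tool operating on the BAM and peak files directly would be more practical.

```python
from bisect import bisect_right


def frip(read_positions, peaks):
    """Fraction of reads whose position falls inside any peak.

    read_positions: list of (chrom, pos) tuples.
    peaks: list of (chrom, start, end) tuples, half-open and assumed
           to be non-overlapping within each chromosome.
    """
    by_chrom = {}
    for chrom, start, end in peaks:
        by_chrom.setdefault(chrom, []).append((start, end))
    starts = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        starts[chrom] = [s for s, _ in intervals]

    in_peaks = 0
    for chrom, pos in read_positions:
        intervals = by_chrom.get(chrom)
        if not intervals:
            continue
        # Index of the last peak starting at or before pos.
        i = bisect_right(starts[chrom], pos) - 1
        if i >= 0 and intervals[i][0] <= pos < intervals[i][1]:
            in_peaks += 1
    return in_peaks / len(read_positions) if read_positions else 0.0


# Illustrative coordinates only, not real data.
example_peaks = [("chr12", 100, 200), ("chr12", 500, 600)]
example_reads = [("chr12", 150), ("chr12", 199), ("chr12", 200),
                 ("chr12", 550), ("chr1", 150)]
print(frip(example_reads, example_peaks))  # 0.6
```

A very low value would be consistent with the impression from the browser tracks that the called peaks are mostly background.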
Has anyone come across such a problem? What could have gone wrong, either in the experiment or in the data analysis?