Hello everyone,

I am performing ChIP-seq analysis for two histone modifications, H3K27me3 and H3K27ac, using Input as the control for peak calling and the subsequent steps. Only 40-50% of the reads in each ChIP sample align to the genome, whereas the alignment percentage in the Input is in the nineties.

I am running bowtie2 with default parameters. The read length is 150 bp. There is no adapter contamination, which I confirmed with FastQC. FastQC does show some k-mers with an increased frequency in the first 9 bases of the reads, but altogether such reads are less than 1% of the dataset.

Has anyone else observed something similar, or does anyone know the possible reasons for this? I suspect something introduced during the experimental steps, especially since the problem is common to both antibodies. The preclearing step usually includes salmon sperm DNA to remove nonspecific DNA-binding proteins, and I wonder whether that DNA could carry over into library preparation. Preclearing is usually not performed for the Input.

Apart from this, can anyone think of another reason? Any help is greatly appreciated.

Ashwin
I had a similar problem and it turned out to be human contamination (I study Drosophila). I would recommend running FastQ Screen, which does a quick, basic alignment of your reads against a panel of different genomes. If that doesn't account for the ~50% unaligned reads, then BLAST some of the unaligned reads.
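If it helps, here is a minimal Python sketch for pulling a random handful of unaligned reads into a FASTA file that can be pasted into BLAST. It assumes you already have the unaligned reads as an uncompressed, single-end FASTQ file with four lines per record (for example, written by bowtie2's `--un` option). The function names and the file names in the usage note are hypothetical, made up for illustration.

```python
import random


def read_fastq(handle):
    """Yield (read_name, sequence) tuples from a 4-line-per-record FASTQ handle."""
    while True:
        header = handle.readline()
        if not header:          # end of file
            return
        if not header.strip():  # tolerate stray blank lines between records
            continue
        seq = handle.readline().strip()
        handle.readline()       # '+' separator line, ignored
        handle.readline()       # quality string, not needed for BLAST
        # Drop the leading '@' and anything after the first whitespace.
        yield header.strip()[1:].split()[0], seq


def sample_reads(records, n, seed=0):
    """Reservoir-sample up to n records in one pass, without loading the whole file."""
    rng = random.Random(seed)
    sample = []
    for i, rec in enumerate(records):
        if i < n:
            sample.append(rec)
        else:
            j = rng.randint(0, i)  # inclusive on both ends
            if j < n:
                sample[j] = rec
    return sample


def to_fasta(records):
    """Format (name, sequence) tuples as FASTA text."""
    return "".join(f">{name}\n{seq}\n" for name, seq in records)


def write_blast_sample(fastq_path, fasta_path, n=20, seed=0):
    """Sample n reads from fastq_path and write them to fasta_path as FASTA."""
    with open(fastq_path) as fq:
        picked = sample_reads(read_fastq(fq), n, seed)
    with open(fasta_path, "w") as fa:
        fa.write(to_fasta(picked))
```

For example, `write_blast_sample("chip_unaligned.fastq", "blast_sample.fasta", n=20)` would write 20 randomly chosen reads; both file names are placeholders. A couple of dozen reads is usually enough to see whether most hits come from one unexpected organism.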