Hello,
I have recently started reanalyzing a transcription factor ChIP-seq experiment done in Xenopus laevis. I started with the raw data, which includes two replicates of the transcription factor ChIP and a DNA input control, all as fastq files. After aligning the reads and running Homer's makeTagDirectory and findPeaks, I am unable to get the same number of peaks as the author of the original paper the data came from (I get ~100; he got ~12,000). If I run peak calling without the input control, I get ~12,000 peaks. Looking more closely, it seems that most of the peaks get thrown out during the input-based filtering. I've also noticed that during alignment with Bowtie2, ~50% of the reads align more than once. I think this is due to the tetraploid nature of the X. laevis genome, and I'm worried that this means real reads are getting tossed.
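To put a number on the multi-mapping concern, here is a rough sketch of how I could tally reads from an uncompressed Bowtie2 SAM file. It assumes the common convention that Bowtie2 writes an `XS:i` tag only when it found a second alignment for the read, and it uses an arbitrary MAPQ cutoff of 10 to separate multi-mapped reads that still have a clearly better hit from truly ambiguous ones. The function names, the labels, and the cutoff are mine, not part of Bowtie2 or Homer.

```python
from collections import Counter

# SAM flag bits (from the SAM specification)
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def classify_read(sam_line, min_mapq=10):
    """Label one SAM record; return None for headers and non-primary records.

    Assumption: an XS:i tag means Bowtie2 found at least one other alignment.
    min_mapq is a hypothetical cutoff, not a recommended value.
    """
    if not sam_line.strip() or sam_line.startswith("@"):
        return None  # blank line or header
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & (SECONDARY | SUPPLEMENTARY):
        return None  # count each read once, via its primary record
    if flag & UNMAPPED:
        return "unmapped"
    mapq = int(fields[4])
    # Optional tags start after the 11 mandatory SAM columns.
    has_xs = any(tag.startswith("XS:i:") for tag in fields[11:])
    if not has_xs:
        return "unique"
    return "multi_confident" if mapq >= min_mapq else "multi_ambiguous"


def summarize(sam_lines, min_mapq=10):
    """Count reads per label over an iterable of SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        label = classify_read(line, min_mapq)
        if label is not None:
            counts[label] += 1
    return counts
```

Calling `summarize(open("rep1.sam"))` on each replicate and on the input (the file name is just a placeholder) would show whether the ~50% of multi-mapped reads are mostly low-MAPQ, or whether many of them still have one clearly better placement.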
Any ideas?