When to merge ChIP-Seq data?
17 days ago
Thomas • 0


I am a bench scientist new to bioinformatics analysis. I generated a large ChIP-seq dataset for a cytokine-inducible transcription factor in cells, and I could use some advice on how to proceed with the analysis. The dataset contains two untreated samples and two treated samples, each with its own input DNA sample. The eight files total about 570 GB uncompressed.

So far, I've gotten familiar with the Unix environment, mapped the reads with STAR, called peaks with both macs3 and HOMER, and run motif analysis. After visualizing the peaks in IGV, I can see that the peaks in the treated samples make sense, so it's a good dataset.

Now that I'm slightly more comfortable with these tools, I'd like to provide my PI with a more polished report on the data. This brings me to my question: when should I merge the pairs of biological replicates? I've seen a few different opinions on the best point to do this:

  • merge the BAM files after mapping to the genome?
  • merge the BED files after peak calling?
  • merge at a later point?
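
As an illustration of the second option, merging after peak calling might amount to keeping only the peaks that appear in both replicates. The following is a minimal Python sketch under that assumption, not how HOMER or macs3 do it: the function name `overlapping_peaks` and the toy coordinates are hypothetical, and peaks are assumed to be BED-style half-open intervals.

```python
from collections import defaultdict


def overlapping_peaks(rep1, rep2):
    """Return peaks from rep1 that overlap at least one peak in rep2.

    Peaks are (chrom, start, end) tuples in BED-style half-open coordinates.
    """
    # Group the second replicate's peaks by chromosome for lookup.
    by_chrom = defaultdict(list)
    for chrom, start, end in rep2:
        by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end in rep1:
        # Half-open intervals overlap when each starts before the other ends.
        if any(start < e2 and s2 < end for s2, e2 in by_chrom[chrom]):
            kept.append((chrom, start, end))
    return kept


# Toy coordinates, made up for illustration only.
treated_rep1 = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 150)]
treated_rep2 = [("chr1", 150, 250), ("chr2", 400, 500)]
consensus = overlapping_peaks(treated_rep1, treated_rep2)
```

In this toy case only the first chr1 peak is kept, since it is the only one with an overlapping partner in the other replicate. A real analysis would more likely rely on an established interval tool than on a quadratic scan like this one.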

Ultimately, I'd like to have two merged datasets that I can use to run motif analysis and show differentially enriched genes. If you have further advice or resources to recommend, I'm happy to hear it.

Thank you!

HOMER ChIP-seq macs3