Question

My ChIP-seq mapped reads are not evenly distributed between the R1 and R2 FASTQ files

4 hours ago

Pallondyle • 0

I'm currently working on a ChIP-seq analysis. Following a standard ChIP-seq pipeline, I generated a nice-looking bigWig peak track. However, when I check the sequences corresponding to these peaks, I find that the reads are present only in the R1.fastq file and not in R2.fastq. As far as I understand, sequencing adapters are ligated randomly to both ends of the ChIP fragments, so the sequences should in theory be represented in both the R1 and R2 FASTQ files. What could explain what I observed?

ChIP-seq • 112 views

updated 59 minutes ago by Ian • written 4 hours ago by Pallondyle

My grep results, counting reads that contain the forward sequence and the reverse sequence:

sgRNA1-chip_R1.fq.gz forward:3 reverse:295263 total:295266
sgRNA1-chip_R2.fq.gz forward:253908 reverse:3 total:253911
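
For anyone who wants to reproduce this kind of count without grep, here is a minimal sketch in Python. It assumes that "reverse" means the reverse complement of the query, that each FASTQ record spans exactly four lines, and that total is simply forward plus reverse, as in the numbers above. The function names and the query sequence are hypothetical; the actual sequence searched for is not given in this thread.

```python
import gzip

# Complement table for DNA bases, including N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_orientations(fastq_path, query):
    """Count reads that contain `query` or its reverse complement.

    Assumes a 4-line-per-record FASTQ file, optionally gzip-compressed.
    """
    query = query.upper()
    rc = reverse_complement(query)
    forward = reverse = 0
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 != 1:  # the sequence is the 2nd line of each record
                continue
            seq = line.strip().upper()
            if query in seq:
                forward += 1
            if rc in seq:
                reverse += 1
    return {"forward": forward, "reverse": reverse, "total": forward + reverse}
```

Calling `count_orientations("sgRNA1-chip_R1.fq.gz", query)` and the same for the R2 file gives one line of counts per file. A palindromic query would be counted in both columns, so this is only meaningful for sequences that differ from their reverse complement.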

4 hours ago by Pallondyle

I'm performing ChIP-seq for a variant of SpCas9.

4 hours ago by Pallondyle

Note that a DNA fragment from ChIP is not the same as the sequencing result. If you have, say, a 500 bp DNA fragment and you sequence it 2x150 bp, you get 150 bp from each end and miss the 200 bp in the middle that the reads do not cover. Depending on how wide the peak is and where the fragment is positioned relative to it, it could be that only R1 picks it up.

|-----------------------------------------------------------------------------------------| PEAK
                                                                                |---------R1---------|///////uncovered//////|---------R2---------|
 |---------R1---------|///////uncovered//////|---------R2---------|
                                                   |---------R1---------|///////uncovered//////|---------R2---------|

In this toy example, only the second fragment has the peak in both R1 and R2; the others contribute to it only partially.
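
The same toy example can be put into numbers with a small Python sketch. The peak coordinates and fragment start positions are made up for illustration, the function names are hypothetical, and R1 is assumed to sit at the left end of each fragment as in the drawing (in real data it can be at either end, depending on the strand).

```python
def mate_intervals(fragment_start, fragment_len=500, read_len=150):
    """Return half-open (start, end) intervals covered by R1 and R2.

    R1 covers the first `read_len` bases of the fragment and R2 the last
    `read_len` bases; the middle stays unsequenced.
    """
    fragment_end = fragment_start + fragment_len
    r1 = (fragment_start, fragment_start + read_len)
    r2 = (fragment_end - read_len, fragment_end)
    return r1, r2


def overlaps(a, b):
    """True if two half-open intervals share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def mates_in_peak(fragment_start, peak, fragment_len=500, read_len=150):
    """Report which mates of a fragment overlap the peak interval."""
    r1, r2 = mate_intervals(fragment_start, fragment_len, read_len)
    return {"R1": overlaps(r1, peak), "R2": overlaps(r2, peak)}


# Hypothetical peak and three fragment starts, mirroring the drawing.
peak = (1000, 2000)
for start in (1900, 1100, 1700):
    print(start, mates_in_peak(start, peak))
```

With these made-up coordinates, only the fragment starting at 1100 has both mates inside the peak; the other two reach it with R1 only.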

Does this make sense?

1 hour ago by ATpoint

score 0 · Answer 1 · 2025-11-13

Points that occur to me:

I use properly paired reads, whereby both R1 and R2 are present in the correct orientation on the same chromosome.
Peak calling with MACS2/3 in paired-end mode (-f BAMPE) converts paired-end reads into fragments before finding peaks and calculating fold enrichment, so considering R1 or R2 separately is redundant.
The above point makes me wonder which caller you are using and with which settings.
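
To check the first point on an existing alignment, one option is to tally the SAM FLAG field (column 2 of each alignment record). The sketch below uses the flag bits defined in the SAM specification; the function names are hypothetical, and the flag values would come from your own BAM/SAM file, for example via samtools view. It does not treat secondary or supplementary alignments separately.

```python
from collections import Counter

# Flag bits from the SAM specification.
PROPER_PAIR = 0x2      # each segment properly aligned according to the aligner
UNMAPPED = 0x4         # segment unmapped
FIRST_IN_PAIR = 0x40   # first segment in the template (R1)
SECOND_IN_PAIR = 0x80  # last segment in the template (R2)


def mate_of(flag):
    """Return 'R1', 'R2', or 'unpaired' for a SAM flag value."""
    if flag & FIRST_IN_PAIR:
        return "R1"
    if flag & SECOND_IN_PAIR:
        return "R2"
    return "unpaired"


def tally_mates(flags):
    """Count mapped reads per mate, split by proper-pair status."""
    counts = Counter()
    for flag in flags:
        if flag & UNMAPPED:
            counts["unmapped"] += 1
            continue
        suffix = "_proper" if flag & PROPER_PAIR else "_not_proper"
        counts[mate_of(flag) + suffix] += 1
    return counts


# Illustrative flag values only, not taken from the data in this thread.
print(tally_mates([99, 147, 83, 163, 77, 65]))
```

If R1 and R2 show similar proper-pair counts, both mates are mapping and the imbalance seen with grep is more likely about read orientation than about missing reads.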