My ChIP-seq mapped reads are not evenly distributed between the R1 and R2 FASTQ files
4 hours ago
Pallondyle • 0

I'm currently working on ChIP-seq analysis. Following the standard ChIP-seq analysis pipeline, I have generated a nice-looking bigWig peak track. However, when I check the sequences corresponding to these peaks, I find that the reads are only present in the R1.fastq file but not in R2.fastq. As far as I understand, sequencing adapters are randomly ligated to both ends of the ChIP fragments, so the sequences should theoretically be represented in both R1 and R2 fastq files. What could be the possible reason for the phenomenon I observed?

ChIP-seq • 112 views

My grep results for the forward and reverse sequences in each file:

sgRNA1-chip_R1.fq.gz forward:3 reverse:295263 total:295266
sgRNA1-chip_R2.fq.gz forward:253908 reverse:3 total:253911
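
A count like the one above could be reproduced with a sketch along these lines. It assumes that "reverse" means the reverse complement of the query sequence and that each read is counted once per orientation; the helper names (`reverse_complement`, `count_orientations`, `fastq_sequences`) are hypothetical and not taken from the original grep command.

```python
import gzip

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_orientations(read_seqs, query):
    """Count reads containing the query as given (forward) and as its
    reverse complement (reverse). Assumption: a read is counted at most
    once per orientation, like `grep -c` on sequence lines."""
    query = query.upper()
    rc_query = reverse_complement(query)
    forward = reverse = 0
    for seq in read_seqs:
        seq = seq.upper()
        if query in seq:
            forward += 1
        if rc_query in seq:
            reverse += 1
    return forward, reverse


def fastq_sequences(path):
    """Yield the sequence line of every record in a plain or gzipped
    FASTQ file (assumes the usual four lines per record)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()
```

For example, `count_orientations(fastq_sequences("sgRNA1-chip_R1.fq.gz"), query)` would return a `(forward, reverse)` pair for R1, where `query` is whichever peak sequence was searched for.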

I'm performing ChIP-seq for a variant of SpCas9.


Note that a DNA fragment from ChIP is not the same as the sequencing result. If you have, say, a 500 bp DNA fragment and sequence it 2x150 bp, you get 150 bp from each end and miss the 200 bp in the middle that the reads do not cover. Depending on how wide the peak is and where the reads are positioned, it could be that only R1 picks it up.

|-----------------------------------------------------------------------------------------| PEAK
                                                                                |---------R1---------|///////uncovered//////|---------R2---------|
 |---------R1---------|///////uncovered//////|---------R2---------|
                                                   |---------R1---------|///////uncovered//////|---------R2---------|

In this toy example, only the second fragment has both R1 and R2 inside the peak; the others contribute to it only partially.
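
The arithmetic behind the toy example can be sketched as follows. This is only an illustration under simplifying assumptions: coordinates are half-open, R1 is placed at the left end of the fragment (in real data it can come from either end), and the function names are hypothetical.

```python
def read_intervals(frag_start, frag_len, read_len):
    """Return the (start, end) intervals covered by R1 and R2 for a
    fragment sequenced from both ends. Assumption: R1 is read from the
    left end and R2 from the right end."""
    frag_end = frag_start + frag_len
    r1 = (frag_start, min(frag_start + read_len, frag_end))
    r2 = (max(frag_end - read_len, frag_start), frag_end)
    return r1, r2


def uncovered_length(frag_len, read_len):
    """Number of bases in the middle of the fragment that neither read covers."""
    return max(frag_len - 2 * read_len, 0)


def overlaps(interval, region):
    """True if two half-open intervals share at least one base."""
    return interval[0] < region[1] and region[0] < interval[1]


# A 500 bp fragment sequenced 2x150 bp, with a short site near its left end.
r1, r2 = read_intervals(0, 500, 150)
site = (100, 120)
print(uncovered_length(500, 150))           # bases covered by neither read
print(overlaps(r1, site), overlaps(r2, site))
```

With these numbers, the site falls inside R1 but not R2, which is the situation drawn for the third fragment in the diagram.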

Does this make sense?

59 minutes ago
Ian 6.1k

Points that occur to me:

  • I use properly paired reads, whereby both R1 and R2 are present in the correct orientation on the same chromosome.
  • Peak calling with MACS2/3 in paired-end mode (-f BAMPE) converts paired-end reads into fragments before finding peaks and calculating fold enrichment, so considering R1 or R2 separately is redundant.
  • The above point makes me wonder which caller you are using and with which settings.
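
One way to check the first point on the aligned data is to decode the FLAG column of the SAM/BAM records and tally properly paired reads by mate and strand. The sketch below uses the standard SAM flag bits; the function names are hypothetical, and the flags are assumed to have been extracted already (for instance from the second column of `samtools view` output).

```python
from collections import Counter

# Standard SAM FLAG bits.
PROPER_PAIR = 0x2   # each segment properly aligned according to the aligner
UNMAPPED = 0x4      # segment unmapped
REVERSE = 0x10      # read mapped to the reverse strand
READ1 = 0x40        # first segment in the template (R1)
READ2 = 0x80        # last segment in the template (R2)


def describe_flag(flag):
    """Decode the bits of a SAM FLAG that matter for this question."""
    if flag & READ1:
        mate = "R1"
    elif flag & READ2:
        mate = "R2"
    else:
        mate = "unpaired"
    return {
        "mate": mate,
        "proper_pair": bool(flag & PROPER_PAIR),
        "mapped": not flag & UNMAPPED,
        "strand": "-" if flag & REVERSE else "+",
    }


def tally_proper_pairs(flags):
    """Count mapped, properly paired reads by (mate, strand)."""
    counts = Counter()
    for flag in flags:
        info = describe_flag(flag)
        if info["mapped"] and info["proper_pair"]:
            counts[(info["mate"], info["strand"])] += 1
    return counts


# Flags 99/147 and 83/163 are the two usual proper-pair combinations;
# 77 is an unmapped R1 and is skipped.
print(tally_proper_pairs([99, 147, 83, 163, 77]))
```

In a properly paired fragment, R1 and R2 map to opposite strands, so a sequence seen in forward orientation in one file is expected to appear as its reverse complement in the other.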