Question: What Could Be The Reason For Spliced Alignments In Chip-Seq Data?
6.7 years ago by
Mikael Huss4.7k wrote:

I am looking at a ChIP-seq data set where, for one of the suspected target genes, we see a coverage profile that looks suspiciously like RNA-seq data: the reads line up very regularly along the exons, rather than forming the peaky profile one would expect in ChIP-seq. On further inspection with TopHat, we also find a handful of spliced alignments joining the same two exons in the gene. (Initially we had used a different aligner; TopHat was used only to check the potential artifact I am describing.)

Now, I have heard of genomic DNA contamination in RNA-seq libraries, but I have a harder time figuring out how one can get RNA (or rather cDNA, I suppose) contamination in a ChIP-seq library. Any ideas where this might come from?
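For anyone wanting to check their own data for the same pattern, spliced alignments can be recognised by an `N` (skipped region) operation in the SAM CIGAR string. The following is only a minimal sketch, assuming plain-text SAM lines as input (for example, the output of `samtools view`); the function names `junctions` and `count_junctions` are made up for illustration and are not part of any tool mentioned in this thread.

```python
import re
from collections import Counter

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def junctions(pos, cigar):
    """Return (intron_start, intron_end) pairs, 1-based inclusive,
    one for each N operation in the CIGAR string."""
    ref = pos  # 1-based leftmost reference position (SAM POS field)
    found = []
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "N":
            found.append((ref, ref + length - 1))
        if op in "MDN=X":  # operations that consume reference bases
            ref += length
    return found

def count_junctions(sam_lines):
    """Count how many alignments support each (chrom, start, end) junction."""
    counts = Counter()
    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, cigar = fields[2], int(fields[3]), fields[5]
        if cigar == "*":  # unaligned read, no CIGAR available
            continue
        for start, end in junctions(pos, cigar):
            counts[(chrom, start, end)] += 1
    return counts
```

A junction that is supported by several reads and coincides with annotated exon boundaries is the kind of signal described above; a single supporting read could easily be an alignment artifact.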

splicing chip-seq • 2.2k views
ADD COMMENTlink modified 6.7 years ago by black_hoodies0 • written 6.7 years ago by Mikael Huss4.7k

I have had the same problem, but predominantly in the input rather than in the ChIP-seq data. I have been told that the Taq polymerase used for deep sequencing library preparation may be able to synthesize a small amount of DNA from an RNA template, and that RNase treatment of the ChIP input DNA is needed. We haven't tested whether this is the case yet.

ADD REPLYlink written 5.7 years ago by diane.krause10

Interesting, thanks for the comment!

ADD REPLYlink written 5.7 years ago by Mikael Huss4.7k

Do you have control channel data? What do these regions look like in those experiments? There are a fair number of edge cases where repetitive sequences might generate such patterns, or nonspecific binding over an interval could occur.

The splice junctions are more interesting / worrying, but maybe you'd start thinking about viral integration events or other transposon-like events. It's not clear what would cause the ChIP enrichment though, at least to me.

ADD REPLYlink written 6.7 years ago by matted7.2k

There are IgG controls where I haven't looked at these regions yet. Thanks for the suggestion. Yes, I was considering viral integration events, but I am not sure what conclusions to draw from that.

ADD REPLYlink written 6.7 years ago by Mikael Huss4.7k

Did you ever manage to figure out a solution to this? I see very similar behaviour in the Arabidopsis ChIP-Seq data that I am currently looking at. The genes that show it are transcription factors with known important functions in the tissue we are looking at.
I see this in the sample and the anti-HA control, but not in the Input. The rows in the image are sample, Input, anti-HA.

I'm also noticing that these reads don't seem to have the SNPs that are present in the Input.

ADD REPLYlink written 5.7 years ago by simon.pearce20

Not really - we have just assumed that we are dealing with some sort of artifact and disregarded this particular locus. Meanwhile, I have come across this paper, which might be relevant: Highly expressed loci are vulnerable to misleading ChIP localization of multiple unrelated proteins. I don't think that would explain your "missing SNPs" though. That is an interesting observation, and one I did not notice in my data (it may or may not be there).

ADD REPLYlink written 5.7 years ago by Mikael Huss4.7k
6.7 years ago by
black_hoodies0 wrote:

I don't know what you mean regarding the spliced alignments joining the same two exons in the gene. However, have you considered that the "regular" alignments are in fact PCR duplicates?

ADD COMMENTlink written 6.7 years ago by black_hoodies0

I don't think PCR duplication is the problem, as the picture is close to identical after deduplication. That also wouldn't explain the split-read alignments (which, by the way, are also not PCR duplicates: they have distinct starting positions, although the spliced-out part, i.e. the intron, is the same in each case).
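The check described here (same intron, different read start positions) can be sketched roughly as follows. This is an illustrative example only: it assumes the alignments have already been reduced to `(chrom, pos, cigar)` tuples, it ignores strand and mate information that real deduplication tools take into account, and the name `starts_per_intron` is hypothetical.

```python
import re
from collections import defaultdict

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def starts_per_intron(records):
    """Map each spliced-out interval (chrom, start, end) to the set of
    distinct alignment start positions of the reads that span it.

    records: iterable of (chrom, pos, cigar) tuples, pos being 1-based.
    """
    starts = defaultdict(set)
    for chrom, pos, cigar in records:
        ref = pos
        for length, op in CIGAR_OP.findall(cigar):
            length = int(length)
            if op == "N":  # skipped region, i.e. the putative intron
                starts[(chrom, ref, ref + length - 1)].add(pos)
            if op in "MDN=X":  # operations that consume reference bases
                ref += length
    return dict(starts)
```

If an intron is supported by several distinct start positions, those reads are unlikely to be copies of a single amplified fragment; if all supporting reads share one start position, PCR duplication remains a plausible explanation.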

ADD REPLYlink written 6.7 years ago by Mikael Huss4.7k