Question

ChIP-seq in cancer cell lines

9.4 years ago

Ming Tommy Tang ★ 3.9k

I had not realized this could be a problem. If I do ChIP-seq on a cancer cell line that harbors numerous translocations and chromosome duplications, how should I map the reads? Mapping to a non-cancer human reference genome will put the peaks at the wrong coordinates. Should I create a reference genome for that cancer cell line?

Or I could still map to the reference genome, use whole-genome DNA sequencing to identify the translocations and duplications, and then move the peaks to the "right" places according to that translocation/duplication information.

Any thoughts on that?

Thank you.

ChIP-Seq • 2.5k views

Updated 8.3 years ago by khaynes ▴ 50 • written 9.4 years ago by Ming Tommy Tang ★ 3.9k

score 1 · Answer 1 · 2014-12-03

9.4 years ago

Devon Ryan 104k

What's the end goal? If you just want to call differential peaks (e.g., between treated and control samples) then the exact coordinates don't matter. If, however, you're interested in where the peaks are relative to some features (genes, CpG islands, etc.), then you only need to care about structural changes whose end points are extremely close to one of those features (i.e., close enough to matter in the analysis, since the coordinate change could alter the results). In that case, I'd map to the regular reference and just post-process those few sites (the fewer the sites, the easier it is to ensure you've handled them correctly!).

In general, it's probably simplest to not create a new reference unless absolutely needed.
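As a rough illustration of the "post-process only the few sites that matter" idea, here is a minimal Python sketch that flags features (genes, CpG islands, etc.) lying within some window of a structural-variant breakpoint. The function name, the tuple layouts and the 100 kb default are all assumptions made for this example; in practice the breakpoints would come from whatever SV caller was run on the whole-genome data.

```python
from bisect import bisect_left
from collections import defaultdict


def features_near_breakpoints(features, breakpoints, window=100_000):
    """Return names of features lying within `window` bp of any breakpoint.

    features:    iterable of (chrom, start, end, name)   (hypothetical layout)
    breakpoints: iterable of (chrom, pos)                 (hypothetical layout)
    """
    # Group breakpoint positions by chromosome and sort them for bisection.
    by_chrom = defaultdict(list)
    for chrom, pos in breakpoints:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    flagged = []
    for chrom, start, end, name in features:
        positions = by_chrom.get(chrom, [])
        # Find the first breakpoint at or after (start - window). If it is
        # also at or before (end + window), the feature is close enough
        # that the rearrangement could change the analysis.
        i = bisect_left(positions, start - window)
        if i < len(positions) and positions[i] <= end + window:
            flagged.append(name)
    return flagged
```

Everything not flagged can be analysed on the ordinary reference coordinates; only the flagged features need manual or scripted correction.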

Thank you for your comment, Ryan. I want to identify TF binding sites that are mis-targeted to genes (within, say, 100 kb) because of chromosomal rearrangements. Do you know of any workflow for this purpose?

Reply by Ming Tommy Tang ★ 3.9k • 9.4 years ago

Not off-hand, though I'd be surprised if no one has coded anything (though whether they actually posted and documented it...).

Reply by Devon Ryan 104k • 9.4 years ago
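In the absence of a known published workflow, the question above could be approached along these lines. This is only a sketch under strong assumptions, not an established tool: peaks are mapped to the normal reference, each translocation junction is described by two ends, and each end records which side of the breakpoint is retained in the fused chromosome. The function names, tuple layouts and the example coordinates in the usage note are all made up for illustration, and the sketch assumes an interchromosomal junction.

```python
def retained_distance(pos, bp_pos, side):
    """Distance from `pos` to the breakpoint, or None if `pos` is on the
    side that is not part of this junction.

    side == "left"  means the segment with coordinates <= bp_pos is joined;
    side == "right" means the segment with coordinates >= bp_pos is joined.
    """
    if side == "left":
        return bp_pos - pos if pos <= bp_pos else None
    return pos - bp_pos if pos >= bp_pos else None


def juxtaposed_pairs(peaks, genes, junction, max_dist=100_000):
    """Find (peak, gene, distance) triples brought within `max_dist` bp
    of each other across one junction.

    peaks:    list of (chrom, summit, name)   (hypothetical layout)
    genes:    list of (chrom, tss, name)      (hypothetical layout)
    junction: ((chromA, posA, sideA), (chromB, posB, sideB))
    """
    end_a, end_b = junction
    pairs = []
    # Try both arrangements: peak on the A side with gene on the B side,
    # then peak on the B side with gene on the A side.
    for peak_end, gene_end in ((end_a, end_b), (end_b, end_a)):
        for p_chrom, summit, p_name in peaks:
            if p_chrom != peak_end[0]:
                continue
            d_peak = retained_distance(summit, peak_end[1], peak_end[2])
            if d_peak is None:
                continue
            for g_chrom, tss, g_name in genes:
                if g_chrom != gene_end[0]:
                    continue
                d_gene = retained_distance(tss, gene_end[1], gene_end[2])
                if d_gene is None:
                    continue
                # Distance in the rearranged genome is the sum of the two
                # distances to the shared junction.
                total = d_peak + d_gene
                if total <= max_dist:
                    pairs.append((p_name, g_name, total))
    return pairs
```

For example, with a junction `(("chr1", 128_750_000, "right"), ("chr2", 106_000_000, "left"))`, a peak at chr2:105,970,000 and a gene TSS at chr1:128,790,000 end up 70 kb apart in the rearranged genome, even though they sit on different chromosomes in the reference.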

score 0 · Answer 2 · 2016-01-21

Unfortunately, the lack of cancer-specific reference genomes is a major problem, and solving it will take a coordinated effort and a lot of resources. Sequencing is getting faster, but from what I understand the assembly stage is still slow, and it is harder still when the genome is heavily rearranged.

The best you can do for now is align your ChIP-seq reads to something like hg19 and hope for the best. I have seen evidence of reads failing to align in my own ChIP-seq of U2OS cells: viewing the WIG data from an input sample as a track in the Integrative Genomics Viewer (IGV), I can see gaps and valleys, whereas in theory the input signal should be roughly even.
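The visual check described above can be roughly automated. The sketch below assumes the input track has already been summarised as one read count per fixed-size bin (for instance, by parsing the WIG file), and flags bins whose count falls well below the median. The function name and the 0.25 cutoff are arbitrary choices for illustration, not a recommended threshold.

```python
from statistics import median


def flag_low_coverage(bin_counts, min_fraction=0.25):
    """Return indices of bins whose count is below `min_fraction` times
    the median count across all bins.

    bin_counts: list of per-bin read counts from an input (control) sample.
    """
    if not bin_counts:
        return []
    threshold = min_fraction * median(bin_counts)
    # A bin is a candidate "gap" or "valley" if it is far below the
    # typical input coverage.
    return [i for i, count in enumerate(bin_counts) if count < threshold]
```

Flagged bins are only candidates: low input coverage can also come from regions that are hard to map in any sample or from regions deleted in the cell line, so each one still needs a look in the browser.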