Question

Motif analysis in repeat-rich ChIP-seq data?

4.9 years ago

robbuurstede • 0

Hi all,

I’m analyzing ChIP-seq data and currently running a de novo motif analysis with MEME. The output is dominated by simple-repeat motifs (e.g. GAGAGAGAGAGA), and the motif of the TF I ChIPped only shows up as the 10th motif. The data is very repeat-rich, but repeat masking also removes my TF motif, because it lies within these repeats. I’m afraid this will prevent me from identifying any co-occurring motifs of interest, so I was wondering whether there is a better approach.

Is there a way to tell MEME not to recognize these simple repeats as motifs? I’ve not found out how to do this just yet.

Can you recommend another tool that is more suitable for repeat-rich sequences?

Thank you very much!

Rob

ChIP-Seq motif meme • 1.2k views

updated 4.9 years ago by simon.vanheeringen ▴ 270 • written 4.9 years ago by robbuurstede • 0

Can't this be biologically meaningful? Which TF is this?

Reply by ATpoint 81k • 4.9 years ago

It sure could be biologically relevant, but I expected the number one motif to be that of the Glucocorticoid Receptor (the chipped TF).

Reply by robbuurstede • 0 • 4.9 years ago

score 0 · Answer 1 · 2019-05-15

You probably have the peak coordinates and extracted the sequences with something like bedtools getfasta. You can first identify the genomic coordinates of these repeats, e.g. with any of the solutions from "Code golf: detecting homopolymers of length N in the (human) genome" (modified to match the tandem patterns you encountered), and then use these coordinates as a blacklist: filter out any peaks that intersect them, e.g. with bedtools intersect. Then extract the sequences of the remaining peaks and re-run the motif search.
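The two steps above (locate the repeats, then drop the peaks that overlap them) could look roughly like this in plain Python. This is only a toy sketch: the function names, the `GA` repeat unit, the `min_copies=5` threshold and the tiny in-memory "genome" are all assumptions made for illustration, not something taken from the linked thread.

```python
import re

def find_tandem_repeats(seq, unit="GA", min_copies=5):
    """Return half-open (start, end) intervals where `unit` occurs
    at least `min_copies` times in a row (case-insensitive)."""
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_copies),
                         re.IGNORECASE)
    return [(m.start(), m.end()) for m in pattern.finditer(seq)]

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end

def filter_peaks(peaks, repeats):
    """Keep only peaks that do not overlap any blacklisted repeat.

    peaks:   list of (chrom, start, end)
    repeats: dict mapping chrom -> list of (start, end)
    """
    kept = []
    for chrom, start, end in peaks:
        hits = repeats.get(chrom, [])
        if not any(overlaps(start, end, r_start, r_end)
                   for r_start, r_end in hits):
            kept.append((chrom, start, end))
    return kept

# Toy data (hypothetical): a (GA)6 stretch sits at positions 10-22.
genome = {"chr1": "ACGTACGTTT" + "GA" * 6 + "CCATGGTACA"}
repeats = {chrom: find_tandem_repeats(seq) for chrom, seq in genome.items()}
peaks = [("chr1", 0, 8), ("chr1", 5, 15), ("chr1", 24, 32)]
kept = filter_peaks(peaks, repeats)
print(kept)
```

On real data you would write the repeat intervals to a BED file instead and let `bedtools intersect -v` report the peaks without any overlap, which is the same filtering as `filter_peaks` here.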

score 0 · Answer 2 · 2019-05-15

4.9 years ago

Friederike 8.9k

Seems like dust might be a helpful tool for this. You could also browse the excellent MEME suite Q&A page or post your own question there.
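To give an idea of what a dust-style filter measures, here is a heavily simplified low-complexity score based on how often triplets recur in a sequence. It is not the dust implementation itself (which works on windows and applies a masking threshold); the function name and the example sequences are made up for illustration.

```python
from collections import Counter

def dust_like_score(seq, k=3):
    """Simplified low-complexity score.

    Counts every overlapping k-mer, sums c * (c - 1) / 2 over the counts
    and divides by (number of k-mers - 1). Repetitive sequences reuse the
    same few k-mers and therefore score higher.
    """
    seq = seq.upper()
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    if len(kmers) < 2:
        return 0.0
    counts = Counter(kmers)
    return sum(c * (c - 1) / 2 for c in counts.values()) / (len(kmers) - 1)

# Hypothetical example sequences
repeat_score = dust_like_score("GAGAGAGAGAGA")  # only GAG and AGA occur
mixed_score = dust_like_score("ACGTTGCAATCG")   # every triplet is unique
print(repeat_score, mixed_score)
```

A score like this could be used to rank or flag peak sequences before the motif search; any cutoff would have to be chosen empirically for your data.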

score 0 · Answer 3 · 2019-05-21

4.9 years ago

simon.vanheeringen ▴ 270

Try GimmeMotifs. It combines different motif prediction tools (including MEME) and compares the identified motifs to a background set of sequences. It usually works very well for ChIP-seq data (disclaimer: I wrote the software).
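Why a background comparison helps with repeats can be shown with a small sketch. This does not use the GimmeMotifs API and is far cruder than what the tool does; the function names, the pseudocount, the toy sequences and the two patterns (a GA repeat and `AGAACA` as a stand-in TF motif) are all assumptions for illustration only.

```python
import re

def fraction_with_motif(seqs, pattern):
    """Fraction of sequences containing at least one match to `pattern`."""
    if not seqs:
        return 0.0
    regex = re.compile(pattern, re.IGNORECASE)
    return sum(1 for s in seqs if regex.search(s)) / len(seqs)

def enrichment(foreground, background, pattern, pseudo=0.01):
    """Ratio of foreground to background frequency (with a small pseudocount)."""
    fg = fraction_with_motif(foreground, pattern)
    bg = fraction_with_motif(background, pattern)
    return (fg + pseudo) / (bg + pseudo)

# Hypothetical toy sequences: peaks and a repeat-matched background set
peaks = ["TTAGAACATTGAGAGAGAGA", "GAGAGAGAGACC", "CCAGAACAGG"]
background = ["GAGAGAGAGATT", "TTGAGAGAGAGA", "ACGTACGTAC"]

repeat_enrichment = enrichment(peaks, background, "(?:GA){4,}")
tf_enrichment = enrichment(peaks, background, "AGAACA")
print(repeat_enrichment, tf_enrichment)
```

The repeat is just as frequent in the background as in the peaks, so its ratio stays close to 1, while the pattern that only occurs in the peaks stands out. This only works if the background set has a repeat content similar to the peaks.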
