Question: Eliminating repetitive motifs found by MEME
rbronste wrote:

Whenever I run MEME, several of the top hits by E-value are pure repeats (often 25 bp or longer). I would like to exclude these from the search and keep only plausible TF motifs. Is there a way to do this? Thanks.

chip-seq meme motif • 250 views
modified 5 months ago by Alex Reynolds • written 5 months ago by rbronste
Alex Reynolds wrote:

Unless you're looking for de novo motif models, one option is to use TOMTOM to rank MEME hits by similarity to known motifs in published TF databases. Pure-repeat motifs should rank poorly against known TF motifs, which should clean things up considerably.
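
A rough sketch of what that TOMTOM step could look like. The file names (meme_out/meme.txt, known_motifs.meme), the output directory, and the threshold values are placeholders I made up for illustration, so adjust them to your own run. Setting DRY_RUN just prints the command instead of running it.

```shell
# rank_with_tomtom QUERY_MOTIFS TARGET_DB
# Compare discovered motifs against a database of known TF motifs.
# All paths and thresholds here are illustrative assumptions.
rank_with_tomtom() {
    # tomtom takes the query motif file first, then one or more target databases
    set -- tomtom -oc tomtom_out -min-overlap 5 -dist pearson \
        -evalue -thresh 10 "$1" "$2"
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"    # show the command without running it
    else
        "$@"
    fi
}

# Hypothetical inputs: MEME's text output and a motif database in MEME format
DRY_RUN=1 rank_with_tomtom meme_out/meme.txt known_motifs.meme
```

Motifs with no reasonable match in the results would then be the candidates to look at more sceptically.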

Unless you're looking for really long binding sites (dimers, say), you could also set the -maxw parameter in MEME so that you only search for sites shorter than 25 nt. Tuning other parameters may also help.
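
For example, something along these lines would cap the motif width. Only -maxw comes from the suggestion above; the input file name, the output directory, the 6 to 20 nt range, the zoops model, and the number of motifs are assumptions for the sake of the sketch.

```shell
# run_meme_short FASTA [MAXW]
# Run MEME with a capped motif width so long repeat-like motifs are not reported.
# Everything except -maxw is an illustrative assumption, not a recommendation.
run_meme_short() {
    fasta=$1
    maxw=${2:-20}    # default cap, below the ~25 nt repeats described in the question
    set -- meme "$fasta" -dna -oc "meme_maxw${maxw}" -mod zoops \
        -nmotifs 5 -minw 6 -maxw "$maxw" -revcomp
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"    # show the command without running it
    else
        "$@"
    fi
}

# Hypothetical input file; drop DRY_RUN=1 to actually run MEME
DRY_RUN=1 run_meme_short myRegions.fa 20
```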

Another option is to adjust your background by removing sequences that fall in repeat-masked regions.

written 5 months ago by Alex Reynolds

Great suggestions. Do you know of a good source for a mouse (mm10) BED file of repeat-masked regions? UCSC, I guess?

written 5 months ago by rbronste

UCSC would be my first stop. Others might suggest BioMart.

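# Convert the UCSC RepeatMasker table for mm10 to sorted BED6 (chrom, start, end, repName, swScore, strand)
$ wget -qO- http://hgdownload.cse.ucsc.edu/goldenPath/mm10/database/rmsk.txt.gz | gunzip -c | awk -v OFS="\t" '{ print $6,$7,$8,$11,$2,$10; }' | sort-bed - > rmsk.bed
# Merge overlapping repeat annotations into contiguous regions
$ bedops --merge rmsk.bed > rmsk.mergedRegions.bed
# Keep only the parts of your regions (sorted with sort-bed) that do not overlap repeats
$ bedops --difference myRegions.bed rmsk.mergedRegions.bed > myRegions.masked.bed
# Extract FASTA sequence for the remaining intervals
$ bed2faidx.pl --options... < myRegions.masked.bed > myRegions.masked.fa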

Etc.
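
If you want to sanity-check the awk column mapping before downloading the whole table, you can run it on a single record. The record below is made up for illustration; it just follows the 17-column layout of the UCSC rmsk table (bin, swScore, milliDiv, milliDel, milliIns, genoName, genoStart, genoEnd, genoLeft, strand, repName, ...).

```shell
# One made-up rmsk-style record (17 tab-separated columns); values are illustrative only
rmsk_line=$(printf '607\t12955\t105\t9\t10\tchr1\t3000000\t3002128\t-192469843\t-\tL1_Mus3\tLINE\tL1\t-3055\t3592\t1466\t1')

# Same awk mapping as in the pipeline: chrom, start, end, repName, swScore, strand
bed_line=$(printf '%s\n' "$rmsk_line" | awk -v OFS="\t" '{ print $6,$7,$8,$11,$2,$10; }')

printf '%s\n' "$bed_line"
```

The printed line should have six tab-separated fields, with the repeat name in column 4 and the strand in column 6.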

modified 5 months ago • written 5 months ago by Alex Reynolds