Question: How Do I Search A Genome For A Known Motif, And Get An Interval File Of All Instances Of The Motif?
bede.portz (United States) wrote, 7.0 years ago:

A paper recently identified a motif in Drosophila that is poorly conserved. What I would like to do is search the Drosophila genome for all instances of said motif in a way that allows for mismatches at particular positions, and generate an interval file with the start and end coordinates for all instances of the motif in the genome. In addition to knowing the start and end coordinates, I would like to know the DNA sequence associated with these coordinates, as the motif will often vary from the consensus.

To be clear, what I want to do is the opposite of motif discovery. I already know the motif, and would like to know all the locations of said motif, and the motif sequence at each location.

I suspect there are tools to do this, but I have not yet conducted any motif analysis, so I would appreciate any help. I tried the search function, but most threads appear to pertain to motif discovery rather than my particular need.


Tags: motif, chip-seq

Have you looked at the matchPWM() function from the R Biostrings package? It can likely do what you want.
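
To illustrate the idea behind scanning with a position weight matrix, here is a minimal plain-Python sketch. It is not the Biostrings implementation and does not use its API: the function names, the pseudocount, the uniform background and the score threshold are all assumptions made for the example, and it scans the forward strand only.

```python
import math

BASES = "ACGT"


def counts_to_log_odds(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2-odds scores.

    counts is a list with one dict per motif position, e.g. {"A": 8, "T": 2}.
    The pseudocount and the uniform background are illustrative assumptions.
    """
    pwm = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        pwm.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm


def scan_pwm(seq, pwm, min_score):
    """Yield (start, end, matched_sequence, score) for forward-strand windows
    whose summed log-odds score is at least min_score.

    Coordinates are 0-based and half-open, as in BED files.
    """
    seq = seq.upper()
    k = len(pwm)
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        # Skip windows containing N or other non-ACGT characters.
        if any(base not in BASES for base in window):
            continue
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= min_score:
            yield (start, start + k, window, score)
```

Because each hit carries the matched window as well as its coordinates, sites that deviate from the consensus remain visible in the output. To cover the minus strand, the same scan could be run on the reverse complement of the sequence with the coordinates mapped back.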

(comment written 7.0 years ago by Devon Ryan)
vj wrote, 7.0 years ago:

FIMO (part of the MEME Suite) may be what you are looking for. Its command-line documentation lists a number of options, although I do not know whether any of them lets you specify mismatches.

Chris Whelan (Portland, OR) wrote, 7.0 years ago:

You could also use the fuzznuc tool from the EMBOSS suite for this:
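
A rough sketch of this kind of search, written in plain Python rather than with fuzznuc itself, may help show what is involved. It treats the motif as an IUPAC consensus, allows a fixed number of mismatches, checks both strands, and reports BED-style intervals together with the sequence found at each site. All function names are hypothetical, and loading the genome (for example, one chromosome at a time from a FASTA file) is left out.

```python
# Illustrative only: a simple sliding-window scan, not how fuzznuc is implemented.

IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")


def reverse_complement(seq):
    """Reverse-complement a DNA string, including IUPAC ambiguity codes."""
    return seq.translate(COMPLEMENT)[::-1]


def count_mismatches(pattern, window):
    """Count positions where the genomic base is not allowed by the pattern code.

    An N (or any non-ACGT character) in the genome always counts as a mismatch.
    """
    return sum(base not in IUPAC[code] for code, base in zip(pattern, window))


def find_motif(chrom, seq, pattern, max_mismatches=0):
    """Yield (chrom, start, end, site, mismatches, strand) for every hit.

    Coordinates are 0-based, half-open (BED convention). For minus-strand hits,
    site is reported as read on the minus strand, so it lines up with the motif.
    """
    seq = seq.upper()
    pattern = pattern.upper()
    k = len(pattern)
    rc_pattern = reverse_complement(pattern)
    strands = [("+", pattern)]
    if rc_pattern != pattern:  # a palindromic motif needs only one pass
        strands.append(("-", rc_pattern))
    for start in range(len(seq) - k + 1):
        window = seq[start:start + k]
        for strand, pat in strands:
            mismatches = count_mismatches(pat, window)
            if mismatches <= max_mismatches:
                site = window if strand == "+" else reverse_complement(window)
                yield (chrom, start, start + k, site, mismatches, strand)


def to_bed_line(hit):
    """Format one hit as a tab-separated, six-column BED-style line."""
    return "\t".join(str(field) for field in hit)
```

Writing `to_bed_line(hit)` for each hit, one per line, gives an interval file in which the fourth column holds the actual sequence at each site and the fifth the number of mismatches. This scan is linear in genome length times motif length, which is acceptable for a sketch but much slower than a dedicated tool on a whole genome.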

Ming Tang (Houston/MD Anderson Cancer Center) wrote, 6.8 years ago:

try if the motif is there
