Question: find motif locations in the genome
igor wrote:

There are many motif-finding tools. Usually they compare a set of sequences against a set of motifs and report the most frequently occurring motifs. However, I have a specific motif in mind, and I would like to find the positions where it occurs in the genome (allowing for mismatches, of course). Is there a tool that will do that?

modified 3.6 years ago by Alex Reynolds • written 3.6 years ago by igor
genomax wrote:

A recent thread that may help: "Finding specific k-mer in human genome"

modified 3.6 years ago • written 3.6 years ago by genomax

Doesn't k-mer imply that the sequence has to be an exact match?

written 3.6 years ago by igor

fuzznuc, an EMBOSS program referenced in that thread, lets you specify search patterns with mismatches: http://emboss.sourceforge.net/apps/cvs/emboss/apps/fuzznuc.html I assume you know the sequence of the motif you want to search for. You can build the search pattern from that.
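
To make the idea of a mismatch-tolerant pattern search concrete, here is a minimal Python sketch. It is not fuzznuc and does not reproduce its algorithm or options; it is a brute-force scan that reports every window within a given Hamming distance of the motif on either strand. The function names (find_motif, reverse_complement, hamming) and the max_mismatches parameter are made up for this illustration, and the approach is only practical for small sequences, not a whole genome.

```python
# Illustrative sketch only: brute-force, substitution-only matching (no indels).
# All names here are hypothetical and unrelated to fuzznuc's implementation.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_motif(sequence, motif, max_mismatches=1):
    """Yield (start, end, strand, mismatches) for each approximate hit.

    Coordinates are 0-based and half-open, relative to the forward strand.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    k = len(motif)
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        for strand, pattern in patterns:
            d = hamming(window, pattern)
            if d <= max_mismatches:
                yield (start, start + k, strand, d)
```

For example, list(find_motif("CCGATTTCACC", "GATTACA", max_mismatches=1)) finds the one-mismatch occurrence starting at position 2. For genome-scale searches, a dedicated tool is the better choice.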

written 3.6 years ago by genomax
Alex Reynolds wrote:

The command-line version of UCSC BLAT can be run locally with the -minMatch and -minIdentity options to account for mismatches. It outputs a PSL file, which contains position information and can be converted into other formats for downstream operations.
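
As a rough illustration of the conversion step, the sketch below turns PSL rows into BED-like tuples in Python. It relies on the standard 21-column PSL layout (matches in column 1, strand in column 9, query name in column 10, target name, start and end in columns 14, 16 and 17). The function name psl_to_bed and the choice to put the match count in the BED score field are assumptions made for this example, not part of BLAT.

```python
# Illustrative sketch only: psl_to_bed is a hypothetical helper, not a BLAT utility.
def psl_to_bed(psl_lines):
    """Yield (chrom, start, end, name, score, strand) tuples from PSL lines.

    PSL target coordinates are 0-based and half-open, like BED,
    so they are passed through unchanged.
    """
    for line in psl_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header and blank lines:
        # data rows have 21 columns and begin with an integer match count.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        matches = int(fields[0])
        strand = fields[8]
        query_name = fields[9]
        target_name = fields[13]
        target_start = int(fields[15])
        target_end = int(fields[16])
        # Using the match count as the score is an arbitrary choice for this sketch.
        yield (target_name, target_start, target_end, query_name, matches, strand)
```

Given an open PSL file handle, each tuple can be written out tab-separated to produce a file that interval tools can read.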

written 3.6 years ago by Alex Reynolds

I think BLAST should also work, shouldn't it?

If not, what might the problems be?

written 3.6 years ago by gangireddy

I think BLAT will give you a bit more control over the number of allowed mismatches and other settings. It also directly outputs positional information, which I do not believe BLAST does without extra work.

written 3.6 years ago by Alex Reynolds

I think BLAT needs a lot more adjustment than that. With -minIdentity=60 and -minMatch=1, it still fails to match a 20 bp sequence with just 1 mismatch.
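
One possible explanation, offered as an assumption rather than something confirmed in this thread: BLAT seeds alignments from exact tiles (the -tileSize option, 11 by default for DNA) and filters hits by -minScore (30 by default, roughly matches minus mismatches), so a 20 bp query may be rejected regardless of -minIdentity and -minMatch. The Python sketch below only does the arithmetic for an ungapped hit; short_query_checks is a made-up helper, the scoring is a simplification, and the defaults should be verified against the usage message of your BLAT version.

```python
# Illustrative arithmetic only; this does not model BLAT's actual search.
def short_query_checks(query_len, mismatch_positions, tile_size=11, min_score=30):
    """Report whether an ungapped hit could seed and pass a score cutoff.

    mismatch_positions are 0-based offsets within the query.
    The score is approximated as matches minus mismatches.
    """
    mismatches = set(mismatch_positions)
    longest = run = 0
    for i in range(query_len):
        run = 0 if i in mismatches else run + 1
        longest = max(longest, run)
    score = (query_len - len(mismatches)) - len(mismatches)
    return {
        "longest_exact_run": longest,
        # Necessary but not sufficient: the run must also line up with an indexed tile.
        "has_exact_tile": longest >= tile_size,
        "approx_score": score,
        "passes_min_score": score >= min_score,
    }
```

With a 20 bp query and one mismatch at offset 9, the longest exact run is 10 and the approximate score is 18, so under these assumed defaults the hit would neither seed nor pass the score cutoff. If that is the cause, lowering -tileSize and -minScore would be the settings to experiment with.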

written 3.6 years ago by igor