Question: find motif locations in the genome
4.6 years ago by
igor11k
United States
igor11k wrote:

There are many motif-finding tools. Usually they compare a set of sequences against a set of motifs and report the most frequently occurring motifs. However, I have a specific motif in mind and I would like to find the positions where it occurs in the genome (allowing for mismatches, of course). Is there a tool that will do that?

motif • 2.8k views
modified 4.6 years ago by Alex Reynolds31k • written 4.6 years ago by igor11k
4.6 years ago by
genomax91k
United States
genomax91k wrote:

A recent thread that may help: Finding specific k-mer in human genome

modified 4.6 years ago • written 4.6 years ago by genomax91k

Doesn't k-mer imply that the sequence has to be an exact match?

written 4.6 years ago by igor11k

fuzznuc, an EMBOSS program referenced in that thread, allows you to specify search patterns (with mismatches): http://emboss.sourceforge.net/apps/cvs/emboss/apps/fuzznuc.html I assume you know the sequence of the motif you want to search for. You can specify search patterns based on that.
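To make the idea of a mismatch-tolerant pattern search concrete, here is a minimal Python sketch. It is not how fuzznuc is implemented; it is just a naive sliding-window scan that reports every window within a given Hamming distance of the motif, on both strands. The function names (`find_motif`, `reverse_complement`, `hamming`) and the example motif are made up for illustration, and coordinates are assumed to be 0-based, half-open.

```python
from typing import Iterator, Tuple

# Translation table for complementing plain A/C/G/T bases (other symbols pass through).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_motif(
    sequence: str, motif: str, max_mismatches: int = 1
) -> Iterator[Tuple[int, int, str, int]]:
    """Yield (start, end, strand, mismatches) for every hit.

    Hypothetical helper: coordinates are 0-based, half-open, and always
    refer to the forward strand of `sequence`.
    """
    sequence = sequence.upper()
    motif = motif.upper()
    k = len(motif)

    # Search the motif itself ("+") and its reverse complement ("-").
    patterns = [("+", motif)]
    rc = reverse_complement(motif)
    if rc != motif:  # avoid reporting palindromic motifs twice
        patterns.append(("-", rc))

    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        for strand, pattern in patterns:
            distance = hamming(window, pattern)
            if distance <= max_mismatches:
                yield (start, start + k, strand, distance)
```

For example, `list(find_motif("TTGATTACATT", "GATTACA", max_mismatches=1))` returns `[(2, 9, '+', 0)]`. The scan compares every window against the motif, so it is fine for a single chromosome or a test sequence but slow for a whole genome; a dedicated tool will be much faster there.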

written 4.6 years ago by genomax91k
4.6 years ago by
Alex Reynolds31k
Seattle, WA USA
Alex Reynolds31k wrote:

The command-line version of UCSC BLAT can be run locally with the -minMatch and -minIdentity options to account for mismatches. It writes a PSL file, which contains position information and can be converted into other formats for downstream operations.
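As an illustration of pulling positions out of PSL, here is a small Python sketch that reads PSL rows and yields BED-like tuples. It relies on the standard 21-column PSL layout (mismatch count in column 2, strand in column 9, query name in column 10, target name, start and end in columns 14, 16 and 17). The function name `psl_to_bed` and the choice of output fields are my own, not part of BLAT.

```python
from typing import Iterable, Iterator, Tuple


def psl_to_bed(
    lines: Iterable[str],
) -> Iterator[Tuple[str, int, int, str, int, str]]:
    """Yield (chrom, start, end, name, mismatches, strand) per PSL row.

    Hypothetical helper: the target start/end are passed through as-is,
    which in PSL are 0-based, half-open coordinates, the same as BED.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        # Skip the optional PSL header and blank lines: real alignment
        # rows have 21 tab-separated columns and start with an integer.
        if len(fields) < 21 or not fields[0].isdigit():
            continue
        mismatches = int(fields[1])   # misMatches column
        strand = fields[8]            # strand column
        q_name = fields[9]            # query (motif) name
        t_name = fields[13]           # target (chromosome) name
        t_start = int(fields[15])     # target start
        t_end = int(fields[16])       # target end
        yield (t_name, t_start, t_end, q_name, mismatches, strand)
```

Typical use would be something like `for hit in psl_to_bed(open("out.psl")): print(*hit, sep="\t")`, where `out.psl` stands in for whatever file BLAT wrote.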

written 4.6 years ago by Alex Reynolds31k

I think BLAST should also work, shouldn't it?

If not, what might the possible problems be?

written 4.6 years ago by gangireddy160

I think BLAT will give you a bit more control over the number of allowed mismatches and other settings. It also outputs positional information directly, which I do not believe BLAST does without extra work.

written 4.6 years ago by Alex Reynolds31k

I think BLAT needs a lot more tuning than that. Setting -minIdentity=60 and -minMatch=1 fails to match a 20 bp sequence with just 1 mismatch.

written 4.6 years ago by igor11k