Question: Examples of DNA sequence motif sets for testing search algorithm
 
1
 
 

This is a follow-up to "Resurrecting DNA motif finding project".

I'm looking for sets of aligned DNA sequence motifs to use for testing my search algorithm. This algorithm looks for correlations across the whole motif, so it performs best if:

a) The motif is short, preferably between 10 and 30 characters long. Anything shorter or longer would probably not work well.

b) The set is large, ideally several hundred sequences. The longer the motif, the larger the set needs to be.

If you know of motifs like these, please list them. It would be helpful if a link could be provided to the data, preferably as a FASTA file, and also a description of the biological significance of the motifs. A description of the conserved regions would also be helpful.

I'm not a biologist, so please don't assume a lot of biological background. Thanks.

3 answers

 
3
 
 

You might take a look at the JASPAR database, if I understand your question correctly.

 

Thanks Sean. This looks interesting. Now all I have to do is figure out what I need to download... I can't tell if FASTA files are available - I don't see them.

written 19 months ago by Faheemmitha
 

The files in http://jaspar.genereg.net/html/DOWNLOAD/sites/ look like FASTA files, though they are labelled *.sites. Have I got this correct? Is this what I need?

written 19 months ago by Faheemmitha
 

Yes, those are FASTA format files.
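As a rough illustration of how a downloaded set could be checked against the criteria in the question (aligned sequences, 10 to 30 characters long, a large set), here is a minimal sketch. It assumes plain FASTA text; the function names, the toy sequences, and the `min_count=200` default are all hypothetical choices, not part of JASPAR or of the asker's algorithm. The actual `*.sites` files may also need extra handling (for example, trimming flanking sequence) that is not covered here.

```python
from collections import Counter


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def summarize_motif_set(records, min_len=10, max_len=30, min_count=200):
    """Report whether a motif set meets the (assumed) size and length limits."""
    lengths = Counter(len(seq) for _, seq in records)
    # Treat the set as aligned only if every sequence has the same length.
    aligned = len(lengths) == 1
    motif_len = next(iter(lengths)) if aligned else None
    suitable = bool(
        aligned
        and min_len <= motif_len <= max_len
        and len(records) >= min_count
    )
    return {
        "count": len(records),
        "aligned": aligned,
        "length": motif_len,
        "suitable": suitable,
    }


# Toy data for illustration only; these are not real binding sites.
example = ">site1\nACGTACGTAC\n>site2\nACGTTCGTAC\n>site3\nACGAACGTAC\n"
records = read_fasta(example)
print(summarize_motif_set(records, min_count=3))
```

To use it on a real file, read the file contents into a string and pass that to `read_fasta`; adjust `min_count` to whatever set size the algorithm needs for a given motif length.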

written 19 months ago by Sean Davis
 
 
1
 
 

You might find data from the UniPROBE project useful (though it is mostly mouse-based).

 
 
0
 
 

You should look at PROSITE, which is the database of protein domain profiles from the same institute as UniProt.

Unfortunately, I think that most DNA regulatory motifs are shorter than 10 nucleotides. For example, splicing signals are usually composed of many short degenerate motifs that interact with one another.

 