Question: Finding Minisatellite Repeat Motifs In DNA
10.5 years ago by
Quebec, Canada
Eric Normandeau10k wrote:


I have contigs representing genes of interest that have been 454 sequenced from BAC libraries. The BAC inserts are about 100 kb long and end up forming multiple contigs (about 50-100) after assembly. Among these contigs, some contain the sequence of the gene of interest, including the introns.

I am interested in finding minisatellite repeat motifs, from 10 to 60 bp. I have tried SSR finder before (online: SSR Finder) but apparently it is only for microsatellites (2 to 5 bp). My aim is not to mask them, but to find their position and sequence.

What software would be a good choice in your opinion?

Many thanks

repeats • 3.3k views
10.5 years ago by
Mountain View, CA
Haibao Tang3.0k wrote:

There are quite a few software tools out there; see an incomplete list at the bottom of this wiki page. I was quite happy with TRF (Tandem Repeats Finder) in the past. You might want to write a script to post-process the result table to get the range you want.

Did I mention that you can run TRF on a big sequence file using Kent's tool, trfBig? I'd create a BED file.

trfBig - Mask tandem repeats on a big sequence file.
   trfBig inFile outFile
This will repeatedly run trf to mask tandem repeats in infile
and put masked results in outFile.  inFile and outFile can be .fa
or .nib format. Outfile can be .bed as well

   -bed creates a bed file in current dir
   -bedAt=path.bed - create a bed file at explicit location
   -tempDir=dir Where to put temp files.
   -trf=trfExe explicitly specifies trf executable name
   -maxPeriod=N  Maximum period size of repeat (default 2000)
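
The post-processing step mentioned above could look roughly like the sketch below. It assumes the TRF `.dat` output (the file produced when TRF is run with `-d`), where each sequence starts with a `Sequence: <name>` line and each repeat is one whitespace-separated row of 15 fields: start, end, period size, copy number, consensus size, percent matches, percent indels, score, A, C, G, T, entropy, consensus pattern, repeat sequence. The function name `filter_trf_dat` and the `Repeat` record are made up for illustration, and the 10 to 60 bp defaults simply mirror the range asked for in the question.

```python
from typing import Iterable, Iterator, NamedTuple


class Repeat(NamedTuple):
    """One tandem repeat row (hypothetical record, not part of TRF)."""
    seq_id: str
    start: int        # coordinates as reported by TRF
    end: int
    period: int       # period size in bp
    copies: float     # number of copies of the motif
    consensus: str    # consensus motif


def filter_trf_dat(lines: Iterable[str],
                   min_period: int = 10,
                   max_period: int = 60) -> Iterator[Repeat]:
    """Yield repeats whose period size lies in [min_period, max_period].

    Assumes the TRF .dat layout described in the lead-in; any line that
    does not look like a 15-field data row is skipped.
    """
    seq_id = ""
    for raw in lines:
        line = raw.strip()
        if line.startswith("Sequence:"):
            parts = line.split()
            seq_id = parts[1] if len(parts) > 1 else ""
            continue
        fields = line.split()
        # Data rows start with an integer coordinate and have 15 fields.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        period = int(fields[2])
        if min_period <= period <= max_period:
            yield Repeat(seq_id, int(fields[0]), int(fields[1]),
                         period, float(fields[3]), fields[13])
```

To use it on a real file, open the `.dat` file and pass the file handle, for example `for r in filter_trf_dat(open("contigs.dat")): print(*r, sep="\t")`, where `contigs.dat` is a placeholder name. Check the column order against your TRF version before relying on it.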

Hi @Haibao, Thanks for the TRF suggestion. I also tried mreps from the wiki page you point to and will investigate a few more as necessary. Many thanks!

written 10.5 years ago by Eric Normandeau10k

Hi @Haibao,

Can you tell me where to find the description of the BED output from trfBig? I am completely confused by the BED output because the file has no header.

modified 2.7 years ago • written 2.7 years ago by kashiff007130
10.5 years ago by
biobot 0.0.77.a.10996.1k wrote:

I recommend vmatch / reputer. I've had success finding repeats in bacterial genomes, and it scales very well to larger (much larger) sequences.


Hi @Keith, were you looking for minisats, as I am, or for microsats? Cheers

written 10.5 years ago by Eric Normandeau10k

Minisatellites and larger. This was with the software in its reputer incarnation, which predates vmatch. I don't have any experience with vmatch's huge range of new parameters.

written 10.5 years ago by biobot 0.0.77.a.10996.1k