Question: How to make and use a RepeatMasker custom library
TWV wrote (2.9 years ago):

I am trying to run RepeatMasker on an insect genome using the following parameters:

RepeatMasker -species Lepidoptera -nolow -dir RM_Lepidoptera_Intersped -cutoff 250 Genome.fasta

Unfortunately, I get no transposon hits (only some rRNA-class repeats). I think this is because RepeatMasker is not using the RepBase library I downloaded; it prints the following message at the start of the analysis:

WARNING: Dfam 2.0 includes repeats found in human, mouse,
         drosophila melanogaster, danio rerio and
         caenorhabditis elegans.  Searching with other species
         will only search for ancestral repeats shared with
         human and your species ( if any exist ) and will use the
         "TC" cutoffs ( trusted cutoff ) instead of the 
         species-specific cutoffs.

Master RepeatMasker Database: /home/wolf/Desktop/Programs/RepeatMasker/Libraries/Dfam.hmm ( Complete Database: Dfam_2.0 )

I also ran the queryRepeatDatabase.pl utility with these parameters:

./queryRepeatDatabase.pl -species Lepidoptera -stat -class DNA -class LTR

It returns a long list of transposons, so I would expect at least one of them to be present in the sequenced genome.

So I tried to build a custom library from sequences downloaded from RepBase (Hexapoda transposable elements), but when I pass it with the -lib option like so:

RepeatMasker -cutoff 250 -dir RM_Lepidoptera_Intersped -lib RepBase_Hexapoda_TEs.fasta Genome.fasta

I get the error message:

Search Engine: HMMER [ 3.1b2 (February 2015) ]
RepeatMasker::createLib(): Error invoking /home/Programs/HMMER/binaries//hmmpress on file /home/3_Homology_based_approach/RM_31187.ThuNov301438412017/RepBase_Hexapoda_TEs.fasta.

I thought one could use a FASTA library directly. Do I need to convert it first? If so, how?

Federico Lopez wrote (2.7 years ago):

When you finish configuring RepeatMasker, it should show you which repeat libraries are installed. For example:

Congratulations!  RepeatMasker is now ready to use.
The program is installed  with a the following repeat libraries:
  Dfam database version Dfam_2.0
  RepeatMasker Combined Database: Dfam_Consensus-20170127, RepBase-20170127

You need to download the RepBase RepeatMasker Edition archive and uncompress it inside the RepeatMasker directory. This adds two files to the Libraries subdirectory:

$ tar -xvzf RepBaseRepeatMaskerEdition-20170127.tar.gz 
Libraries/README
Libraries/RMRBSeqs.embl
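
Before reconfiguring, it can be worth confirming that the archive landed where RepeatMasker looks for it. The following is only a minimal sketch: the function name `repbase_installed` and the idea of passing the install directory as an argument are my own assumptions, and the only fact taken from the output above is the file name `Libraries/RMRBSeqs.embl`.

```shell
#!/usr/bin/env bash
# Sketch: check that the RepBase RepeatMasker Edition file was unpacked
# into the Libraries subdirectory of a RepeatMasker installation.
# "repbase_installed" is a hypothetical helper, not part of RepeatMasker.

repbase_installed() {
    local rm_dir=$1
    # -s is true only if the file exists and is not empty
    [ -s "$rm_dir/Libraries/RMRBSeqs.embl" ]
}

# Print a one-line report for a given RepeatMasker directory.
report_repbase() {
    local rm_dir=$1
    if repbase_installed "$rm_dir"; then
        echo "RepBase library found in $rm_dir/Libraries"
    else
        echo "RepBase library missing in $rm_dir/Libraries"
    fi
}
```

Calling `report_repbase /path/to/RepeatMasker` (substituting your own install path) prints which of the two cases applies.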

Then rerun the RepeatMasker configuration script (perl ./configure in the 4.0.x releases) so that it picks up the new library. Also, for an insect species, you may want to use RepeatModeler to identify repeats de novo and combine these with the repeats from RepBase. Here are two helpful links:

https://blaxter-lab-documentation.readthedocs.io/en/latest/repeat-masking.html
http://weatherby.genetics.utah.edu/MAKER/wiki/index.php/Repeat_Library_Construction-Basic
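
On the -lib error in the question: the message shows RepeatMasker handing the FASTA file to hmmpress, which expects HMMER profile files, so a FASTA library probably needs a sequence-based search engine instead of HMMER. The sketch below is hedged accordingly. It guesses the library format from the first line, merges FASTA libraries (for example RepeatModeler output plus RepBase sequences), and prints a command without running it. The function names are hypothetical, and I am assuming that `-engine ncbi` selects RMBlast in the 4.0.x releases and that RMBlast has been set up during configuration; check `RepeatMasker -help` for the values your version accepts.

```shell
#!/usr/bin/env bash
# Sketch: pick a RepeatMasker search engine that matches the library format.
# All function names here are hypothetical helpers, not RepeatMasker tools.

# Print "fasta", "hmm" or "unknown" based on the first non-empty line.
lib_format() {
    local first
    first=$(grep -m 1 -v '^[[:space:]]*$' "$1")
    case "$first" in
        '>'*)    echo fasta ;;   # FASTA records start with '>'
        HMMER3*) echo hmm ;;     # HMMER3 profile files start with "HMMER3/..."
        *)       echo unknown ;;
    esac
}

# Map a library format to a value for the -engine option.
# Assumption: "ncbi" means RMBlast in RepeatMasker 4.0.x.
engine_for() {
    case "$1" in
        fasta) echo ncbi ;;
        hmm)   echo hmmer ;;
        *)     return 1 ;;
    esac
}

# Concatenate FASTA libraries into one file (no de-duplication).
# "awk 1" re-emits every line with a trailing newline, so a file that
# lacks a final newline cannot glue its last line onto the next header.
combine_libs() {
    local out=$1
    shift
    awk 1 "$@" > "$out"
}

# Print (do not run) a RepeatMasker command for a library and a genome.
suggest_command() {
    local lib=$1 genome=$2 engine
    engine=$(engine_for "$(lib_format "$lib")") || {
        echo "Unrecognised library format: $lib" >&2
        return 1
    }
    echo "RepeatMasker -engine $engine -lib $lib $genome"
}
```

For instance, `combine_libs combined.fa repeatmodeler_consensi.fa RepBase_Hexapoda_TEs.fasta` followed by `suggest_command combined.fa Genome.fasta` would print a command to review before running; `repeatmodeler_consensi.fa` is a placeholder name for whatever consensus file your RepeatModeler run produced.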
