Question

Running RepeatMasker with RepeatModeler database

3.7 years ago

pablo ▴ 350

Hello,

I have a draft assembly from PacBio sequencing, and I would like to evaluate its TE abundance.

That's why I would like to use RepeatModeler and RepeatMasker. There is no database for my species in RepeatMasker.

Should I run RepeatModeler on my draft assembly? And then, use the created database for RepeatMasker? Like this:

 BuildDatabase -name my_db -engine ncbi my_draft_assembly.fasta

 RepeatModeler -engine ncbi -pa 16 -database my_db

 RepeatMasker -s -xsmall -a -gff -pa 50 -u -lib my_db.fa -dir final_RepeatMasker_out my_draft_assembly.fasta

I will also probably use REPET and/or Repbase to enrich my database. I will need to add the new sequences to the my_db.fa file, right?

Best

repeatmasker repeatmodeler • 2.4k views

updated 3.7 years ago by Juke34 9.3k • written 3.7 years ago by pablo ▴ 350

score 0 · Answer 1 · 2021-10-28

Should I run RepeatModeler on my draft assembly? And then, use the created database for RepeatMasker?

Yes, see here for more details on how to create a de novo repeat library using RepeatModeler: Create de novo repeat library

I will also probably use REPET and/or Repbase to enrich my database. I will need to add the new sequences to the my_db.fa file, right?

Yes, you can do it that way: append the new sequences to the FASTA library that you pass to RepeatMasker with -lib.
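As a rough sketch of that merging step, something like the following could work. The function name `merge_libs` and the file names in the usage comments are hypothetical placeholders, not part of RepeatMasker or RepeatModeler. Note that RepeatModeler normally writes its consensus sequences to a file named `<database>-families.fa` (so `my_db-families.fa` here), so check which file you actually have before passing it to `-lib`.

```shell
#!/usr/bin/env bash
# Merge several FASTA repeat libraries into a single file for RepeatMasker -lib.
# merge_libs is a hypothetical helper; it is not shipped with any of the tools.
# Usage: merge_libs OUTPUT.fa LIB1.fa [LIB2.fa ...]
merge_libs() {
    local out="$1"; shift
    : > "$out"                      # start from an empty output file
    local f
    for f in "$@"; do
        # Skip libraries that are missing or empty instead of failing.
        [ -s "$f" ] || { echo "skipping missing or empty library: $f" >&2; continue; }
        cat "$f" >> "$out"
        # If the file did not end with a newline, add one so the next
        # library's first header starts on its own line.
        [ -z "$(tail -c 1 "$f")" ] || echo >> "$out"
    done
    # Print the number of sequences in the merged library.
    grep -c '^>' "$out"
}

# Example (file names are assumptions, adjust to your own outputs):
#   merge_libs combined_lib.fa my_db-families.fa extra_repeats.fa
#   RepeatMasker -s -xsmall -a -gff -pa 50 -lib combined_lib.fa \
#       -dir final_RepeatMasker_out my_draft_assembly.fasta
```

Plain concatenation does not remove redundant sequences shared between the libraries, so you may want to check for duplicate headers before masking.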

You can also consider using EDTA to create your repeat library.