I have a draft assembly from PacBio sequencing, and I would like to estimate its TE abundance.
For that I plan to use RepeatModeler and RepeatMasker, since RepeatMasker has no database for my species.
Should I run RepeatModeler on my draft assembly, and then use the resulting database as the library for RepeatMasker? Like this:
BuildDatabase -name my_db -engine ncbi my_draft_assembly.fasta
RepeatModeler -engine ncbi -pa 16 -database my_db
RepeatMasker -s -xsmall -a -gff -pa 50 -u -lib my_db.fa -dir final_RepeatMasker_out my_draft_assembly.fasta
I will probably also use REPET and/or Repbase to enrich my database. In that case I would need to add the new sequences to the
my_db.fa file, right?
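
To make that last point concrete, here is a rough sketch of what I have in mind for combining the libraries. It is only an assumption about how this could be done, not something I have validated. merge_libs is a helper I made up, and extra_repeats.fa and combined_lib.fa are placeholder file names. I am also not sure that my_db.fa is the file RepeatModeler actually writes its consensus sequences to.

```shell
# Hypothetical helper (not part of RepeatModeler or RepeatMasker):
# concatenate several FASTA libraries into a single file.
# Usage: merge_libs OUTPUT.fa INPUT1.fa [INPUT2.fa ...]
merge_libs() {
    out=$1
    shift
    # "awk 1" prints every line and supplies a final newline if a file
    # lacks one, so the last sequence of one file cannot run into the
    # first header of the next file.
    awk 1 "$@" > "$out"
}

# Intended use (placeholder file names, left commented out):
# merge_libs combined_lib.fa my_db.fa extra_repeats.fa
# RepeatMasker -s -xsmall -a -gff -pa 50 -u -lib combined_lib.fa \
#     -dir final_RepeatMasker_out my_draft_assembly.fasta
```

This does nothing about redundant or identically named sequences across the libraries, so I do not know whether a plain concatenation like this is enough.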