Question: Discrepancies between RepeatMasker searches with de novo and RepeatMasker libraries
6.7 years ago by
European Union
amirsh60 wrote:

I am annotating transposable elements (TEs) in a non-vertebrate genome assembly. I started by building a de novo library with RepeatModeler. I then searched my assembly with RepeatMasker twice: once with the new library (-lib) and once with -species eukaryota. The first (-lib) search returns many more matches than the second, which is expected, but very few of those matches are identified (classified), and the .tbl file indicates that the query was treated as human (despite the lib file I passed). In the second search (-species eukaryota) the total number of matches is about a hundredfold lower (which makes sense), but the number of identified matches is much higher. I tried rerunning RepeatModeler while passing '-species eukaryota', and I also edited the RepeatMasker command line in the RepeatClassifier script (used by RepeatModeler) to include '-species eukaryota'. Neither attempt made a difference.

I also tried identifying the TEs in the de novo library using Censor. This classified a good proportion of the consensus sequences, with reasonable identifications, but I am not sure how to easily generate a RepeatMasker-formatted library that carries the Censor classifications.
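To make the goal concrete, here is a rough sketch of the relabelling I have in mind. It relies on RepeatMasker custom libraries being FASTA files whose headers have the form `>name#class/subclass`. Everything else is an assumption for illustration: the function names are made up, and the input is a hypothetical two-column, tab-separated table (consensus ID, class) that would first have to be extracted from the Censor output by hand or with a separate script. Censor/Repbase class names may also need translating into RepeatMasker's naming (e.g. `LTR/Gypsy`), which this sketch does not attempt.

```python
def load_classifications(tsv_text):
    """Parse 'consensus_id<TAB>class' lines into a dict.

    The two-column layout is a hypothetical intermediate format,
    not something Censor writes directly.
    """
    mapping = {}
    for line in tsv_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        seq_id, cls = line.split("\t")[:2]
        mapping[seq_id] = cls
    return mapping


def relabel_fasta(fasta_text, mapping, default="Unknown"):
    """Rewrite FASTA headers as '>name#class' for use with RepeatMasker -lib.

    Sequences without an entry in `mapping` are labelled with `default`.
    """
    out = []
    for line in fasta_text.splitlines():
        tokens = line[1:].split() if line.startswith(">") else []
        if tokens:
            # Keep the first header token and drop any existing '#class' suffix.
            name = tokens[0].split("#")[0]
            out.append(f">{name}#{mapping.get(name, default)}")
        else:
            out.append(line)  # sequence lines pass through unchanged
    return "\n".join(out) + "\n"
```

In use, I would read the RepeatModeler consensus FASTA and the classification table from disk, pass their contents to these two functions, write the result to a new file, and give that file to RepeatMasker with -lib.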

If anyone can tell me what I am doing wrong in RepeatModeler, or how I can use the Censor classifications to generate a library that RepeatMasker can use, I would be very grateful for any hints.

transposable elements
