Question: RepeatMasker overlap and interpretation
8 months ago, Picasa wrote:

Dear all,

I have run RepeatMasker and I get this kind of result:

*.out file

   SW   perc perc perc  query     position in query     matching          repeat         position in repeat
score   div. del. ins.  sequence  begin  end  (left)    repeat            class/family   begin   end  (left)  ID

  428    7.3 22.8  0.0  ctg1371     230  365  (1868)  C rnd-1_family-52   DNA/Maverick   (6794)  181      15   1
  381   14.8 19.9  1.7  ctg1371     232  382  (1851)  C rnd-1_family-50   Unknown         (938)  178       1   2 *

I don't understand why I get two different repeat classifications with such a big overlap between them.

Is there any filtering to do? I mean, is it possible that one hit is more likely to be wrong than the other, and if so, based on what?

Thanks for your answers.

8 months ago, lieven.sterck (VIB, Ghent, Belgium) wrote:

From what I can see in that output, the overlap does not seem large (~180 bases out of 1900, no?).

Also, the classification of repeats by RM is not super strict: of the 1900 bases, the majority can be quite different, so the two are not catalogued as one family. On the other hand, many repeat classes share a substantial part of their content (e.g. integrases, RNA polymerases, ...), so it is not surprising that they show some similarity to each other.

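For anyone who wants to check the overlap on the contig directly, here is a minimal Python sketch. It uses only the query begin/end values from the two rows in the question and assumes they are 1-based, inclusive coordinates; the function and variable names are made up for illustration.

```python
def span_length(begin, end):
    """Length of a 1-based, inclusive interval."""
    return end - begin + 1


def overlap_length(a, b):
    """Number of positions shared by two inclusive (begin, end) intervals."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Query coordinates on ctg1371, copied from the two rows in the question.
hit_maverick = (230, 365)  # rnd-1_family-52, DNA/Maverick
hit_unknown = (232, 382)   # rnd-1_family-50, Unknown

shared = overlap_length(hit_maverick, hit_unknown)
print("shared bases:", shared)  # 134
print(f"fraction of Maverick hit: {shared / span_length(*hit_maverick):.1%}")  # 98.5%
print(f"fraction of Unknown hit: {shared / span_length(*hit_unknown):.1%}")    # 88.7%
```

Under these assumptions the two hits share 134 bases on the contig, which is most of either hit (136 bp and 151 bp).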

Thanks for your answer, it's clearer now.

However, I am not familiar with this output. How did you calculate 1900 bp?

I have looked at the sequences rnd-1_family-52#DNA/Maverick and rnd-1_family-50#Unknown generated by RepeatModeler, and their sizes are 6975 bp and 1116 bp, respectively.

Reply written 8 months ago by Picasa

Yeah, my bad... I was looking at the wrong column. You're indeed correct with respect to their lengths.

Reply written 8 months ago by lieven.sterck
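
To make the column question concrete, here is a rough Python sketch that recovers the contig and consensus lengths from the two rows above. It assumes the usual RepeatMasker .out layout, in which hits on the complement strand (C) list the repeat positions as (left), end, begin rather than begin, end, (left); `summarize` and `strip_parens` are hypothetical helper names, not part of RepeatMasker.

```python
ROWS = [
    "428  7.3 22.8 0.0 ctg1371 230 365 (1868) C rnd-1_family-52 DNA/Maverick (6794) 181 15 1",
    "381 14.8 19.9 1.7 ctg1371 232 382 (1851) C rnd-1_family-50 Unknown (938) 178 1 2 *",
]


def strip_parens(value):
    """Convert a '(1868)'-style field to an int."""
    return int(value.strip("()"))


def summarize(line):
    """Derive sequence lengths from one whitespace-separated .out row."""
    f = line.split()
    q_end, q_left = int(f[6]), strip_parens(f[7])
    if f[8] == "C":
        # Assumed column order for complement hits: (left), end, begin.
        r_left, r_end, r_begin = strip_parens(f[11]), int(f[12]), int(f[13])
    else:
        # Forward-strand hits: begin, end, (left).
        r_begin, r_end, r_left = int(f[11]), int(f[12]), strip_parens(f[13])
    return {
        "repeat": f[9],
        "query_length": q_end + q_left,       # end + bases left on the contig
        "consensus_length": r_end + r_left,   # end + bases left in the repeat
        "consensus_span": (min(r_begin, r_end), max(r_begin, r_end)),
    }


for row in ROWS:
    print(summarize(row))
```

With that reading, both rows give a contig length of 2233 bp (365 + 1868 and 382 + 1851), and the consensus lengths come out as 6975 bp and 1116 bp, matching the RepeatModeler sequences. The numbers around 1900 are the (left) values of the query, i.e. the bases remaining on the contig after each hit.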