RepeatMasker overlap and interpretation
4.1 years ago
Picasa ▴ 640

Dear all,

I ran RepeatMasker and got this kind of result:

*.out file:

   SW   perc perc perc  query     position in query       matching         repeat        position in repeat
score   div. del. ins.  sequence  begin    end   (left)   repeat           class/family  begin    end   (left)  ID

  428    7.3 22.8  0.0  ctg1371     230    365   (1868) C rnd-1_family-52  DNA/Maverick  (6794)   181       15   1
  381   14.8 19.9  1.7  ctg1371     232    382   (1851) C rnd-1_family-50  Unknown        (938)   178        1   2 *

I don't understand why I get two different repeat classifications for the same region, with a large overlap between the two hits.

Is there any filtering I should do? I mean, is it possible that one hit is less reliable than the other, and if so, based on what criteria?

Thanks for your answers.

4.1 years ago

From what I can see in that output, there does not seem to be a large overlap (~180 bases out of ~1900, no?).

Also, RM's classification of repeats is not super strict: across those 1900 bases the majority can be quite different, which keeps the two classes from being catalogued as one family. On the other hand, many repeat classes share a substantial part of their content (e.g. integrases, RNA polymerases, ...), so it is not super surprising that they show some similarity to each other.


Thanks for your answer, it's clearer now.

However, I am not familiar with this output. How did you calculate 1900 bp?

I have looked at the sequences rnd-1_family-52#DNA/Maverick and rnd-1_family-50#Unknown generated by RepeatModeler, and their sizes are 6975 bp and 1116 bp respectively.
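
Those consensus sizes can also be read off the .out lines themselves. Here is a minimal sketch, assuming the usual whitespace-separated .out layout and assuming that for complement (`C`) hits the three repeat-position columns hold (left), end, begin in that order. The helper name `parse_out_line` is made up for illustration and is not part of RepeatMasker:

```python
# Sketch only: parse one RepeatMasker .out data line (assumed column order).
def parse_out_line(line):
    f = line.split()

    def num(s):
        # "(1868)" -> 1868, "230" -> 230
        return int(s.strip("()"))

    q_begin, q_end, q_left = num(f[5]), num(f[6]), num(f[7])
    strand = f[8]
    if strand == "C":
        # Assumption: complement hits list (left), end, begin.
        r_left, r_end, r_begin = num(f[11]), num(f[12]), num(f[13])
    else:
        r_begin, r_end, r_left = num(f[11]), num(f[12]), num(f[13])
    return {
        "repeat": f[9],
        "query_len": q_end + q_left,      # length of the query contig
        "consensus_len": r_end + r_left,  # length of the repeat consensus
    }


LINES = [
    "428 7.3 22.8 0.0 ctg1371 230 365 (1868) C rnd-1_family-52 DNA/Maverick (6794) 181 15 1",
    "381 14.8 19.9 1.7 ctg1371 232 382 (1851) C rnd-1_family-50 Unknown (938) 178 1 2 *",
]

for line in LINES:
    hit = parse_out_line(line)
    print(hit["repeat"], hit["consensus_len"], hit["query_len"])
# rnd-1_family-52 6975 2233
# rnd-1_family-50 1116 2233
```

Under that reading, the (left) value plus the end position reproduces the 6975 bp and 1116 bp consensus lengths, and the query columns give a contig length of 2233 bp for ctg1371.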


Yeah, my bad... I was looking at the wrong column. You are indeed correct with respect to their lengths.
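
For completeness, the overlap on the contig can be worked out from the query begin/end columns alone. A rough sketch in plain Python, with the coordinates copied from the two .out lines in the question (`query_overlap` is a made-up helper name, and coordinates are treated as 1-based and inclusive):

```python
# Sketch: overlap of two hits on the same query sequence (inclusive coords).
def query_overlap(a, b):
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


hit1 = (230, 365)  # rnd-1_family-52, DNA/Maverick
hit2 = (232, 382)  # rnd-1_family-50, Unknown

ov = query_overlap(hit1, hit2)
len1 = hit1[1] - hit1[0] + 1
len2 = hit2[1] - hit2[0] + 1
print(ov, len1, len2)                            # 134 136 151
print(round(ov / len1, 2), round(ov / len2, 2))  # 0.99 0.89
```

So the two hits share 134 bp on ctg1371, which is nearly all of the first hit (136 bp) and most of the second (151 bp).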

