RepeatMasker: alignment sometimes doesn't fit
4.2 years ago

When I compare an alignment reported by RepeatMasker (cross_match engine, Dfam library) with one produced by ssearch (fasta36), I get different results.

  • ssearch v36.3.8g
  • repeatmasker v4.1.0
  • cross_match v1.090518
  • dfam v3.1

My question is: why are the results so different? Did I misunderstand something, is it because of performance, or is ssearch simply a poor fit for this task?

RepeatMasker

   SW   perc perc perc  position in query  matching   position in repeat
score   div. del. ins.  begin     end      repeat     begin   end  (left)

 2373   25.9  9.0  3.7  38256     39464 +  MLT1E1A    119    1371     (0)
  917   27.4  9.2  6.4  39465     39623 +  MLT1E1A      1     172   (388)
 2138   12.0  0.0  0.0  39624     39924 +  AluSx        1     301    (11)
  923   26.0 12.8  3.8  39925     40294 +  MLT1E1A    173     666    (14)
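For anyone who wants to check the coordinates programmatically, here is a minimal sketch that parses the rows above and prints the length of each query span next to the length of the reported repeat span. It assumes the trimmed column layout exactly as pasted in this post (11 whitespace-separated fields); a real RepeatMasker `.out` file has additional columns, so the field indices would need adjusting. The names `Hit` and `parse_row` are made up for this illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    score: int
    q_begin: int   # 1-based start in the query
    q_end: int     # 1-based inclusive end in the query
    strand: str
    repeat: str
    r_begin: int   # start in the repeat consensus
    r_end: int     # end in the repeat consensus
    r_left: int    # bases remaining in the consensus, printed as "(n)"


def parse_row(line: str) -> Hit:
    """Parse one row in the trimmed layout shown in the post (assumed)."""
    f = line.split()
    return Hit(
        score=int(f[0]),
        q_begin=int(f[4]),
        q_end=int(f[5]),
        strand=f[6],
        repeat=f[7],
        r_begin=int(f[8]),
        r_end=int(f[9]),
        r_left=int(f[10].strip("()")),
    )


ROWS = """
 2373   25.9  9.0  3.7  38256     39464 +  MLT1E1A    119    1371     (0)
  917   27.4  9.2  6.4  39465     39623 +  MLT1E1A      1     172   (388)
 2138   12.0  0.0  0.0  39624     39924 +  AluSx        1     301    (11)
  923   26.0 12.8  3.8  39925     40294 +  MLT1E1A    173     666    (14)
"""

hits = [parse_row(line) for line in ROWS.strip().splitlines()]
for h in hits:
    q_len = h.q_end - h.q_begin + 1
    r_len = h.r_end - h.r_begin + 1
    print(f"{h.repeat:8s} query span {q_len:5d} bp, repeat span {r_len:5d} bp")
```

Comparing the two span lengths per row is a quick sanity check before feeding the coordinates to another aligner.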

MLT1E1A sequence

>MLT1E1A 680 BP; 181 A; 162 C; 189 G; 143 T; 5 other;
TGTGGTAGGCAGAATTCTAAGATGGCCCCCAAGATTCCCGCCCCCTGGTGTACACGCCCT
GTATAATCCCCTCCCCTTGAGTGTGGGCGGGACCTGTGAATATGATGGGATATCACTCCC
GTGATTAGGTTACATTATATGGCAAAGGTGAAGGGATTTTGCAGATGTAATTAAGGTCCC
TAATCAGTTGACTTTGAGTTAATCAAAAGGGAGATTATCCTGGGTGGGCCTGACCTAATC
AGGTGAGCCCTTAAAAGAGGCATGGGCCCTCCAGAGAGAAGAACAGAGAGATTCTCCTGC
TGGCCTTGAAGAAGCAAGCTGCCATGTTGTGAGAGGGCCTATGGAGAGGGCCACGTGGCA
AGGANCTGNGGGCGGCCTCTAGGAGCTGAGAGCGGCCCCCGGCCGACAGCCAGCAAGAAA
ACGGGGACCTCAGTCCTACAGCCGCAAGGAANTGAATTCTGCCAACAACCTGAATGAGCT
TGGAAGNGGACCCTAAGCCTCAGATGAGAACGCAGCCCCGGCCGACACCTTGATTNCAGC
CTTGTGAGACCCTGAGCAGAGGACCCAGCTAAGCCGTGCCCGGACTCCTGACCCACAGAA
ACTGTGAGATAATAAATGTGTGTTGTTTTAAGCCGCTAAGTTTGTGGTAATTTGTTACGC
AGCAATAGAAAACTAATACA

First alignment

Based on the first row of the RepeatMasker output, I took the sequence at positions 38256-39464 and aligned it against MLT1E1A with ssearch.

I expected the alignment to cover the entire extracted sequence (and positions 119-1371 of MLT1E1A).
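The extraction step can be sketched as follows. This is only an illustration, assuming the coordinates are 1-based and inclusive (as RepeatMasker reports them) and that the chromosome is available as FASTA text; the helper names `parse_fasta`, `slice_1based` and `to_fasta`, and the toy record, are invented for the example and are not what was actually run.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Return {record name: sequence} from FASTA-formatted text."""
    records: dict[str, list[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]  # keep only the first word of the header
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def slice_1based(seq: str, begin: int, end: int) -> str:
    """Extract positions begin..end (1-based, inclusive)."""
    if not 1 <= begin <= end <= len(seq):
        raise ValueError(f"range {begin}-{end} is outside 1-{len(seq)}")
    return seq[begin - 1:end]


def to_fasta(name: str, seq: str, width: int = 60) -> str:
    """Format a sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"


# Toy record standing in for the chromosome (hypothetical data).
toy = ">chr_toy demo\nACGTACGTAC\nGGGTTTAAAC\n"
seq = parse_fasta(toy)["chr_toy"]
fragment = slice_1based(seq, 9, 12)
print(to_fasta("chr_toy:9-12", fragment), end="")
```

With the real data the call would be `slice_1based(seq, 38256, 39464)`. An off-by-one here (0-based versus 1-based) shifts every coordinate in the downstream alignment, so it is worth checking that the fragment length equals `end - begin + 1`.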

ssearch gave me:

(the sequence doesn't start at position 1 and MLT1E1A doesn't start at position 119)

        60        70        80        90       100       110
MLT1E1 CCTGTATAATCCCCTCCCCTTGAGTGTGGGCGGGACCTGTGAATATGATGGGATATCACT
                                     :::::: :::::  :::: ::::: :::
chromo TCTACACCTAAACCCGATTTAGATGAGATTCGGGAC-TGTGAGCATGAAGGGATCTCAAG
           1100      1110      1120       1130      1140      1150

       120        130       140       150       160       170
MLT1E1 CCCG-TGATTAGGTTACATTATATGGCAAAGGTGAAGGGATTTTGCAGATGTAATTAAGG
          : ::: :  :::   ::  :::   ::::   :::  : :::  ::
chromo AGGGGTGAATGTGTT---TTGCATGCACAAGGGACAGGAGTCTTGGGGACAGAGGACAGG
            1160         1170      1180      1190      1200

Second alignment

It's better (the sequence starts at position 1, but the alignment still doesn't cover the entire sequence)

               10        20        30        40        50        60
MLT1E1 TGTGGTAGGCAGAATTCTAAGATGGCCCCCAAGATTCCCGCCCCCTGGTGTACACGCCCT
       :::::: :::::: : ::::: :: ::::::  :  :::  :: :::   : ::: ::::
chromo TGTGGT-GGCAGA-TACTAAGGTGACCCCCAC-AACCCCCACCTCTGCCATTCACACCCT
                10         20        30         40        50

                70        80        90       100       110
MLT1E1 -GTATAATCCCCTCCCCTTGAGTGTGGGCGGGACCTGTGAATATGATGGGATATCACTCC
        : :::::::::: : :: :  :::  :: : :::::::
chromo TGAATAATCCCCTTCTCTGGT-TGTAAGCAGAACCTGTGGCTTGCTTATGAAGGAGGCGG
        60        70         80        90       100       110
Tags: repeatmasker, alignment, repeats, ssearch, cross_match
4.2 years ago
tothepoint ▴ 800

Please post the command you used to run RepeatMasker.

./RepeatMasker -par 8 results/chr1.fa

Try this, and reply here if it doesn't work: ./RepeatMasker -pa 8 -spec "species_name" -dir temp_species results/chr1.fa
