Question: Differences Between UCSC And Local RepeatMasker Results
7.6 years ago by AndreiR260 (São Paulo)
AndreiR260 wrote:

Hello,

I'm trying to reproduce some repeat annotations observed in the UCSC Genome Browser. I was careful about the versions of RepeatMasker and RepBase. I understand that UCSC runs RepeatMasker with -s. However, when I retrieve the DNA sequence from a UCSC window and run RepeatMasker on it, I can't find the same events that are shown. I was wondering if the RepeatMasker engine may be an issue. I'm using NCBI/RMBLAST [ 2.2.27+ ].

Thanks in advance

repeatmasker ucsc • 2.1k views
modified 6.8 years ago by Biostar ♦♦ 20 • written 7.6 years ago by AndreiR260

It is very difficult to duplicate RepeatMasker results between runs. Independent runs produce similar results, but rarely are they identical. Are you using the standard RepeatMasker libraries or are you using sequences from RepBase?

modified 7.6 years ago • written 7.6 years ago by neal.platt230

I'm using the latest version of RepeatMasker and the latest version of RepBase. Actually, I've tried using the versions of RM and RepBase stated on GoldenPath to get closer, but without success.

written 7.6 years ago by AndreiR260

Is RM failing to recover the expected repeats, or are they being misidentified? For example, are you expecting a repeat to be an Alu SINE (based on the UCSC annotation) but RM calls it a MIR, or are your RM runs not annotating the region at all?

written 7.6 years ago by neal.platt230

The first: I'm expecting one thing and getting another.

written 7.6 years ago by AndreiR260

Are you supplying the library using the '-lib' option or are you using the '-species' option? Do you mind sharing the annotation you are expecting versus what RM is giving you (just the repeat type)? If your annotation differs at or below the subfamily level, there may not be much you can do about it. It could be that RM is unable to determine whether the repeats in question come from subfamily A or subfamily B.

Have you tried looking at the alignment manually? It may be worth comparing one of your misidentified repeats between the UCSC annotation and your RM run. That may clarify things.

written 7.6 years ago by neal.platt230
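
For anyone wanting to check at which level two calls disagree, as the reply above suggests, here is a minimal Python sketch. It assumes the standard whitespace-separated RepeatMasker .out layout (score, divergence, deletion, insertion, query, begin, end, left, strand, repeat name, class/family, ...). It also assumes the window start is the 1-based coordinate shown in the browser position box. The names RepeatCall, parse_rm_out, disagreement_level and to_genomic are made up for this illustration and are not part of RepeatMasker or any UCSC tool.

```python
from typing import List, NamedTuple


class RepeatCall(NamedTuple):
    start: int       # 1-based start within the query sequence
    end: int         # 1-based inclusive end within the query sequence
    name: str        # repeat name, e.g. "AluSx"
    rep_class: str   # e.g. "SINE"
    family: str      # e.g. "Alu" ("" when no family is given)


def parse_rm_out(text: str) -> List[RepeatCall]:
    """Parse the data rows of a RepeatMasker .out file (assumed layout)."""
    calls = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows begin with the integer alignment score; header
        # lines and blank lines do not, so they are skipped here.
        if len(fields) < 14 or not fields[0].isdigit():
            continue
        rep_class, _, family = fields[10].partition("/")
        calls.append(
            RepeatCall(int(fields[5]), int(fields[6]), fields[9], rep_class, family)
        )
    return calls


def disagreement_level(a: RepeatCall, b: RepeatCall) -> str:
    """Return the coarsest level at which two calls differ."""
    if a.rep_class != b.rep_class:
        return "class"
    if a.family != b.family:
        return "family"
    if a.name != b.name:
        return "subfamily"
    return "none"


def to_genomic(pos: int, window_start: int) -> int:
    """Map a 1-based position in the extracted window to a 1-based
    genomic position, given the 1-based start of the window."""
    return window_start + pos - 1
```

A "subfamily" result corresponds to the case described above where there may not be much to do. A "family" or "class" result (for example Alu versus MIR) is the kind of difference worth inspecting in the alignments. Note that the UCSC rmsk table stores 0-based start coordinates, so subtract 1 from the to_genomic start before comparing against genoStart.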

Thanks for your ideas. I'm supplying the library. In some examples, I found that using -qq versus -s changes the results. I'm working on it to see if that accounts for all my problems. :)

written 7.5 years ago by AndreiR260
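
To test systematically whether the sensitivity setting or the search engine explains the differences, one option is to generate otherwise identical command lines and vary a single option at a time. The sketch below only builds the argument lists and does not run anything. It uses the RepeatMasker options mentioned in this thread (-s, -qq, -lib, -species) plus -q and -engine. The function name build_rm_command, the file names and the "at most one of species or lib" check are assumptions of this sketch, not RepeatMasker rules.

```python
from typing import List, Optional

# RepeatMasker sensitivity switches; None means "pass no switch" (default speed).
SENSITIVITY_FLAGS = {"slow": "-s", "default": None, "quick": "-q", "rush": "-qq"}


def build_rm_command(
    fasta: str,
    sensitivity: str = "slow",
    species: Optional[str] = None,
    lib: Optional[str] = None,
    engine: Optional[str] = None,
) -> List[str]:
    """Build a RepeatMasker argument list without executing it."""
    if species is not None and lib is not None:
        # Design choice of this sketch: keep each run unambiguous about
        # where the repeat library came from.
        raise ValueError("give at most one of species or lib")
    cmd = ["RepeatMasker"]
    flag = SENSITIVITY_FLAGS[sensitivity]
    if flag is not None:
        cmd.append(flag)
    if engine is not None:
        cmd += ["-engine", engine]      # e.g. "rmblast" or "crossmatch"
    if species is not None:
        cmd += ["-species", species]
    elif lib is not None:
        cmd += ["-lib", lib]
    cmd.append(fasta)
    return cmd
```

Each list can be passed to subprocess.run on a machine where RepeatMasker is installed. Comparing the output of the "slow" and "rush" commands on the same window then shows whether sensitivity alone changes the calls.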

Where did you get this information?

written 6.8 years ago by PoGibas4.8k