Programs to estimate divergence of RepeatMasker output
10.0 years ago
muppetleague ▴ 10

Hi all, I've been having trouble getting other people's scripts to work with RepeatMasker output. The bundled RM script 'calcDivergenceFromAlign' gives an empty HTML file for RM's .aln and garbled output for Censor's .aln. REannotate looks very promising but hangs on my unmodified RM .out files for four separate genomes. This seems to be a bit of a frontier, since Google results are so hard to come by. Now that I've used Censor to get a higher resolution of family names, I would ideally use the .aln files, but these seem very different from what a traditional program expects (ClustalW format), unless I'm missing something obvious. Really, I just want to date the insertions relative to the sister species and couple the results with previously estimated speciation events.

REannotate hangs here.

*********************   REannotate
*
*
*     ...resolving structure of LTR elements...

Can't use an undefined value as an ARRAY reference at ./REannotate line 4847.
RepeatMasker REannotate • 3.0k views
9.9 years ago
SES 8.6k

Based on your description, it seems that RepeatMasker did not execute successfully, so any program that uses its output is not going to work properly. You are correct about the alignments, though: they are not in Clustal format. To help troubleshoot the issue, we would need to see your RepeatMasker command and some information about your data.

I have not used REannotate, but I did my dissertation work on TE dynamics and I've published work comparing insertion times/ages. I could definitely help with the general subject, but it is difficult to help with these specific programs without more information. One thing to keep in mind is that RepeatMasker can take a very, very long time to run on a genome, so you may think the program "hangs" when it could still be running.

9.9 years ago
muppetleague ▴ 10

I've gotten the RepeatMasker utility scripts to work nicely now, but I am unclear on the best way to infer a molecular clock from the Kimura 2-parameter divergence estimate. Any hint toward applying this to a more traditional phylogenetic approach (especially in light of the master gene copying hypothesis) would be greatly appreciated.


You need to use PAML with the K2P model and a substitution rate in order to get the ML estimate of divergence. Using the correct substitution rate will be important for determining the insertion ages for your species.
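To make the arithmetic concrete, here is a minimal Python sketch of the closed-form K2P distance and its conversion to an age. It is not the ML estimate you would get from PAML, only the standard analytical formula, d = -1/2 ln((1 - 2P - Q) sqrt(1 - 2Q)), where P and Q are the observed proportions of transitions and transversions. The function names, the example rate of 1e-8 substitutions per site per year, and the `lineages` parameter are all assumptions for illustration: use 2 when comparing two copies that diverged from each other (for example the two LTRs of one element) and 1 when comparing a copy against a consensus, and substitute a rate appropriate for your species.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def transition_transversion_proportions(seq1, seq2):
    """Return (P, Q): proportions of transitions and transversions
    over aligned sites where both sequences have an unambiguous base."""
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        same_class = ({a, b} <= PURINES) or ({a, b} <= PYRIMIDINES)
        if same_class:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return transitions / compared, transversions / compared

def k2p_distance(p, q):
    """Kimura 2-parameter distance from transition (p) and transversion (q) proportions."""
    a = 1.0 - 2.0 * p - q
    b = 1.0 - 2.0 * q
    if a <= 0 or b <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(a * math.sqrt(b))

def insertion_age(distance, rate_per_site_per_year, lineages=2):
    """Age in years, assuming a strict clock (hypothetical helper).
    lineages=2 for copy-vs-copy comparisons, 1 for copy-vs-consensus."""
    return distance / (lineages * rate_per_site_per_year)

if __name__ == "__main__":
    p, q = transition_transversion_proportions("AAGGCTCA", "AGGTCTCA")
    d = k2p_distance(p, q)
    # The rate below is a placeholder, not a recommendation for any species.
    print(d, insertion_age(d, 1e-8, lineages=2))
```

Note that RepeatMasker reports divergence as a percentage, so divide by 100 before passing it to `insertion_age`.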

By the way, it would be better if you made this a comment instead of an answer because it wasn't clear you were asking another question.
