I plan to do a comparative analysis of repeat elements among several rodents. For some rodents (e.g., mouse, rat), the annotation is directly available from the RepeatMasker website (e.g., for mouse, http://www.repeatmasker.org/species/musMus.html). For the other rodents, no such annotation is available, so I have to run RepeatMasker myself. I have some concerns about how to do the comparison and am considering two choices. Can anyone give me some suggestions? Thanks.
Choice 1: To make sure all the annotations are produced under the same conditions (e.g., the same repeat library and the same parameters), I can run the repeat annotation for every species myself and then do the comparison.
Choice 2: For mouse and rat, I use the annotation files from the RepeatMasker website, since those annotations should be standard. For the other rodents, I will try to annotate them as well as possible (for example, by predicting species-specific repeat elements), and then compare.
Which one do you think makes more sense?
By the way, is there a package (or some code) that can help analyze the annotation file (see the example below) and get the percentage of each kind of repeat element in the genome?
more mm10.fa.out

   SW   perc perc perc  query     position in query               matching     repeat          position in repeat
score   div. del. ins.  sequence  begin    end     (left)         repeat       class/family    begin   end   (left)  ID

14737    8.1  1.0  0.2  chr1      3000001  3000097 (192471874)  C L1MdFanc_I   LINE/L1         (2987)  3586   3489    1
   27    0.0  0.0  0.0  chr1      3000098  3000123 (192471848)  + (T)n         Simple_repeat        1    26    (0)    2
14737    8.1  1.0  0.2  chr1      3000124  3002128 (192469843)  C L1MdFanc_I   LINE/L1         (3085)  3488   1467    1
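
To make the percentage question concrete, here is a minimal Python sketch of the kind of summary I have in mind. It is only an illustration under several assumptions: the helper names parse_out_line and repeat_percentages are hypothetical, the column positions are taken from the example above (query sequence in column 5, begin/end in columns 6 and 7, "(left)" in column 8, class/family in column 11), and when no genome size is given it is estimated as end + left for each sequence that appears in the file, so sequences without any hit are not counted. Overlapping annotations would be counted more than once.

```python
from collections import defaultdict


def parse_out_line(line):
    """Return (sequence, begin, end, left, class_family), or None for header/blank lines."""
    fields = line.split()
    # Assumption: data lines have at least 15 columns and start with an integer SW score.
    if len(fields) < 15 or not fields[0].isdigit():
        return None
    seq = fields[4]
    begin, end = int(fields[5]), int(fields[6])
    left = int(fields[7].strip("()"))  # bases remaining after 'end', written as "(N)"
    return seq, begin, end, left, fields[10]


def repeat_percentages(lines, genome_size=None):
    """Percentage of the genome covered by each repeat class/family."""
    bases = defaultdict(int)  # class/family -> annotated bases
    seq_len = {}              # sequence name -> length inferred from end + left
    for line in lines:
        rec = parse_out_line(line)
        if rec is None:
            continue
        seq, begin, end, left, cls = rec
        bases[cls] += end - begin + 1
        seq_len[seq] = end + left
    # Assumption: if no genome size is supplied, use the sum of inferred sequence lengths.
    total = genome_size if genome_size is not None else sum(seq_len.values())
    return {cls: 100.0 * n / total for cls, n in bases.items()}


# The three data lines from the example above, plus one header line that is skipped.
example = [
    "score div. del. ins. sequence begin end (left) repeat class/family begin end (left) ID",
    "14737 8.1 1.0 0.2 chr1 3000001 3000097 (192471874) C L1MdFanc_I LINE/L1 (2987) 3586 3489 1",
    "27 0.0 0.0 0.0 chr1 3000098 3000123 (192471848) + (T)n Simple_repeat 1 26 (0) 2",
    "14737 8.1 1.0 0.2 chr1 3000124 3002128 (192469843) C L1MdFanc_I LINE/L1 (3085) 3488 1467 1",
]
for cls, pct in sorted(repeat_percentages(example).items()):
    print(f"{cls}\t{pct:.6f}%")
```

For a real file, the list could be replaced by an open file handle, e.g. passing open("mm10.fa.out") to repeat_percentages, since the function only iterates over lines.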