Hi all,
I ran RepeatMasker on a FASTA file containing roughly 65,000 scaffolds. I want to parse the output file to do the following:
1.) Report the total length of a given repeat type across the entire FASTA.
2.) Report the length, percentage, and number of occurrences of a given repeat type in each individual scaffold.
I was planning on creating some custom scripts. But I feel like there has to be some type of package or tool out there that can do something like this already. Any help would be appreciated!
BTW, I'm already familiar with https://github.com/4ureliek/Parsing-RepeatMasker-Outputs which seems useful for TEs found through RepeatMasker, but it's not exactly what I need.
Maybe you can check One Code to Find Them All: http://doua.prabi.fr/software/one-code-to-find-them-all
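
If no existing tool fits, a short custom script may be enough for both tasks. Below is a minimal sketch, not a tested pipeline. It assumes the standard whitespace-delimited RepeatMasker `.out` layout (query name in column 5, begin/end in columns 6 and 7, "(left)" in column 8, class/family in column 11). The function names `parse_out`, `covered_bp`, and `summarize` are made up for illustration. Scaffold length is estimated as end + left from the annotation lines themselves, overlapping hits are merged so bases are not counted twice, and the count is simply the number of annotation lines, so a repeat that RepeatMasker split into fragments is counted once per fragment.

```python
import sys
from collections import defaultdict


def parse_out(lines):
    """Yield (scaffold, start, end, scaffold_len, repeat_class) per annotation line."""
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines (their first field is not a number).
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        start, end = int(fields[5]), int(fields[6])
        left = int(fields[7].strip("()"))  # bases in the query after the match
        yield fields[4], start, end, end + left, fields[10]


def covered_bp(intervals):
    """Return the number of bases covered by 1-based inclusive intervals, merging overlaps."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start + 1
    return total


def summarize(lines, repeat_class):
    """Return ({scaffold: (count, bp, percent)}, total_bp) for one class/family."""
    intervals = defaultdict(list)
    lengths = {}
    for scaffold, start, end, scaffold_len, cls in parse_out(lines):
        lengths[scaffold] = scaffold_len
        if cls == repeat_class:
            intervals[scaffold].append((start, end))
    per_scaffold = {}
    for scaffold, ivs in intervals.items():
        bp = covered_bp(ivs)
        per_scaffold[scaffold] = (len(ivs), bp, 100.0 * bp / lengths[scaffold])
    total_bp = sum(bp for _, bp, _ in per_scaffold.values())
    return per_scaffold, total_bp


# Hypothetical usage: python summarize_repeats.py genome.fa.out LINE/L1
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as handle:
        per_scaffold, total_bp = summarize(handle, sys.argv[2])
    print(f"total\t{total_bp}")
    for scaffold, (count, bp, pct) in sorted(per_scaffold.items()):
        print(f"{scaffold}\t{count}\t{bp}\t{pct:.2f}")
```

Scaffolds with no hit of the requested class are left out of the result. To match on the repeat name instead of the class/family, compare against `fields[9]` rather than `fields[10]`.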