Question: What is the best tool to remove redundancy from annotated genome files?
12 days ago by
University of applied science, Leiden, Netherlands
Monika10 wrote:


I am going to work with an annotated wasp genome, but there is a lot of redundancy in the annotation, meaning there are many different hits for most of the predicted coding regions. The genome was annotated with Maker2. I would like to filter the annotation and keep only the best-quality matches. Could you suggest the best tool for this? Note that this genome has been assembled for the first time; there is no reference genome and no other research on it.

Thank you in advance :)

modified 10 days ago • written 12 days ago by Monika10

Does this mean that you did not remove sequence redundancy from the contigs you had assembled before doing the annotation?

modified 12 days ago • written 12 days ago by GenoMax42k

I did not do the assembly myself. I have a finished gff3 file to work with, but as far as I know, all duplicates were filtered out and the assembly was validated before the annotation.

written 12 days ago by Monika10

You mean redundancy in the functional annotation part? Would it be possible to post a small extract of the gff3 file (indicating the redundancy)?

written 11 days ago by lieven.sterck600

Yes, this is redundancy in the functional annotation. I am sorry for not being clear. Here is a screenshot of one CDS position in the gff3 file:

modified 10 days ago • written 11 days ago by Monika10
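
For illustration only, here is a minimal Python sketch of one way the filtering asked about in this thread could be approached. It is not a recommendation from any of the posters, and it rests on assumptions that the thread does not confirm: that the redundant hits are separate gff3 lines sharing the same sequence ID, feature type, start, end and strand, and that the score column (column 6) is a fair measure of match quality. The function names `parse_score` and `keep_best_per_locus` and the sample lines are made up for the example. Hits that only overlap, rather than share identical coordinates, would need a different approach.

```python
def parse_score(field):
    """Convert the gff3 score column to a float.

    gff3 uses "." for a missing score; here that is treated as lower
    than any real score (an assumption of this sketch).
    """
    try:
        return float(field)
    except ValueError:
        return float("-inf")


def keep_best_per_locus(lines):
    """Keep the highest-scoring feature line for each locus.

    A "locus" is assumed to be the tuple
    (seqid, feature type, start, end, strand).
    Comment/directive lines and malformed lines are skipped.
    """
    best = {}  # locus key -> (score, line); dicts keep insertion order
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        seqid, _source, ftype, start, end, score, strand = cols[:7]
        key = (seqid, ftype, start, end, strand)
        value = parse_score(score)
        if key not in best or value > best[key][0]:
            best[key] = (value, line)
    return [line for _, line in best.values()]


# Hypothetical sample data, not taken from the poster's file.
example = [
    "##gff-version 3",
    "scaffold1\tblastx\tmatch\t100\t400\t55.0\t+\t.\tID=hit1;Name=protA",
    "scaffold1\tblastx\tmatch\t100\t400\t92.5\t+\t.\tID=hit2;Name=protB",
    "scaffold1\tblastx\tmatch\t900\t1200\t.\t-\t.\tID=hit3;Name=protC",
]

filtered = keep_best_per_locus(example)
for kept in filtered:
    print(kept)
```

With the sample above, the two hits at positions 100 to 400 collapse to the one with the higher score, and the single hit at 900 to 1200 is kept even though it has no score. Note that the sketch drops the `##` directive lines, so a header would have to be written back when saving the result.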