Question: Merge two genome annotation files: one annotated by contig, the other annotated by scaffold.
Hi @ll!

After doing what was recommended in my previous post ("Match predicted sequences to reference genome to generate data for annotation GTF"), I ended up with two annotation files. EVM produced an annotation at the scaffold level ("scaffold1234"), while BLAT produced one at the contig level ("RDRX12302"). How do I merge the information in these two GTF files? I know which contig is part of which scaffold (the reference genome lists them as ">RDRX12302 isolate A scaffold1234, whole genome shotgun sequence"), but I imagine that the positions reported by BLAT are relative to each contig and cannot simply be copied over to the respective scaffold names.

I would be grateful if you could point me in the right direction.

Thank you! Joe

As far as I can see, this will be a two-step procedure.

Step 1: 'move' the annotations from the contig level to the scaffolds. Tools that come to mind here are, for instance, liftOver (from the ALLMAPS package) or Liftoff (recently published by the Salzberg lab).

Step 2: integrate/merge all annotations at the scaffold level. I am not 100% sure, but I think there is likely a tool in the AGAT suite that can do this.
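
To illustrate what the two steps amount to, here is a minimal Python sketch. It is not a replacement for the tools mentioned above. It assumes you already have a table giving each contig's position on its scaffold (for example from an AGP file; the FASTA headers alone only tell you which scaffold a contig belongs to, not where it sits). The names `Placement`, `lift_gtf_line` and `merge_gtf`, and the offset and length in the example, are made up for illustration.

```python
from typing import Dict, Iterable, List, NamedTuple


class Placement(NamedTuple):
    """Where a contig sits on its scaffold (hypothetical helper type)."""
    scaffold: str  # scaffold the contig belongs to
    offset: int    # 1-based scaffold position of the contig's leftmost base
    length: int    # contig length in bases
    strand: str    # "+" if the contig is placed forward, "-" if reverse-complemented


def lift_gtf_line(line: str, placements: Dict[str, Placement]) -> str:
    """Rewrite one GTF feature line from contig to scaffold coordinates."""
    fields = line.rstrip("\n").split("\t")
    p = placements[fields[0]]
    start, end = int(fields[3]), int(fields[4])  # GTF is 1-based, inclusive
    if p.strand == "+":
        new_start = p.offset + start - 1
        new_end = p.offset + end - 1
        new_strand = fields[6]
    else:
        # Reversed contig: coordinates are mirrored and the feature strand flips.
        new_start = p.offset + p.length - end
        new_end = p.offset + p.length - start
        new_strand = {"+": "-", "-": "+"}.get(fields[6], fields[6])
    fields[0] = p.scaffold
    fields[3], fields[4] = str(new_start), str(new_end)
    fields[6] = new_strand
    return "\t".join(fields)


def merge_gtf(*sources: Iterable[str]) -> List[str]:
    """Concatenate feature lines and sort them by sequence name and start.

    This only interleaves the records; it does not reconcile overlapping
    gene models, which is what a dedicated merging tool is for.
    """
    records = [
        line.rstrip("\n")
        for source in sources
        for line in source
        if line.strip() and not line.startswith("#")
    ]
    return sorted(records, key=lambda r: (r.split("\t")[0], int(r.split("\t")[3])))


# Example with made-up numbers: the contig starts at scaffold position 5001.
placements = {"RDRX12302": Placement("scaffold1234", 5001, 1000, "+")}
blat_line = 'RDRX12302\tBLAT\texon\t100\t200\t.\t+\t.\tgene_id "g1";'
evm_line = 'scaffold1234\tEVM\texon\t300\t400\t.\t+\t.\tgene_id "g2";'

lifted = [lift_gtf_line(blat_line, placements)]
for record in merge_gtf([evm_line], lifted):
    print(record)
```

The point of the sketch is that the contig positions need an offset (and a strand flip for reverse-oriented contigs) before both annotations share one coordinate system. Only after that does a merge at the scaffold level make sense.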

