I was considering using the GFF file produced by Arrow to transfer annotations between the pre- and post-polishing assemblies.
However, I noticed that this file does not contain a full list of changes made by polishing. This means the strategy above can have problems.
For example, here is an Arrow polishing GFF file:
[seq_name] . insertion 34602 34602 . . . reference=.;variantSeq=GGGGGGTATCT;coverage=100;confidence=93
[seq_name] . insertion 34753 34753 . . . reference=.;variantSeq=C;coverage=100;confidence=93
And here are the differences I found by aligning with BWA-MEM and creating a .pileup file (I removed some columns and the unchanged rows):
prev_assembly  pos     pref_ref  new_ref
[seq_name]     15648   T         .+1C
[seq_name]     28683   T         .+1G
[seq_name]     31652   A         .-2CC
[seq_name]     32992   C         .-1A
[seq_name]     34602   A         .+11GGGGGGTATCT
[seq_name]     34753   G         .+1C
[seq_name]     101768  A         .+1C
The two changes listed in the GFF file were in fact made (at positions 34602 and 34753 in the .pileup file). However, polishing also made five additional changes that the GFF file does not list (at positions 15648, 28683, 31652, 32992, and 101768).
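To make the discrepancy concrete, here is a minimal Python sketch that compares the two lists above and shows how the missing changes would affect a coordinate shift. The positions and sequences are copied from the excerpts in this post; the helper names (`length_delta`, `net_shift`) and the feature coordinate 40000 are hypothetical, chosen only for illustration, and the parser assumes each `new_ref` entry is a single pileup-style indel token.

```python
import re

# Insertions reported in the Arrow GFF excerpt: (position, variantSeq).
gff_insertions = [(34602, "GGGGGGTATCT"), (34753, "C")]

# Differences from the pileup excerpt: (position, new_ref column).
pileup_changes = [
    (15648, ".+1C"),
    (28683, ".+1G"),
    (31652, ".-2CC"),
    (32992, ".-1A"),
    (34602, ".+11GGGGGGTATCT"),
    (34753, ".+1C"),
    (101768, ".+1C"),
]

# One pileup indel token: match symbol, sign, length, then the indel bases.
INDEL = re.compile(r"^[.,]([+-])(\d+)([ACGTNacgtn]+)$")


def length_delta(token):
    """Return the length change of one indel token (+n insertion, -n deletion)."""
    match = INDEL.match(token)
    if match is None:
        raise ValueError(f"not a single indel token: {token!r}")
    sign, count, seq = match.groups()
    if int(count) != len(seq):
        raise ValueError(f"declared length does not match sequence: {token!r}")
    return int(count) if sign == "+" else -int(count)


def net_shift(deltas, coord):
    """Sum the length changes at positions before coord (old coordinates)."""
    return sum(delta for pos, delta in deltas if pos < coord)


gff_deltas = [(pos, len(seq)) for pos, seq in gff_insertions]
pileup_deltas = [(pos, length_delta(tok)) for pos, tok in pileup_changes]

# Positions changed according to the pileup but absent from the GFF.
gff_positions = {pos for pos, _ in gff_deltas}
missing = sorted(pos for pos, _ in pileup_deltas if pos not in gff_positions)

feature_start = 40000  # hypothetical annotation coordinate, not from my data
print("missing from GFF:", missing)
print("shift from GFF only:", net_shift(gff_deltas, feature_start))
print("shift from pileup:", net_shift(pileup_deltas, feature_start))
```

With the numbers above, the GFF alone implies a shift of +12 at that coordinate while the pileup implies +11, so a feature transferred using only the GFF would land 1 bp off.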
Is this a known behavior? Am I correct that there is no way to make Arrow list all of its polishing changes in the GFF file?
I am using Arrow version 2.3.3, as loaded by pbbioconda.