Question: Merging 2 GFF files
10 weeks ago by
bwczech60 wrote:


I have combined two GFF files for my annotation. One GFF covers chromosomes 1-29 plus X, and the second covers the Y chromosome. I did the same with the genome (I combined two genome files for my purposes).

The problem is with the ID and Parent fields in the GFF, because the IDs overlap between the two files. For example, a feature on the Y chromosome has the same ID as a feature on chromosome 29:

Y   Gnomon  gene    2502499 2571410 .   +   .   ID=gene32386;Dbxref=GeneID:100849399;Name=LOC100849399;gbkey=Gene;gene=LOC100849399;gene_biotype=protein_coding

and the same ID (gene32386) here:

29  Gnomon  pseudogene  37912602    37922321    .   -   .   ID=gene32386;Dbxref=GeneID:615840;Name=LOC615840;gbkey=Gene;gene=LOC615840;gene_biotype=pseudogene;pseudo=true

How can I fix this problem? Because of it, I cannot annotate my Y chromosome. Should I modify the ID and Parent fields in my GFF, or is there another way?

written 10 weeks ago by bwczech60

Should I modify ID

That would work, I suppose.

written 10 weeks ago by WouterDeCoster36k

But won't modifying the IDs have implications for the results of the snpEff annotation? How should I modify the ID? Should I change the number, or can I just add an extra character?

written 10 weeks ago by bwczech60

Adding an extra character is enough and probably the easiest. If you have already done downstream analysis with snpEff, I don't know what it might imply. If it was using the IDs, I guess it would have complained about encountering duplicate IDs.
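
A minimal sketch of the "add an extra character" approach, assuming the Y chromosome GFF is tab-delimited GFF3 and is still available as a separate file before merging. The function names, the file names in the usage comment, and the "Y_" prefix are all hypothetical choices for illustration. The same prefix is applied to both ID and Parent so that child features (mRNA, exon, CDS) still point at the renamed parents; other attributes such as Dbxref and Name are left untouched.

```python
def prefix_ids(line, prefix="Y_"):
    """Return a GFF3 line with every ID and Parent value prefixed.

    Comment lines, blank lines, and lines without nine tab-separated
    columns are returned unchanged.
    """
    if line.startswith("#") or not line.strip():
        return line
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        return line
    attrs = []
    for attr in cols[8].split(";"):
        key, sep, value = attr.partition("=")
        if sep and key in ("ID", "Parent"):
            # Parent may hold several comma-separated IDs.
            value = ",".join(prefix + v for v in value.split(","))
        attrs.append(key + sep + value)
    cols[8] = ";".join(attrs)
    return "\t".join(cols) + "\n"


def rewrite_gff(in_path, out_path, prefix="Y_"):
    """Write a copy of in_path with prefixed ID/Parent values to out_path."""
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            dst.write(prefix_ids(line, prefix))


# Hypothetical usage, run on the Y-only file before concatenating:
# rewrite_gff("chrY.gff", "chrY.prefixed.gff", prefix="Y_")
```

After renaming, the two GFF files can be concatenated again. Any snpEff database built from the earlier merged file would presumably need to be rebuilt from the new one so that the annotation uses the renamed IDs.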

written 10 weeks ago by Juke-341.8k
