Maker Gff3 file issues
3.5 years ago
alslonik ▴ 220

Hi community,

This is really a technical question, I hope it is OK to post it here...

I am trying to import the GFF3 file from MAKER into JBrowse to view the annotations. I am using the maker2jbrowse script and keep getting errors. Nothing indicates that MAKER produced a problematic file; the logs show no errors. Still, I am getting this output:

GFF3 parse error: some features reference other features that do not exist in the file (or in the same '###' scope).

Head of my gff3 file:

##gff-version 3
Chr6    .   contig  1   41368575    .   .   .   ID=Chr6;Name=Chr6
Chr6    maker   gene    9418414 9419484 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9;Name=maker-Chr6-exonerate_protein2genome-gene-94.9
Chr6    maker   mRNA    9418414 9419484 594 -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9;Name=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1;_AED=0.31;_eAED=0.43;_QI=0|0|0|1|0|0|2|0|197
Chr6    maker   exon    9418414 9418727 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:exon:382;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   exon    9419205 9419484 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:exon:381;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   CDS 9419205 9419484 .   -   0   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:cds;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   CDS 9418414 9418727 .   -   2   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:cds;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   gene    9469345 9471102 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.15;Name=maker-Chr6-exonerate_protein2genome-gene-94.15
Chr6    maker   mRNA    9469345 9471102 588 -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.15-mRNA-1;Parent=maker-Chr6-exonerate_protein2genome-gene-94.15;Name=maker-Chr6-exonerate_protein


The file is 2.1 Gb. How do I check the validity of the file and, more importantly, how do I fix it if it is not valid?

THANKS

genome browser genome annotation JBrowse • 3.6k views

Hello alslonik ,

I don't know MAKER and have never worked with JBrowse, but the error message is quite clear to me. In a GFF3 file, the value given in Parent= must match the ID= value of another entry in the file. This is not always the case in your file.
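A minimal sketch of such a check in Python, in case it helps to locate the offending entries. It assumes a tab-separated GFF3 file and checks Parent values against all IDs in the whole file, not per '###' scope as the parser in the error message does. The function name, the `record` helper and the demo records are made up for illustration.

```python
def find_missing_parents(lines):
    """Return the sorted Parent values that never occur as an ID."""
    ids = set()
    parents = set()
    for line in lines:
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # an embedded sequence section ends the annotations
        if not text or text.startswith("#"):
            continue  # blank lines, pragmas and comments
        columns = text.split("\t")
        if len(columns) != 9:
            continue  # malformed record, outside the scope of this check
        for attribute in columns[8].split(";"):
            key, sep, value = attribute.strip().partition("=")
            if not sep:
                continue
            if key == "ID":
                ids.add(value)
            elif key == "Parent":
                # A feature may list several parents, separated by commas.
                parents.update(value.split(","))
    return sorted(parents - ids)


def record(feature_type, attributes):
    """Build one tab-separated demo record (made-up coordinates)."""
    return "\t".join(
        ["Chr6", "maker", feature_type, "1", "100", ".", "-", ".", attributes]
    )


demo = [
    "##gff-version 3",
    record("gene", "ID=gene1"),
    record("mRNA", "ID=mrna1;Parent=gene1"),
    record("exon", "ID=exon1;Parent=mrna2"),  # mrna2 is never defined
]
print(find_missing_parents(demo))  # ['mrna2']
```

To run it on a real file, pass an open file handle instead of the demo list. All IDs are kept in memory, so a 2.1 Gb file may need a fair amount of RAM.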

fin swimmer


Thanks, finswimmer, I understand what you mean. The question is how do I deal with this? Are there any ways to fix this in a gff3 file? Also, maybe it is a matter of sorting the file correctly? I have never worked with gff3 before, hence the questions...


Is your gff3 file the output of gff3_merge without filtering? It appears that you have filtered it to keep only the lines whose source (i.e., 2nd column) is maker. Perhaps you need to redo gff3_merge and feed its unfiltered output into JBrowse, as some features of the gff3 file seem to be missing.


Not sure that I understand... Yes, I did:

gff3_merge -d logfile

I did not do any filtering while merging.


Hi all, I would really like to know how to deal with the error "GFF3 parse error: some features reference other features that do not exist in the file (or in the same '###' scope)". I am also stuck on the same problem. Any solution? Thanks.


Hi. As people answered me below, you have to find the problem in the file, either with the script kindly provided below or manually. I ended up opening the file with R tools and correcting it semi-manually.

3.5 years ago
Juke34 ★ 6.9k

I often ran into this kind of problem using MAKER on our cluster (LSF + openMPI). I never managed to find where the problem comes from. I end up with some parent features missing or with duplicated features. I have developed a library that standardises any kind of GTF/GFF, fixes all kinds of problems and produces a full GFF3 output.

Clone this repository and install it: https://github.com/NBISweden/GAAS

Then just run:

gxf_to_gff3.pl --gff input.gff -o output.gff

You can also add the option -v 1 to see which problems are corrected.

It is also available within the AGAT toolkit:

agat_convert_sp_gxf2gxf.pl --gff input.gff -o output.gff


Have a look at the check steps to see which problems have been corrected.
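For a quick look at the duplicated features mentioned above before running a full fix, a rough Python sketch is given here. It assumes that "duplicated" means two byte-identical records, which is only one possible form of duplication and is not necessarily what the tools above check. The function name and the demo records are hypothetical.

```python
import hashlib


def find_duplicate_records(lines):
    """Return (line_number, first_seen_line_number) for repeated records."""
    first_seen = {}
    duplicates = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # stop at an embedded sequence section
        if not text or text.startswith("#"):
            continue  # blank lines, pragmas and comments
        # Store a short digest rather than the full line to save memory.
        digest = hashlib.blake2b(text.encode("utf-8"), digest_size=16).digest()
        if digest in first_seen:
            duplicates.append((number, first_seen[digest]))
        else:
            first_seen[digest] = number
    return duplicates


# Made-up records: line 4 repeats line 2 exactly.
gene = "\t".join(["Chr6", "maker", "gene", "1", "100", ".", "-", ".", "ID=gene1"])
mrna = "\t".join(
    ["Chr6", "maker", "mRNA", "1", "100", ".", "-", ".", "ID=mrna1;Parent=gene1"]
)
demo_records = ["##gff-version 3", gene, mrna, gene]
print(find_duplicate_records(demo_records))  # [(4, 2)]
```

Multi-line features such as the CDS records in the question share an ID but differ in coordinates, so they are not reported by this exact-match comparison.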


WOW. Thanks, Juke34, I am going to try it. At least I am not the only one!!! And I ran MAKER with openMPI too... Thank you very much!


Hi, an update. Your script throws a list of awful errors and produces no output:

error:
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: OBO File Format Error -
Cannot find tag format-version and/ default-namespace . These are required header.

STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:449
STACK: Bio::OntologyIO::obo::parse /usr/share/perl5/Bio/OntologyIO/obo.pm:208
STACK: BILS::Handler::GXFhandler::try {...}  /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:2892
STACK: Try::Tiny::try /usr/share/perl5/Try/Tiny.pm:92
STACK: BILS::Handler::GXFhandler::_handle_ontology /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:2897
STACK: BILS::Handler::GXFhandler::slurp_gff3_file_JD /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:102
STACK: gxf_to_gff3.pl:76
-----------------------------------------------------------

Let's continue without feature-ontology information.
No data retrieved among the feature-ontology.
=>GFF version parser used: 3

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: [  -   .   ID=Chr7:hsp:23258:3.10.8.137;Parent=Chr7:hit:14758:3.10.8.137;Target=gb|OWM91437.1| 350 575;Gap=M85 I4 M44 I4 M62 I4 M5 D7 M18] does not look like GFF3 to me
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:449
STACK: Bio::Tools::GFF::_from_gff3_string /usr/share/perl5/Bio/Tools/GFF.pm:616
STACK: Bio::Tools::GFF::from_gff_string /usr/share/perl5/Bio/Tools/GFF.pm:437
STACK: Bio::Tools::GFF::next_feature /usr/share/perl5/Bio/Tools/GFF.pm:395
STACK: BILS::Handler::GXFhandler::slurp_gff3_file_JD /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:201
STACK: gxf_to_gff3.pl:76


Does this mean that the file is corrupted? What do you think it means about my Maker run? Thanks again...


A line in your file does not have 9 columns (i.e.: - . ID=Chr7:hsp:23258:3.10.8.137;Parent=Chr7:hit:14758:3.10.8.137;Target=gb|OWM91437.1| 350 575;Gap=M85 I4 M44 I4 M62 I4 M5 D7 M18). You have to fix this line manually. The beginning of the line is probably at the end of the previous line. I have seen this before; it is rare, but occasionally one line is split and written over two lines.
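To locate such lines in a large file without scrolling through it, something like the following Python sketch could be used. It assumes a tab-separated file and only reports the line numbers; it does not attempt to rejoin the fragments. The function name and the demo records are made up for illustration.

```python
def find_bad_lines(lines):
    """Return (line_number, column_count) for records without 9 columns."""
    bad = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip("\n")
        if text.startswith("##FASTA"):
            break  # sequence lines after this pragma are not GFF3 records
        if not text or text.startswith("#"):
            continue  # blank lines, pragmas and comments
        count = len(text.split("\t"))
        if count != 9:
            bad.append((number, count))
    return bad


# Made-up example: one valid record, then one record split over two lines.
demo_lines = [
    "##gff-version 3",
    "\t".join(["Chr7", "maker", "gene", "1", "100", ".", "-", ".", "ID=gene1"]),
    "\t".join(["Chr7", "blastx", "match_part", "100", "200", "50"]),
    "\t".join(["-", ".", "ID=hsp1;Parent=hit1"]),
]
print(find_bad_lines(demo_lines))  # [(3, 6), (4, 3)]
```

Two consecutive line numbers in the report, as in the demo, would be consistent with one record written over two lines.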