Question: gff3 header delimiter space or a tab
3 months ago by
microfuge1.5k
microfuge1.5k wrote:

Dear All,

I could not find a source that states which field delimiter should be used in the gff3 header. Can it be either a space or a tab, or should it be a space only? My hunch is a space.

##gff-version 3 
##sequence-region 1 10

Many Thanks!

gff3 • 158 views
modified 3 months ago by Carambakaracho2.0k • written 3 months ago by microfuge1.5k

I don't think it even matters.

You could open it in vi and then do :set list to show all 'invisible' chars (^I is tab).
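A non-interactive way to make the same check as :set list, as a minimal Python sketch. The function name directive_delimiters and the sample lines are made up for illustration (the second sample line deliberately contains a tab); nothing here comes from an existing GFF3 tool.

```python
def directive_delimiters(lines):
    """Return (directive_name, delimiter_kind) for every line starting with '##'.

    delimiter_kind describes the whitespace right after the directive name:
    'space', 'tab', 'mixed', or 'none'.
    """
    results = []
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("##"):
            continue  # skip comments and feature lines
        # The directive name is the leading run of non-whitespace characters.
        name_end = 0
        while name_end < len(line) and not line[name_end].isspace():
            name_end += 1
        # The delimiter is the whitespace run that follows the name.
        gap_end = name_end
        while gap_end < len(line) and line[gap_end].isspace():
            gap_end += 1
        gap = line[name_end:gap_end]
        if not gap:
            kind = "none"
        elif set(gap) == {" "}:
            kind = "space"
        elif set(gap) == {"\t"}:
            kind = "tab"
        else:
            kind = "mixed"
        results.append((line[:name_end], kind))
    return results


# Hypothetical sample: first directive uses a space, second a tab.
sample = [
    "##gff-version 3",
    "##sequence-region\t1 10",
    "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1",
]
for name, kind in directive_delimiters(sample):
    print(name, kind)
```

To run it on a real file, pass an open file handle instead of the sample list.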

written 3 months ago by lieven.sterck7.0k

Thanks so much! This was a fake gff entry I created; I just wanted to know whether the official specification says something about it. I did not know about the :set list option in vi (quite nice :) ).

written 3 months ago by microfuge1.5k

Link to "official" (best I've found so far) specifications: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md

modified 3 months ago • written 3 months ago by massa.kassa.sc3na230
3 months ago by
Carambakaracho2.0k
Germany/Cologne
Carambakaracho2.0k wrote:

This is a great question, and one reason why I 'love' the gff format so much. It is not explicitly defined. Period. See the definition of directives in the gff3 format: the documentation implicitly uses spaces, just as ATpoint illustrated, so I recommend spaces, too. Tabs are usually used to separate the fields of the feature lines.
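Because the directive delimiter is not explicitly defined, a reader can be lenient on directives and strict on feature lines. The following is a minimal sketch under that assumption; parse_gff3_line is a hypothetical helper, not a function from any library.

```python
def parse_gff3_line(line):
    """Classify one GFF3 line and split it into fields.

    Returns (kind, fields), where kind is 'blank', 'directive',
    'comment', or 'feature'.
    """
    line = line.rstrip("\r\n")
    if not line.strip():
        return ("blank", [])
    if line.startswith("##"):
        # str.split() without arguments splits on any run of spaces or
        # tabs, so either delimiter in a directive is accepted.
        return ("directive", line[2:].split())
    if line.startswith("#"):
        return ("comment", [line[1:].strip()])
    # Feature lines have nine tab-separated columns; a column may itself
    # contain spaces, so split on tabs only.
    return ("feature", line.split("\t"))


print(parse_gff3_line("##sequence-region 1 10"))
print(parse_gff3_line("##sequence-region\t1\t10"))
```

Both calls yield the same fields, so downstream code does not need to care which delimiter the file's author picked.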

written 3 months ago by Carambakaracho2.0k

+1 for the space.
As you can see in the snapshots of the different versions of the format that I put in my review of the format here: https://github.com/NBISweden/GAAS/blob/master/annotation/CheatSheet/gxf.md, they have always used a space.
Let's ask them to clarify it in the repo of the gff3 specification. ✅ => https://github.com/The-Sequence-Ontology/Specifications/issues/23

modified 3 months ago • written 3 months ago by Juke-343.4k
3 months ago by
shoujun.gu280
shoujun.gu280 wrote:

The gff3 from GENCODE is tab-delimited.

Edit: sorry, I didn't notice the post is talking about the header... Then it is just regular sentences, I think.

modified 3 months ago • written 3 months ago by shoujun.gu280

No, it isn't; it is a space, at least in the mouse (vM20) files I have on my machine.

gzcat gencode.vM20.annotation.gff3.gz | head
##gff-version 3
#description: evidence-based annotation of the mouse genome (GRCm38), version M20 (Ensembl 95)
#provider: GENCODE
#contact: gencode-help@ebi.ac.uk
#format: gff3
#date: 2018-11-30
##sequence-region chr1 1 195471971
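
The same header check can be done without gzcat (on many Linux systems the equivalent command is zcat). A minimal Python sketch follows; the helper name leading_hash_lines is an assumption for illustration, and the file name in the usage comment is simply the one from the command above.

```python
import gzip


def leading_hash_lines(path):
    """Return the '#'-prefixed lines at the top of a gzip-compressed file."""
    header = []
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if not line.startswith("#"):
                break  # first non-header line reached
            header.append(line.rstrip("\n"))
    return header


# Usage, assuming the file is in the current directory:
# for line in leading_hash_lines("gencode.vM20.annotation.gff3.gz"):
#     print(repr(line))  # repr() shows a tab as \t
```

Printing with repr() makes any tab visible, so a space-delimited directive shows up as plain text while a tab-delimited one shows \t.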
modified 3 months ago • written 3 months ago by ATpoint30k

Yes, I just realized the post deals with the header only.

modified 3 months ago • written 3 months ago by shoujun.gu280

I guess the header lines are simply more or less non-standardized, but for the actual records in the file, yes, it is tab, like in most bioinformatics formats.

written 3 months ago by ATpoint30k