Question: gff3 header delimiter space or a tab
12 months ago by
microfuge1.9k
microfuge1.9k wrote:

Dear All,

I could not find a source that states which field delimiter should be used in the gff3 header. Can it be either a space or a tab, or must it be a space? My hunch is a space.

##gff-version 3 
##sequence-region 1 10

Many Thanks!

gff3 • 292 views
ADD COMMENTlink modified 12 months ago by Carambakaracho2.2k • written 12 months ago by microfuge1.9k

I don't think it even matters.

You could open it in vi and then run :set list to show all 'invisible' characters (^I is a tab).
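For a scripted version of the same check, here is a minimal Python sketch that reports which whitespace character follows each directive name. The function name `directive_delimiters` and the sample lines are made up for illustration; they do not come from any GFF3 library.

```python
def directive_delimiters(lines):
    """Map each '##' directive name to the whitespace character that follows it."""
    names = {" ": "space", "\t": "tab"}
    found = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("##"):
            continue  # skip comments and feature lines
        body = line[2:]
        # Record the first space or tab after the directive name.
        for i, ch in enumerate(body):
            if ch in names:
                found[body[:i]] = names[ch]
                break
    return found


# Hypothetical input: one directive with a space, one with a tab.
example = [
    "##gff-version 3\n",
    "##sequence-region\t1 10\n",
    "chr1\t.\tgene\t1\t10\t.\t+\t.\tID=g1\n",
]
print(directive_delimiters(example))
# {'gff-version': 'space', 'sequence-region': 'tab'}
```

To run it on a real file, pass an open file handle instead of the list, for example `directive_delimiters(open("my.gff3"))`.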

ADD REPLYlink written 12 months ago by lieven.sterck8.9k

Thanks so much! This was a fake gff entry I created, just wanted to know if the official specification says something about it. Did not know about the set list option in vi (quite nice :) ).

ADD REPLYlink written 12 months ago by microfuge1.9k

Link to "official" (best I've found so far) specifications: https://github.com/The-Sequence-Ontology/Specifications/blob/master/gff3.md

ADD REPLYlink modified 12 months ago • written 12 months ago by massa.kassa.sc3na340
12 months ago by
Carambakaracho2.2k
Germany/Cologne
Carambakaracho2.2k wrote:

This is a great question, and one reason why I 'love' the gff format so much: it is not explicitly defined. Period. See the definition of directives in the gff3 format. Implicitly, the documentation uses spaces, just as ATpoint illustrated, so I recommend spaces, too. Tabs are usually used to separate the columns of the feature lines.
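Since the delimiter is not pinned down, a reader that has to cope with files from different sources may want to accept both. A minimal sketch, assuming it is acceptable to split directive lines on any run of spaces or tabs (the helper name `parse_directive` is hypothetical, not part of any existing parser):

```python
def parse_directive(line):
    """Split a '##' directive into (name, [arguments]) on any run of spaces or tabs."""
    line = line.rstrip("\r\n")
    if not line.startswith("##"):
        raise ValueError("not a directive line")
    # str.split() with no argument splits on any whitespace and drops empty fields,
    # so a trailing space or a tab delimiter makes no difference.
    parts = line[2:].split()
    if not parts:
        raise ValueError("empty directive")
    return parts[0], parts[1:]


print(parse_directive("##gff-version 3 "))
print(parse_directive("##sequence-region\tchr1 1 195471971"))
```

When writing files, sticking to a single space still seems the safer choice, because that is what the specification's own examples show.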

ADD COMMENTlink written 12 months ago by Carambakaracho2.2k

+1 for the space.
As you can see in the snapshots of the different versions of the format that I put in my review of the format here: https://github.com/NBISweden/GAAS/blob/master/annotation/CheatSheet/gxf.md they have always used a space.
Let's ask them to clarify it in the repo of the gff3 specification. ✅ => https://github.com/The-Sequence-Ontology/Specifications/issues/23

ADD REPLYlink modified 12 months ago • written 12 months ago by Juke344.9k
12 months ago by
shoujun.gu310
shoujun.gu310 wrote:

The gff3 from GENCODE uses tabs.

Edit: sorry, I didn't notice that the post is about the header... Then it is just regular sentences, I think.

ADD COMMENTlink modified 12 months ago • written 12 months ago by shoujun.gu310

No, it isn't; it is a space, at least in the mouse (vM20) file I have on my machine.

gzcat gencode.vM20.annotation.gff3.gz | head
##gff-version 3
#description: evidence-based annotation of the mouse genome (GRCm38), version M20 (Ensembl 95)
#provider: GENCODE
#contact: gencode-help@ebi.ac.uk
#format: gff3
#date: 2018-11-30
##sequence-region chr1 1 195471971
ADD REPLYlink modified 12 months ago • written 12 months ago by ATpoint41k

Yes, I just realized that the post deals with the header only.

ADD REPLYlink modified 12 months ago • written 12 months ago by shoujun.gu310

I guess the header lines are simply more or less non-standardized, but for the actual feature lines, yes, it is tab, like in most bioinformatics formats.
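To make the distinction concrete, here is a rough Python sketch that sorts lines into header-style lines and feature lines. It assumes that feature lines have exactly nine tab-separated columns, and it does not treat a FASTA section specially. The function name `classify_line` and the feature line in the example are made up.

```python
def classify_line(line):
    """Label a GFF3 line as blank, directive, comment, feature or malformed."""
    line = line.rstrip("\r\n")
    if not line.strip():
        return "blank"
    if line.startswith("##"):
        return "directive"  # e.g. ##gff-version 3
    if line.startswith("#"):
        return "comment"    # e.g. #provider: GENCODE
    # Feature lines: nine columns separated by tabs, not spaces.
    if len(line.split("\t")) == 9:
        return "feature"
    return "malformed"


lines = [
    "##gff-version 3",
    "#provider: GENCODE",
    "##sequence-region chr1 1 195471971",
    "chr1\texample\tgene\t1\t10\t.\t+\t.\tID=gene1",  # made-up feature line
    "chr1 example gene 1 10 . + . ID=gene1",          # spaces instead of tabs
]
for text in lines:
    print(classify_line(text), "<-", repr(text))
```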

ADD REPLYlink written 12 months ago by ATpoint41k