Question

SNP Format for GFF3: 'Name=' vs 'ID='

0

11.5 years ago

Pavel Senin ★ 1.9k

Hi folks: I am trying to display SNPs in an IGV track by loading them from a GFF file. When I use 'Name=...' to name a SNP, it is not displayed; with 'ID=...' it is visible. So my question is: what is the proper SNP annotation format for GFF3?

(For example, the GMOD manual says: "Name - Display name for the feature. This is the name to be displayed to the user. Unlike IDs, there is no requirement that the Name be unique within the file..." So I assume that 'Name=...' should be OK for IGV too?)

snp gff3 • 3.1k views

ADD COMMENT • link updated 11.5 years ago by cain.cshl ▴ 70 • written 11.5 years ago by Pavel Senin ★ 1.9k

score 1 · Answer 1 · 2012-11-23

1

11.5 years ago

cain.cshl ▴ 70

Well, it's flexible to the extent that it's not XML, but the standard is pretty clear in this regard. While you may need ID and Name to make IGV happy, you should only need Name, since ID is not intended to convey any meaning outside of the GFF file. You should complain to the authors of IGV that this is a bug in their GFF handling.

ADD COMMENT • link 11.5 years ago by cain.cshl ▴ 70

0

I would disagree about the role of the ID field: since it is unique, it makes it very convenient to "grep" for exactly the thing one needs. That might be the case for IGV too; they may index features by ID.

ADD REPLY • link 11.5 years ago by Pavel Senin ★ 1.9k

0

But it is only guaranteed to be unique within a given GFF file. If you want to "overload" the ID for your GFF files, that's fine, but understand that this is not what the spec says it is for, so general-use software written to work with GFF shouldn't be using it in that way. Of course, software would probably want to "keep track" of IDs (since they indicate parent-child relationships), but it shouldn't be using them for the name of the feature (there's already a tag for that), for an alias (there's a tag for that too) or, for that matter, for a database accession (there's a tag for that as well).
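To make the tags mentioned above concrete, here is a minimal sketch of how column 9 of a GFF3 line could be split into its tags. The function name `parse_gff3_attributes` and the example identifiers are made up for illustration, and this is not how IGV or any other viewer actually parses the file. It splits every value on commas for simplicity, although the spec only defines multiple values for some tags.

```python
from urllib.parse import unquote


def parse_gff3_attributes(column9):
    """Split a GFF3 attribute column into a dict of tag -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        tag, _, value = pair.partition("=")
        # Values are comma-separated and percent-encoded in GFF3.
        attributes[unquote(tag)] = [unquote(v) for v in value.split(",")]
    return attributes


# Hypothetical SNP record; all identifiers here are invented.
example = "ID=snp0001;Name=my_snp;Alias=old_name;Dbxref=example_db:acc1"
attrs = parse_gff3_attributes(example)
print(attrs["ID"], attrs["Name"], attrs["Alias"], attrs["Dbxref"])
```

A display tool following the spec would presumably use `Name` for the label and keep `ID` only for resolving `Parent` references.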

ADD REPLY • link 11.5 years ago by cain.cshl ▴ 70

0

I'm not sure what "overload" means here. Nevertheless, it is sufficient for a single GFF file to have unique IDs to make life a lot easier: it becomes possible to search, or to encode references, unambiguously. The uniqueness guaranteed by GFF is, as I see it, the same as a unique key within a relational database table. One could also extend this single-file uniqueness further, for example by requiring unique IDs across the lab. So there is a rationale for relying on IDs within a viewer (IGV in this case). What I'd like to see is the Name tag making IGV show SNPs too...
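As a rough illustration of treating ID like a unique key, the sketch below counts how often each ID value occurs in a list of GFF3 lines and reports the repeated ones. The function name `duplicated_ids` and the sample records are hypothetical. One caveat: GFF3 lets a discontinuous feature span several lines that share one ID, so a repeated ID is only suspicious for single-position features such as SNPs.

```python
from collections import Counter


def duplicated_ids(gff3_lines):
    """Return the sorted ID values that occur on more than one feature line."""
    counts = Counter()
    for line in gff3_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a well-formed feature line
        for pair in columns[8].split(";"):
            tag, _, value = pair.partition("=")
            if tag == "ID":
                counts[value] += 1
    return sorted(i for i, n in counts.items() if n > 1)


# Hypothetical records; the second and third lines reuse the same ID.
lines = [
    "##gff-version 3",
    "chr1\t.\tSNP\t100\t100\t.\t.\t.\tID=snp1;Name=first",
    "chr1\t.\tSNP\t200\t200\t.\t.\t.\tID=snp2;Name=second",
    "chr1\t.\tSNP\t300\t300\t.\t.\t.\tID=snp2;Name=third",
]
print(duplicated_ids(lines))
```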

ADD REPLY • link 11.5 years ago by Pavel Senin ★ 1.9k

score 0 · Answer 2 · 2012-11-23

0

11.5 years ago

Neilfws 49k

I don't think there is a "proper" format. GFF3 is quite flexible; whilst it defines keys such as "Name" and "ID", their presence and values are entirely at the discretion of whoever creates the GFF3 file. Similarly, different software will parse and display GFF3 in different ways, according to its internal rules.

So if "ID=" is what works for you, then go with that.

ADD COMMENT • link 11.5 years ago by Neilfws 49k

0

That is well understood - whatever works. However, to make my file "portable", so that others can use it, it would be nice to know.

ADD REPLY • link 11.5 years ago by Pavel Senin ★ 1.9k

1

Well, as I said, different software interprets GFF3 differently, so it cannot be portable in the sense of displaying the same for everyone. I would include both ID and Name in the attributes.
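For what it's worth, writing both tags is cheap. Below is a small sketch of producing a SNP line that carries ID and Name together. The helper names `escape` and `snp_gff3_line`, the feature type "SNP", and the sample values are assumptions chosen for illustration, and whether a given viewer then shows the Name is up to that viewer.

```python
def escape(value):
    """Percent-encode the characters that are reserved in GFF3 column 9."""
    # '%' goes first so already-escaped text is not encoded twice.
    for ch in "%;=&,\t\n\r":
        value = value.replace(ch, "%{:02X}".format(ord(ch)))
    return value


def snp_gff3_line(seqid, position, snp_id, name, source="."):
    """Build one tab-separated GFF3 line for a single-base feature."""
    attributes = "ID={};Name={}".format(escape(snp_id), escape(name))
    fields = [seqid, source, "SNP", str(position), str(position),
              ".", ".", ".", attributes]
    return "\t".join(fields)


# Hypothetical SNP; the coordinates and identifiers are invented.
print("##gff-version 3")
print(snp_gff3_line("chr1", 12345, "snp0001", "my SNP; A>G"))
```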