Is there any tool/API available to convert GTF/GFF to JSON format?
10 months ago

Hi all

Before I start writing my own code to convert a GTF file into JSON, has anyone come across an API or tool that either converts the file or lets you download it in JSON?

I am looking for a nested format like this (gene -> transcript -> exon):

"MOS": {
    "NM_005372.1": [
        {
            "exon_number": "1",
            "start": 57025501,
            "end": 57026541
        },
        {
            "exon_number": "1",
            "start": 57025504,
            "end": 57026541
gtf gff json ncbi ensembl
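
For reference, the gene -> transcript -> exon layout described above can be built with a short script. This is only a minimal sketch, not an existing tool: the function name gtf_to_nested is made up for illustration, and it assumes a tab-separated GTF whose exon lines carry quoted gene_id (or gene_name), transcript_id and exon_number attributes. Files that leave exon_number unquoted, and GFF3 files (key=value attributes), would need a different attribute parser.

```python
import json
import re
from collections import defaultdict

# Matches quoted GTF attribute pairs such as: gene_id "MOS";
ATTR_RE = re.compile(r'(\w+)\s+"([^"]*)"')


def gtf_to_nested(lines):
    """Group exon records as gene -> transcript -> list of exons.

    Hypothetical helper: assumes 9 tab-separated columns per record and
    quoted attribute values in the last column.
    """
    genes = defaultdict(lambda: defaultdict(list))
    for line in lines:
        # Skip blank lines and header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        # Keep only well-formed exon records.
        if len(fields) != 9 or fields[2] != "exon":
            continue
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene = attrs.get("gene_name") or attrs.get("gene_id")
        transcript = attrs.get("transcript_id")
        if gene is None or transcript is None:
            continue
        genes[gene][transcript].append({
            "exon_number": attrs.get("exon_number"),
            "start": int(fields[3]),  # GTF coordinates are 1-based, inclusive
            "end": int(fields[4]),
        })
    # Convert the nested defaultdicts to plain dicts for json.dumps.
    return {gene: dict(txs) for gene, txs in genes.items()}


# Illustrative record only; the sequence name and source are placeholders.
demo = [
    'chr8\tRefSeq\texon\t57025501\t57026541\t.\t-\t.\t'
    'gene_id "MOS"; transcript_id "NM_005372.1"; exon_number "1";\n',
]
print(json.dumps(gtf_to_nested(demo), indent=4))
```

To run it on a real file, pass an open file handle instead of the demo list, for example gtf_to_nested(open("annotation.gtf")), where the file name is a placeholder.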
10 months ago

I wrote a gtf2xml tool: http://lindenb.github.io/jvarkit/Gtf2Xml.html

The XML can then be converted to JSON with an XSLT stylesheet and xsltproc (see https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ncbi/pubmed2json.xsl for an example).

10 months ago

Perhaps this answer might be of general use:

"Is there a JSON-based genomic feature format?" https://bioinformatics.stackexchange.com/questions/10386/is-there-a-json-based-genomic-feature-format/10387#10387

Using an existing format with a stable schema may be a better approach, especially if you will share these files with others.

10 months ago
vkkodali ★ 2.7k

NCBI Datasets produces a data report in JSON format that may contain all of the information you seek. You can download the command-line tool and try it out as follows:

datasets download gene gene-id 5768 --filename test.zip

One of the output files, ncbi_dataset/data/data_report.jsonl, contains the following:

"transcripts": [
    {
    "accessionVersion": "NM_002826.5",
    "cds": {
        "accessionVersion": "NM_002826.5",
        "range": [
        {
            "begin": "40",
            "end": "2283"
        }
        ]
    },
    "ensemblTranscript": "ENST00000367602.8",
    "exons": {
        "accessionVersion": "NC_000001.11",
        "range": [
        {
            "begin": "180154869",
            "end": "180155172",
            "order": 1
        },
        {
            "begin": "180166491",
            "end": "180166591",
            "order": 2
        },

This is not quite a tool for converting an existing GTF/GFF3 file to JSON, but if you are working with NCBI Gene annotation, it can be an option.
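
If the data report route fits, reshaping it into a gene -> transcript -> exon layout takes only a few lines. The sketch below is a rough illustration rather than part of NCBI Datasets: report_to_nested is a hypothetical helper, the "symbol" key for the gene name is an assumption about the report schema, and the remaining keys (transcripts, accessionVersion, exons, range, begin, end, order) are taken from the excerpt above.

```python
import json


def report_to_nested(jsonl_lines):
    """Reshape data-report records into gene -> transcript -> exon list.

    Hypothetical helper: expects one JSON object per line (JSON Lines).
    """
    nested = {}
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        # Assumption: the gene symbol is stored under "symbol".
        symbol = record.get("symbol", "unknown")
        transcripts = {}
        for tx in record.get("transcripts", []):
            exon_ranges = tx.get("exons", {}).get("range", [])
            transcripts[tx["accessionVersion"]] = [
                {
                    "exon_number": str(exon["order"]),
                    # "begin" and "end" are strings in the report excerpt.
                    "start": int(exon["begin"]),
                    "end": int(exon["end"]),
                }
                for exon in exon_ranges
            ]
        nested[symbol] = transcripts
    return nested


# Illustrative record built from the excerpt; "EXAMPLE" is a placeholder symbol.
demo_line = json.dumps({
    "symbol": "EXAMPLE",
    "transcripts": [{
        "accessionVersion": "NM_002826.5",
        "exons": {
            "accessionVersion": "NC_000001.11",
            "range": [
                {"begin": "180154869", "end": "180155172", "order": 1},
                {"begin": "180166491", "end": "180166591", "order": 2},
            ],
        },
    }],
})
print(json.dumps(report_to_nested([demo_line]), indent=4))
```

Note that the exon coordinates in the report are given on the genomic accession listed under "exons" (NC_000001.11 in the excerpt), so they are on the same footing as GTF coordinates for that sequence.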
