Question

GFF3 files

0

22 months ago

Percy • 0

How can I create a GFF file with the "Name" attribute? I have tried both Prodigal and Prokka; however, the GFF files they produce lack the Name attribute, which I need for a downstream analysis.

linux • 1.6k views

ADD COMMENT • link updated 22 months ago by lieven.sterck 15k • written 22 months ago by Percy • 0

0

Context is missing.

ADD REPLY • link 22 months ago by Pierre Lindenbaum 161k

0

For instance, I annotated my MAGs using Prokka and Prodigal, respectively. The GFF files that I obtain lack the "Name" attribute, e.g. Prodigal:

##gff-version  3
# Sequence Data: seqnum=1;seqlen=59792;seqhdr="NODE_23_length_59792_cov_23.204747"
# Model Data: version=Prodigal.v2.6.3;run_type=Metagenomic;model="39|Rickettsia_conorii_Malish_7|B|32.4|11|1";gc_cont=32.40;transl_table=11;uses_sd=1
NODE_23_length_59792_cov_23.204747  Prodigal_v2.6.3 CDS 1   147 19.5    -   0   ID=1_1;partial=10;start_type=TTG;rbs_motif=None;rbs_spacer=None;gc_cont=0.299;conf=98.71;score=18.89;cscore=30.86;sscore=-11.98;rscore=-0.99;uscore=-0.73;tscore=-9.61;
NODE_23_length_59792_cov_23.204747  Prodigal_v2.6.3 CDS 523 1983    198.6   -   0   ID=1_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.300;conf=99.99;score=198.00;cscore=196.39;sscore=1.61;rscore=-0.99;uscore=0.35;tscore=2.90;

Prokka:

##gff-version 3
##sequence-region NODE_23_length_59792_cov_23.204747 1 59792
##sequence-region NODE_111_length_30229_cov_21.362365 1 30229
##sequence-region NODE_186_length_24472_cov_19.556948 1 24472
##sequence-region NODE_198_length_23498_cov_18.236489 1 23498
##sequence-region NODE_240_length_21525_cov_17.369632 1 21525

I need a GFF file whose attributes include "Name", e.g.:

LT795054.1      EMBL    CDS     2243    2450    .       -       0       ID=cds-SJX60001.1;Parent=gene-SRS1_00846;Dbxref=NCBI_GP:SJX60001.1;Name=SJX60001.1;gbkey=CDS;locus_tag=SRS1_00846;product=uncharacterized protein;protein_id=SJX60001.1

(...)

ADD REPLY • link updated 22 months ago by lieven.sterck 15k • written 22 months ago by Percy • 0

1

sigh... This comment is not an answer; you should add it to your original post. Please also add some formatting: for example, enclose the GFF sections in code blocks (the 101010 icon), put each record on its own line, etc.

Getting the "Name" attribute is the data analyst's job. Prodigal gives you the gene predictions; you need to match those with functional annotations yourself.
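
As a rough illustration of that matching step, here is a minimal sketch. It assumes you already have a two-column, tab-separated table (feature ID, then the name you want) produced by whatever functional annotation you ran. The function name add_name_from_table and the table layout are hypothetical, and real tool output will probably need reshaping before it looks like this.

```shell
# Hypothetical helper: add a Name attribute from a two-column TSV
# (feature ID <TAB> name). Assumes tab-separated GFF3 with ID= in column 9.
# Usage: add_name_from_table names.tsv genes.gff3
add_name_from_table() {
    awk -F'\t' -v OFS='\t' '
        NR == FNR { name[$1] = $2; next }      # first file: load the ID -> name map
        /^#/ || NF < 9 { print; next }         # pass headers, comments and blank lines through
        {
            id = ""
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++)
                if (attrs[i] ~ /^ID=/) id = substr(attrs[i], 4)
            if (id in name) {
                sub(/;$/, "", $9)              # avoid ";;" when the line ends with ";"
                $9 = $9 ";Name=" name[id]
            }
            print
        }
    ' "$1" "$2"
}
```

Features without an entry in the table are printed unchanged, so you can see afterwards which ones still lack a Name. Note that the NR == FNR idiom misbehaves if the table is empty.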

ADD REPLY • link 22 months ago by Carambakaracho ★ 3.2k

score 1 · Answer 1 · 2022-06-07

1

22 months ago

iraun 6.2k

You can use awk. This one-liner copies the accession from Dbxref into a new "Name" attribute:

awk -F'\t' '/^#/ || NF<9 {print; next} {split($9,a,";"); split(a[3],b,":"); newname=b[2]; print $0 ";Name=" newname}' your.gff3

It assumes that Dbxref is the 3rd attribute in the 9th column and that its value has the form DB:accession (e.g. NCBI_GP:SJX60001.1). Header, comment and blank lines are printed unchanged.
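
If the attribute order differs between files, looking the attribute up by key may be more robust than relying on its position. The following is only a sketch under a few assumptions: the input is tab-separated GFF3, attributes in column 9 are key=value pairs separated by ";", and for Dbxref only the part after the first colon is wanted. The function name add_name_from_attr is made up for this example.

```shell
# Hypothetical helper: copy the value of attribute KEY into a new Name attribute.
# Usage: add_name_from_attr KEY [file.gff3]   (reads stdin when no file is given)
add_name_from_attr() {
    awk -F'\t' -v OFS='\t' -v key="$1" '
        /^#/ || NF < 9 { print; next }         # pass headers, comments and blank lines through
        {
            val = ""; has_name = 0
            n = split($9, attrs, ";")
            for (i = 1; i <= n; i++) {
                eq = index(attrs[i], "=")
                if (eq == 0) continue          # skip empty or malformed pieces
                k = substr(attrs[i], 1, eq - 1)
                if (k == "Name") has_name = 1
                if (k == key && val == "") val = substr(attrs[i], eq + 1)   # first match wins
            }
            if (key == "Dbxref") sub(/^[^:]*:/, "", val)   # keep the accession after DB:
            if (val != "" && !has_name) {
                sub(/;$/, "", $9)              # avoid ";;" when the line ends with ";"
                $9 = $9 ";Name=" val
            }
            print
        }
    ' "${2:--}"
}
```

For example, `add_name_from_attr Dbxref your.gff3 > named.gff3` would mimic the one-liner above, and `add_name_from_attr ID your.gff3` would copy the ID instead. Lines that already carry a Name, or that lack the requested key, are left as they are.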

ADD COMMENT • link 22 months ago by iraun 6.2k

0

If OP had formatted the added info more readably, you would probably have seen that Dbxref is not an attribute provided by the ORF caller:

NODE_23_length_59792_cov_23.204747  Prodigal_v2.6.3 CDS 1    147   19.5   -  0 ID=1_1;partial=10;start_type=TTG;rbs_motif=None;rbs_spacer=None;gc_cont=0.299;conf=98.71;score=18.89;cscore=30.86;sscore=-11.98;rscore=-0.99;uscore=-0.73;tscore=-9.61; 
NODE_23_length_59792_cov_23.204747  Prodigal_v2.6.3 CDS 523  1983  198.6  -  0 ID=1_2;partial=00;start_type=ATG;rbs_motif=None;rbs_spacer=None;gc_cont=0.300;conf=99.99;score=198.00;cscore=196.39;sscore=1.61;rscore=-0.99;uscore=0.35;tscore=2.90;

ADD REPLY • link 22 months ago by Carambakaracho ★ 3.2k

0

Definitely, that is very true. I hope OP can play with the command and adapt it to their needs, though; it is quite straightforward. For example, to copy the content of the ID attribute (assuming ID is the 1st attribute, as in the Prodigal output), just change the code to the following:

awk -F'\t' '/^#/ || NF<9 {print; next} {split($9,a,";"); split(a[1],b,"="); newname=b[2]; sub(/;$/,""); print $0 ";Name=" newname}' your.gff3

ADD REPLY • link 22 months ago by iraun 6.2k

score 1 · Answer 2 · 2022-06-07

1

22 months ago

Juke34 8.5k

You can use agat_sq_manage_attributes.pl from AGAT to create a Name attribute from the ID.

ADD COMMENT • link 22 months ago by Juke34 8.5k