change the attribute order on a gtf file
4 months ago
bright602 ▴ 50

Hi, I have a GTF file. Could someone tell me how to change the order of the attributes in the 9th column, so that gene_id comes first, then transcript_id, and lastly gene_name?

Currently the format is below:

DDB0169550  dictyBase Curator   exon    11729   12478   .   +   .   transcript_id "DDB0201587"; gene_id "DDB_G0294088"; gene_name "A";
DDB0169550  dictyBase Curator   exon    13479   13862   .   +   .   transcript_id "DDB0201587"; gene_id "DDB_G0294088"; gene_name "B";
DDB0169550  dictyBase Curator   CDS 6411    6676    .   +   0   transcript_id "DDB0201587"; gene_id "DDB_G0294088"; gene_name "C";

Thanks for your help!

file gtf • 349 views

May I ask why you want to change the order of the attributes? In theory it should not matter.

EDIT: Sorry, I was wrong. In GTF there are two mandatory attributes, gene_id and transcript_id, and any other attributes or comments must appear after these two. So yes, the order matters. (More details here: https://agat.readthedocs.io/en/latest/gxf.html)
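
To see which records in a file break this rule, a quick check along these lines may help. It is only a sketch: the helper name `check_gtf_attr_order` is made up for illustration, the columns are assumed to be tab-separated, and requiring gene_id strictly before transcript_id follows the order asked for in the question rather than anything stated above.

```shell
# Hypothetical helper: print every data line whose 9th column does not
# start with gene_id followed directly by transcript_id.
# Assumes tab-separated columns; lines starting with '#' are skipped.
check_gtf_attr_order() {
  awk -F'\t' '!/^#/ && $9 !~ /^gene_id "[^"]*"; transcript_id "[^"]*";/' "$@"
}

# Example: the first record is in the expected order, the second is not,
# so only the second one is printed.
printf 'DDB0169550\tdictyBase Curator\texon\t11729\t12478\t.\t+\t.\tgene_id "DDB_G0294088"; transcript_id "DDB0201587"; gene_name "A";\nDDB0169550\tdictyBase Curator\texon\t13479\t13862\t.\t+\t.\ttranscript_id "DDB0201587"; gene_id "DDB_G0294088"; gene_name "B";\n' |
  check_gtf_attr_order
```

An empty result means every record already has the two mandatory attributes at the front.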

4 months ago

Please do not post images of the data.

$ sed -r '/^#/! s/(transcript_id ".*"); (gene_id ".*"); (gene_name ".*");/\2; \1; \3/' test.gtf

DDB0169550  dictyBase   Curator exon    11729   12478   .      .   gene_id "DDB_G0294088"; transcript_id "DDB0201587"; gene_name "A"
DDB0169550  dictyBase   Curator exon    13479   13862   .      .   gene_id "DDB_G0294088"; transcript_id "DDB0201587"; gene_name "B"
DDB0169550  dictyBase   Curator CDS 6411    6676    .      0   gene_id "DDB_G0294088"; transcript_id "DDB0201587"; gene_name "C"
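
A slightly more defensive variant of the command above, offered as a sketch rather than a definitive fix: `[^"]*` instead of `.*` keeps each match inside a single quoted value, `-E` is accepted by both GNU and BSD sed, and the replacement keeps the trailing semicolon. The function name `reorder_gtf_attrs` is hypothetical. It assumes the three attributes appear exactly as transcript_id, gene_id, gene_name, separated by single spaces; lines with any other layout pass through unchanged.

```shell
# Hypothetical helper: move gene_id in front of transcript_id in column 9.
# Comment lines (starting with '#') are left untouched.
# Reads the files given as arguments, or stdin when there are none.
reorder_gtf_attrs() {
  sed -E '/^#/! s/(transcript_id "[^"]*"); (gene_id "[^"]*"); (gene_name "[^"]*");/\2; \1; \3;/' "$@"
}

# Example: one record from the question, fed on stdin.
printf 'DDB0169550\tdictyBase Curator\texon\t11729\t12478\t.\t+\t.\ttranscript_id "DDB0201587"; gene_id "DDB_G0294088"; gene_name "A";\n' |
  reorder_gtf_attrs
```

With a file, `reorder_gtf_attrs test.gtf > reordered.gtf` writes the result to a new file and leaves the input as it was.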

Thank you!
