gtf2bed doesn't give gene_id in the bed file
20 months ago
Biologist ▴ 290

I have a GTF file, sample.gtf, that looks like this:

GL000009.2      ENSEMBL exon    56140   58376   .       -       .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; exon_number "1"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      ENSEMBL transcript      56140   58376   .       -       .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; oId "ENST00000618686.1"; tss_id "TSS35"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      Cufflinks       exon    59669   59932   .       +       .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; exon_number "1"; inf "unknown"; true_gene_id "XLOC_000023";
GL000009.2      Cufflinks       transcript      59669   61563   .       +       .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; oId "TCONS_00000027"; class_code "u"; tss_id "TSS25"; inf "unknown"; true_gene_id "XLOC_000023";

I converted the GTF to BED using gtf2bed:

gtf2bed < sample.gtf > sample.bed

The resulting BED file looks like this:

GL000009.2      56139   58376   XLOC_000032     .       -       ENSEMBL exon    .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; exon_number "1"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      56139   58376   XLOC_000032     .       -       ENSEMBL transcript      .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; oId "ENST00000618686.1"; tss_id "TSS35"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      59668   59932   XLOC_000023     .       +       Cufflinks       exon    .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; exon_number "1"; inf "unknown"; true_gene_id "XLOC_000023";
GL000009.2      59668   61563   XLOC_000023     .       +       Cufflinks       transcript      .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; oId "TCONS_00000027"; class_code "u"; tss_id "TSS25"; inf "unknown"; true_gene_id "XLOC_000023";

Why is the 4th column in the BED file not the gene_id? It looks like gtf2bed is taking true_gene_id instead. I want the output to look like this:

GL000009.2      56139   58376   ENSG00000278704.1     .       -       ENSEMBL exon    .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; exon_number "1"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      56139   58376   ENSG00000278704.1     .       -       ENSEMBL transcript      .       transcript_id "transc_00000026"; gene_id "ENSG00000278704.1"; gene_name "ENSG00000278704.1"; oId "ENST00000618686.1"; tss_id "TSS35"; inf "known"; true_gene_id "XLOC_000032";
GL000009.2      59668   59932   XLOC_000023     .       +       Cufflinks       exon    .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; exon_number "1"; inf "unknown"; true_gene_id "XLOC_000023";
GL000009.2      59668   61563   XLOC_000023     .       +       Cufflinks       transcript      .       transcript_id "transc_00000028"; gene_id "XLOC_000023"; gene_name "XLOC_000023"; oId "TCONS_00000027"; class_code "u"; tss_id "TSS25"; inf "unknown"; true_gene_id "XLOC_000023";

How can I get the desired output?

rnaseq gtf bed • 812 views

I got a bit frustrated with gtf2bed and started using the GTFtools Python package instead:

http://www.genemine.org/gtftools.php

20 months ago
ATpoint 81k

That's odd. Are you using the latest version? Try adding --attribute-key=gene_id.


Do you mean like this?

gtf2bed --attribute-key=gene_id < sample.gtf > sample.bed

I'm using version 2.4.39.
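
If the option does not help with your version, one possible workaround is to do the conversion yourself. Below is a minimal Python sketch, not gtf2bed's own logic: it assumes the column layout shown in the BED output above (chrom, start, end, ID, score, strand, source, feature, frame, attributes) and picks the ID with a regex that only accepts gene_id as a whole attribute key, so true_gene_id cannot match. The function name gtf_line_to_bed and the script name used afterwards are made up for illustration.

```python
import re
import sys

# Accept gene_id only at the start of the attribute string or right after a ';',
# so that the key "true_gene_id" is never mistaken for "gene_id".
GENE_ID_RE = re.compile(r'(?:^|;)\s*gene_id "([^"]*)"')


def gtf_line_to_bed(line):
    """Convert one GTF record to a BED line; return None for comments, blanks, or short lines."""
    line = line.rstrip("\n")
    if not line or line.startswith("#"):
        return None
    # Real GTF is tab-delimited; the sample pasted in this thread looks space-padded,
    # so fall back to whitespace splitting (at most 8 splits keeps the attributes whole).
    fields = line.split("\t", 8) if "\t" in line else line.split(None, 8)
    if len(fields) < 9:
        return None
    chrom, source, feature, start, end, score, strand, frame, attrs = fields
    match = GENE_ID_RE.search(attrs)
    name = match.group(1) if match else "."
    # GTF is 1-based inclusive, BED is 0-based half-open: only the start shifts by one.
    bed_start = str(int(start) - 1)
    return "\t".join(
        [chrom, bed_start, end, name, score, strand, source, feature, frame, attrs]
    )


# Only runs when called with an input and an output path, e.g.
#   python gtf_gene_id_to_bed.py sample.gtf sample.bed
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as gtf, open(sys.argv[2], "w") as bed:
        for record in gtf:
            converted = gtf_line_to_bed(record)
            if converted is not None:
                bed.write(converted + "\n")
```

Unlike gtf2bed, this sketch does not sort its output, so you may want to run the result through sort-bed before using it with other BEDOPS tools.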
