Question: editing gtf file
9 weeks ago, dieunelderilus wrote:

I have a GTF file as follows:

KB705106        VEuPathDB       exon    3645    3767    0       -       .       gene_id ""; transcript_id "AARA010197-RA";
KB705106        VEuPathDB       CDS     3645    3767    0       -       2       gene_id ""; transcript_id "AARA010197-RA";
KB705106        VEuPathDB       exon    3975    4065    0       -       .       gene_id ""; transcript_id "AARA010198-RA";

I want to copy the first 10 characters of each transcript_id and paste them into the corresponding empty gene_id, as follows:

KB705106        VEuPathDB       exon    3645    3767    0       -       .       gene_id "AARA010197"; transcript_id "AARA010197-RA";
KB705106        VEuPathDB       CDS     3645    3767    0       -       2       gene_id "AARA010197"; transcript_id "AARA010197-RA";
KB705106        VEuPathDB       exon    3975    4065    0       -       .       gene_id "AARA010198"; transcript_id "AARA010198-RA";

Please, what is the easiest way to do this?

Thank you. ~DD


what is the easiest way to do this?

There are many different ways to parse and reformat text files. The easiest for you will depend on the scripting language you are most familiar with. For instance, I would personally use R (with the read.table(), sapply() and strsplit() functions), but there are also good options in Python or Perl, and the most efficient way would probably be bash/awk. Which do you prefer?

Comment written 9 weeks ago by Carlo Yague
9 weeks ago, Jorge Amigo (Santiago de Compostela, Spain) wrote:

Here is a Perl one-liner that does the job:

perl -pe 's/gene_id ""; transcript_id "([^"]{1,10})/gene_id "$1"; transcript_id "$1/' input.gtf > output.gtf

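The pattern [^"]{1,10} captures up to the first 10 characters of the transcript_id value, so it still works when the ID is shorter than 10 characters. Lines whose gene_id is not empty are left unchanged.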
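For anyone who would rather stay in Python, the same substitution can be sketched roughly as below. This is only an illustration that reuses the regex from the one-liner; the function names (fill_gene_id, rewrite_gtf) and the script name are made up for this example, and it assumes the attribute column is spaced exactly as in the question.

```python
import re
import sys

# Same regex as the Perl one-liner: an empty gene_id followed by transcript_id,
# capturing up to the first 10 characters of the transcript_id value.
PATTERN = re.compile(r'gene_id ""; transcript_id "([^"]{1,10})')


def fill_gene_id(line: str) -> str:
    """Fill an empty gene_id with the captured transcript_id prefix."""
    # Lines that do not match (e.g. gene_id already set) are returned unchanged.
    return PATTERN.sub(r'gene_id "\1"; transcript_id "\1', line, count=1)


def rewrite_gtf(src_path: str, dst_path: str) -> None:
    """Copy src_path to dst_path, fixing each line on the way (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            dst.write(fill_gene_id(line))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Assumed usage: python fill_gene_id.py input.gtf output.gtf
    rewrite_gtf(sys.argv[1], sys.argv[2])
```

As with the one-liner, the rest of the transcript_id (for example the -RA suffix) is not part of the match, so it passes through untouched.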
