Unable to convert txt to gtf format
4.5 years ago
Hyper_Odin ▴ 310

I am trying to filter some of the noncoding genes from merged.gtf. Although I have extracted them, I am unable to convert them back to GTF format. Here's what I am doing:

cat merged.gtf | fgrep -f non-coding-transcript-new.txt > non-coding-transcript.gtf

So basically, I want to match the transcript IDs in the txt file to the IDs in merged.gtf and write the matching lines to a final file in GTF format.

And here is my non-coding-transcript-new.txt file:

transcript_id "#ID transcript_length peptide_length Fickett_score pI ORF_integrity coding_probability label

transcript_id "TCONS_00000022   664     0       0.21719 0.0     -1      0.00838144      noncoding
transcript_id "TCONS_00016027   1639    155     0.31753 6.57525634765625        1       0.739683        coding
transcript_id "TCONS_00016038   2191    106     0.31657 9.64471435546875        1       0.134137        noncoding
transcript_id "TCONS_00016039   2770    106     0.26602000000000003     9.64471435546875        1       0.0926623       noncoding

And merged.gtf:

chr1    chr1    Cufflinks       exon    11869   12227   .       +       .       gene_id "XLOC_000001"; transcript_id "TCONS_00000003"; exon_number "1"; gene_name "DDX11L1"; oId "ENST00000456328.2"; nearest_ref "ENST00000456328.2"; class_code "="; tss_id "TSS1";
chr1    Cufflinks       exon    12613   12721   .       +       .       gene_id "XLOC_000001"; transcript_id "TCONS_00000003"; exon_number "2"; gene_name "DDX11L1"; oId "ENST00000456328.2"; n
next-gen python • 1.0k views
4.5 years ago
Hyper_Odin ▴ 310

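I figured out the problem. fgrep -f treats every line of the txt file as one fixed pattern, and the ID in each pattern:

"TCONS_00000022

has no closing quote and is followed on the same line by the remaining columns, so it does not match the quoted ID:

"TCONS_00000003"

in the GTF file.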
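
For anyone hitting the same thing, here is a minimal Python sketch of one way to do the matching on the bare ID instead of the whole line. It is not the only fix, and it rests on a few assumptions: the function names (read_noncoding_ids, filter_gtf, write_filtered_gtf) are made up for illustration, the txt file looks like the sample in the question (ID in the first column, label in the last), and only rows labelled "noncoding" should be kept.

```python
import re

# Captures the value of the transcript_id attribute in a GTF line.
TRANSCRIPT_ID_RE = re.compile(r'transcript_id "([^"]+)"')


def read_noncoding_ids(table_lines):
    """Return the set of IDs from rows whose last column is 'noncoding'.

    Assumption: rows look like the sample in the question, optionally
    prefixed with 'transcript_id "' in front of the ID.
    """
    ids = set()
    for line in table_lines:
        fields = line.split()
        # Drop the 'transcript_id' prefix field if it is present.
        if fields and fields[0] == "transcript_id":
            fields = fields[1:]
        # Skips blank lines, the header row, and rows labelled 'coding'.
        if len(fields) < 2 or fields[-1] != "noncoding":
            continue
        # Remove the stray opening quote from the ID.
        ids.add(fields[0].strip('"'))
    return ids


def filter_gtf(gtf_lines, keep_ids):
    """Yield GTF lines whose transcript_id is in keep_ids, unchanged."""
    for line in gtf_lines:
        match = TRANSCRIPT_ID_RE.search(line)
        if match and match.group(1) in keep_ids:
            yield line


def write_filtered_gtf(table_path, gtf_path, out_path):
    """Read the table and the GTF from disk and write the filtered GTF."""
    with open(table_path) as table:
        keep_ids = read_noncoding_ids(table)
    with open(gtf_path) as gtf, open(out_path, "w") as out:
        out.writelines(filter_gtf(gtf, keep_ids))
```

With the file names from the question, the call would be write_filtered_gtf("non-coding-transcript-new.txt", "merged.gtf", "non-coding-transcript.gtf"). Because the matching lines are copied through unchanged, the output is already in GTF format and needs no conversion step.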
