4.9 years ago
Hyper_Odin
I am trying to filter some of the noncoding genes from merged.gtf. Although I have extracted them, I am unable to convert them back to GTF format. Here's what I am doing:
cat merged.gtf | fgrep -f non-coding-transcript-new.txt > non-coding-transcript.gtf
So basically, I want to match the transcript IDs in the txt file to the IDs in merged.gtf and write the matching records to a final file in GTF format.
Here is my non-coding-transcript-new.txt file:
transcript_id "#ID transcript_length peptide_length Fickett_score pI ORF_integrity coding_probability label
transcript_id "TCONS_00000022 664 0 0.21719 0.0 -1 0.00838144 noncoding
transcript_id "TCONS_00016027 1639 155 0.31753 6.57525634765625 1 0.739683 coding
transcript_id "TCONS_00016038 2191 106 0.31657 9.64471435546875 1 0.134137 noncoding
transcript_id "TCONS_00016039 2770 106 0.26602000000000003 9.64471435546875 1 0.0926623 noncoding
And merged.gtf:
chr1 Cufflinks exon 11869 12227 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000003"; exon_number "1"; gene_name "DDX11L1"; oId "ENST00000456328.2"; nearest_ref "ENST00000456328.2"; class_code "="; tss_id "TSS1";
chr1 Cufflinks exon 12613 12721 . + . gene_id "XLOC_000001"; transcript_id "TCONS_00000003"; exon_number "2"; gene_name "DDX11L1"; oId "ENST00000456328.2"; n
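For reference, one possible reason the fgrep command above returns nothing is that every line of the txt file is used as a complete fixed-string pattern, including the length, score and label columns, and no GTF line contains that whole string. A minimal sketch of the matching I am after, assuming the txt file is whitespace-separated with the label in the last column and the ID in the second field (with a stray leading quote), could look like the following. The function name filter_noncoding_gtf is made up for illustration and is not part of Cufflinks or any other tool.

```shell
#!/bin/sh
# Sketch only: keep the GTF records whose transcript_id is labelled "noncoding".
# filter_noncoding_gtf is a hypothetical helper name.
filter_noncoding_gtf() {
    table=$1   # prediction table, e.g. non-coding-transcript-new.txt
    gtf=$2     # annotation file, e.g. merged.gtf
    awk '
        # First file: remember IDs whose last column is "noncoding".
        # The header line ends in "label", so it is skipped automatically.
        NR == FNR {
            if ($NF == "noncoding") {
                id = $2
                gsub(/"/, "", id)      # drop the stray leading quote
                keep[id] = 1
            }
            next
        }
        # Second file: extract the transcript_id value and compare it exactly.
        match($0, /transcript_id "[^"]+"/) {
            # 15 = length of the prefix up to and including the opening quote
            id = substr($0, RSTART + 15, RLENGTH - 16)
            if (id in keep) print
        }
    ' "$table" "$gtf"
}
```

It would be called as `filter_noncoding_gtf non-coding-transcript-new.txt merged.gtf > non-coding-transcript.gtf`. Because whole GTF lines are printed unchanged, the output stays in GTF format, and comparing the full ID avoids one ID matching as a prefix of a longer one.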