Annotating an assembled GTF file
4.0 years ago
nattzy94 ▴ 50

I am assembling a GTF file from a BAM file that I generated by aligning my RNA-seq reads with STAR. Assembly was done with StringTie, using the Ensembl annotation file for GRCh38 as the reference.

My problem is that the resulting GTF file does not contain all the information that is in the reference annotation. Crucially, it is missing the transcript biotype, which is what I am interested in.

For instance the reference annotation has the following fields for a transcript:

 1       havana  exon    12975   13052   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000450305"; transcript_version "2"; exon_number "4"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; transcript_source "havana"; transcript_biotype "transcribed_unprocessed_pseudogene"; exon_id "ENSE00001799933"; exon_version "2"; tag "basic"; transcript_support_level "NA";

However, a line from my assembled GTF file looks like this:

1       StringTie       exon    12613   12721   1000    +       .       gene_id "MSTRG.1"; transcript_id "ENST00000456328"; exon_number "2"; gene_name "DDX11L1"; ref_gene_id "ENSG00000223972";

I've also tried searching the entire file for "transcript_biotype" but nothing comes up.

From this previous post, I saw that a potential fix might be to convert the GTF to BED12 and then annotate the BED12 file using the Ensembl annotation file. However, I'm not sure exactly which bedtools function to use.

Would be great if anyone could point to a different solution.

RNA-Seq Assembly • 1.2k views

Hey, same question here. Have you solved it?

4.0 years ago
PeiwenLi • 0

Hi! I am trying to do the exact same task as you, and I found this post: "Gene feature information missing in Stringtie merged assembly". It may be helpful!
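One possible approach, not taken from the linked post: since the assembled lines shown in the question keep the Ensembl transcript_id for known transcripts, the biotype could be copied over with a simple lookup instead of going through BED12. Below is a minimal Python sketch under that assumption. The function names (parse_attributes, build_biotype_map, annotate_line) are made up for illustration and are not part of StringTie or bedtools. Novel transcripts without a matching reference transcript_id are left unchanged.

```python
import re

# Matches `key "value";` pairs in the 9th (attribute) column of a GTF line.
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(attr_field):
    """Return the attribute column as a dict (last value wins for repeated keys)."""
    return dict(ATTR_RE.findall(attr_field))


def build_biotype_map(reference_lines):
    """Map transcript_id -> transcript_biotype from reference GTF lines."""
    biotypes = {}
    for line in reference_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(fields[8])
        if "transcript_id" in attrs and "transcript_biotype" in attrs:
            biotypes[attrs["transcript_id"]] = attrs["transcript_biotype"]
    return biotypes


def annotate_line(line, biotypes):
    """Append transcript_biotype to one assembled GTF line if the transcript is known."""
    stripped = line.rstrip("\n")
    fields = stripped.split("\t")
    if stripped.startswith("#") or len(fields) < 9:
        return stripped
    attrs = parse_attributes(fields[8])
    biotype = biotypes.get(attrs.get("transcript_id"))
    if biotype is not None and "transcript_biotype" not in attrs:
        fields[8] = fields[8].rstrip() + f' transcript_biotype "{biotype}";'
    return "\t".join(fields)
```

To use it, you would read the Ensembl GTF once with build_biotype_map, then pass every line of the StringTie output through annotate_line and write the result to a new file. This assumes both files are tab-delimited and that transcript IDs are written without a version suffix in both, as in the two example lines in the question.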
