Question: Annotating an assembled GTF file
6 months ago, nattzy94 wrote:

I am assembling a GTF file from a BAM file that I generated by aligning my RNA-seq reads with STAR. Assembly was done using StringTie and the Ensembl annotation file for GRCh38.

My problem is that the resulting GTF file does not contain all the information that is in the reference annotation. Crucially, it is missing the transcript biotype, which is what I am interested in.

For instance, the reference annotation has the following fields for an exon of a transcript:

 1       havana  exon    12975   13052   .       +       .       gene_id "ENSG00000223972"; gene_version "5"; transcript_id "ENST00000450305"; transcript_version "2"; exon_number "4"; gene_name "DDX11L1"; gene_source "havana"; gene_biotype "transcribed_unprocessed_pseudogene"; transcript_name "DDX11L1-201"; transcript_source "havana"; transcript_biotype "transcribed_unprocessed_pseudogene"; exon_id "ENSE00001799933"; exon_version "2"; tag "basic"; transcript_support_level "NA";

However, my assembled gtf file looks like this:

1       StringTie       exon    12613   12721   1000    +       .       gene_id "MSTRG.1"; transcript_id "ENST00000456328"; exon_number "2"; gene_name "DDX11L1"; ref_gene_id "ENSG00000223972";

I've also tried searching the entire file for "transcript_biotype" but nothing comes up.

From this previous post, I saw that a potential fix might be to convert the gtf to bed12 and then annotate the bed12 using the Ensembl annotation file. However, I'm not sure exactly which bedtools function to use.

Would be great if anyone could point to a different solution.

Tags: rna-seq, assembly

Hey, same question here. Have you solved it?

Comment written 29 days ago by JRS
6 months ago, PeiwenLi (Canada) wrote:

Hi! I am trying to do the exact same task as you, and I found this post: "Gene feature information missing in Stringtie merged assembly". It may be helpful!
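
In the example from the question, the assembled record still carries an Ensembl transcript_id (ENST00000456328), so one possible approach is a plain lookup by transcript ID rather than going through bed12. Below is a minimal Python sketch of that idea, not a tested pipeline. It assumes both files are tab-separated GTF with quoted attribute values, and that matched transcripts keep the reference transcript_id as shown in the question. The function names are made up for illustration, and the assembled record in the demo is the line from the question with its transcript_id swapped to match the quoted reference line.

```python
import re

# Matches GTF attribute pairs such as: transcript_id "ENST00000456328";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(attr_field):
    """Turn the 9th GTF column into a dict (for repeated keys the last value wins)."""
    return dict(ATTR_RE.findall(attr_field))


def load_biotypes(reference_lines):
    """Build a transcript_id -> transcript_biotype map from reference GTF lines."""
    biotypes = {}
    for line in reference_lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a complete GTF record
        attrs = parse_attributes(fields[8])
        if "transcript_id" in attrs and "transcript_biotype" in attrs:
            biotypes[attrs["transcript_id"]] = attrs["transcript_biotype"]
    return biotypes


def annotate(assembled_lines, biotypes):
    """Yield assembled GTF lines, appending transcript_biotype when the ID is known."""
    for line in assembled_lines:
        line = line.rstrip("\n")
        fields = line.split("\t")
        if line.startswith("#") or len(fields) < 9:
            yield line  # pass comments and malformed lines through unchanged
            continue
        tid = parse_attributes(fields[8]).get("transcript_id")
        if tid in biotypes:
            fields[8] = fields[8].rstrip() + f' transcript_biotype "{biotypes[tid]}";'
        yield "\t".join(fields)


if __name__ == "__main__":
    # Reference record shortened from the question (tab-separated).
    reference = [
        '1\thavana\texon\t12975\t13052\t.\t+\t.\tgene_id "ENSG00000223972"; '
        'transcript_id "ENST00000450305"; '
        'transcript_biotype "transcribed_unprocessed_pseudogene";\n'
    ]
    # Illustrative assembled record: transcript_id changed to match the line above.
    assembled = [
        '1\tStringTie\texon\t12613\t12721\t1000\t+\t.\tgene_id "MSTRG.1"; '
        'transcript_id "ENST00000450305"; exon_number "2"; gene_name "DDX11L1";\n'
    ]
    for record in annotate(assembled, load_biotypes(reference)):
        print(record)
```

Both functions take any iterable of lines, so open file handles for the Ensembl GTF and the StringTie GTF can be passed in directly. Records whose transcript_id is not in the reference (for example novel transcripts with MSTRG-style IDs) are passed through unchanged and get no biotype.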
