Question: StringTie output GTF file only contains MSTRG gene IDs, not the original annotation gene IDs
7 months ago by
yiren0
yiren0 wrote:

Hi,

I am trying to quantify expression of novel genes and transcripts by following the StringTie manual. The stringtie --merge mode takes as input a list of all the assembled transcript files (in GTF format) previously obtained for each sample, as well as a reference annotation file (-G option). In the resulting merged.gtf file, the gene_id is an MSTRG identifier rather than the reference gene ID, as shown below.

chr1A   StringTie   transcript  4059    4397    1000    +   .   gene_id "MSTRG.3"; transcript_id "C1_00010W_A-T"; ref_gene_id "C1_00010W_A"; 
chr1A   StringTie   exon    4059    4397    1000    +   .   gene_id "MSTRG.3"; transcript_id "C1_00010W_A-T"; exon_number "1"; ref_gene_id "C1_00010W_A"; 
chr1A   StringTie   transcript  4409    5266    1000    -   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; 
chr1A   StringTie   exon    4409    4527    1000    -   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; exon_number "1"; 
chr1A   StringTie   exon    4556    5266    1000    -   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; exon_number "2"; 
chr1A   StringTie   transcript  4409    4720    1000    -   .   gene_id "MSTRG.4"; transcript_id "C1_00020C_A-T"; ref_gene_id "C1_00020C_A"; 
chr1A   StringTie   exon    4409    4720    1000    -   .   gene_id "MSTRG.4"; transcript_id "C1_00020C_A-T"; exon_number "1"; ref_gene_id "C1_00020C_A";

When I use Ballgown for differential expression, I get the file ballgown.gtf, in which the gene_id is also an MSTRG identifier rather than the ref_gene_id. An excerpt of the file is below.

chr1    StringTie   transcript  126521  126612  .   +   .   gene_id "MSTRG.71"; transcript_id "C1_00720W_A-T"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    StringTie   exon    126521  126556  .   +   .   gene_id "MSTRG.71"; transcript_id "C1_00720W_A-T"; exon_number "1"; cov "0.0";
chr1    StringTie   exon    126577  126612  .   +   .   gene_id "MSTRG.71"; transcript_id "C1_00720W_A-T"; exon_number "2"; cov "0.0";
chr1    StringTie   transcript  204985  205057  .   +   .   gene_id "MSTRG.103"; transcript_id "C1_01020W_A-T"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    StringTie   exon    204985  205057  .   +   .   gene_id "MSTRG.103"; transcript_id "C1_01020W_A-T"; exon_number "1"; cov "0.0";
chr1    CGD transcript  296046  296504  .   +   .   gene_id "C1_01500W_A"; transcript_id "C1_01500W_A-T"; cov "0.0"; FPKM "0.000000"; TPM "0.000000";
chr1    CGD exon    296046  296504  .   +   .   gene_id "C1_01500W_A"; transcript_id "C1_01500W_A-T"; exon_number "1"; cov "0.0";

How can I get a file in which the gene_id is the ref_gene_id? Any help would be much appreciated. Thank you very much!

rna-seq • 335 views
7 months ago by
MatthewP680
China
MatthewP680 wrote:

Hey, I recommend checking your input GTF files and your annotation GTF/GFF file. Maybe the chromosome names in your input GTF use the chr1 style while those in your annotation file use the 1 style.
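
One way to run that check is sketched below in Python. It only compares the first column (the sequence name) of two GTF/GFF files and reports names that appear in one file but not the other. The function names and the file names in the usage note are placeholders for illustration, not part of StringTie or Ballgown.

```python
def seqnames(gtf_lines):
    """Collect the distinct values of column 1 (sequence name) from GTF/GFF lines."""
    names = set()
    for line in gtf_lines:
        line = line.strip()
        # Skip blank lines and header/comment lines.
        if not line or line.startswith("#"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def compare_seqnames(assembled_lines, annotation_lines):
    """Report which sequence names are shared and which occur in only one input."""
    assembled = seqnames(assembled_lines)
    annotation = seqnames(annotation_lines)
    return {
        "only_in_assembly": sorted(assembled - annotation),
        "only_in_annotation": sorted(annotation - assembled),
        "shared": sorted(assembled & annotation),
    }
```

To use it on real files, open both and pass the file objects, for example `compare_seqnames(open("merged.gtf"), open("annotation.gtf"))` (hypothetical file names). If "shared" is empty or small, the naming styles probably differ.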


Thank you for your reply. The chromosome names in the input GTF files are the same as in the annotation file. Some genes have the reference gene ID as gene_id, but most genes have an MSTRG identifier.

written 7 months ago by yiren0
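
The thread does not settle why most loci keep MSTRG identifiers. As a possible workaround, and only a sketch under that caveat, the merged.gtf excerpt above suggests that transcript lines matching the reference carry a ref_gene_id attribute, so a lookup table from MSTRG gene_id to reference gene IDs can be built afterwards and used to label results. The Python below assumes tab-separated GTF lines with attributes in the `key "value";` form shown in the question; the function names are made up for illustration. One MSTRG locus may cover several reference genes, so each entry is a list.

```python
import re
from collections import defaultdict

# Matches GTF attributes of the form: key "value";
ATTR_RE = re.compile(r'(\S+) "([^"]*)";')


def parse_attributes(field):
    """Turn the 9th GTF column into a dict of attribute name -> value."""
    return dict(ATTR_RE.findall(field))


def mstrg_to_ref(gtf_lines):
    """Map each gene_id to the sorted reference gene IDs seen on its transcripts."""
    mapping = defaultdict(set)
    for line in gtf_lines:
        if line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        # Only transcript lines are needed; exon lines repeat the same IDs.
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id")
        ref_gene_id = attrs.get("ref_gene_id")
        if gene_id and ref_gene_id:
            mapping[gene_id].add(ref_gene_id)
    return {gene: sorted(refs) for gene, refs in mapping.items()}
```

Loci that contain only novel transcripts have no ref_gene_id and are left out of the table, so they keep their MSTRG identifiers.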