Question: Extracting original gene IDs from a reference annotation file
9 months ago, blooming.daisy333 wrote:

Hi,

I have used StringTie for transcript assembly. StringTie assigns its own labels (i.e., gene IDs and transcript IDs), while I need the original gene IDs. Can someone kindly suggest a way to get the original IDs for the assembled transcripts from the genome annotation file? The StringTie output and the genome annotation look like this:

StringTie output file:

chr1    StringTie   transcript  328661  330868  1000    +   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; 
chr1    StringTie   exon    328661  329729  1000    +   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; exon_number "1"; 
chr1    StringTie   exon    329840  330067  1000    +   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; exon_number "2"; 
chr1    StringTie   exon    330758  330868  1000    +   .   gene_id "MSTRG.4"; transcript_id "MSTRG.4.1"; exon_number "3"; 
chr1    StringTie   transcript  580963  583751  1000    -   .   gene_id "MSTRG.5"; transcript_id "MSTRG.5.1"; 
chr1    StringTie   exon    580963  582109  1000    -   .   gene_id "MSTRG.5"; transcript_id "MSTRG.5.1"; exon_number "1"; 
chr1    StringTie   exon    583479  583751  1000    -   .   gene_id "MSTRG.5"; transcript_id "MSTRG.5.1"; exon_number "2";

Genome annotation file:

chr4    GLEAN   mRNA    123284514   123288477   0.999991    -   .   ID=Cotton_A_18927_BGI-A2_v1.0;Name=Cotton_A_18927;source_id=CottonA_GLEAN_10022228;identical_support_id=CUFF67.1103.1;evid_id=Cot030308.1
chr4    GLEAN   CDS 123288376   123288477   .   -   0   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123287662   123287826   .   -   0   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123287427   123287536   .   -   0   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123287129   123287237   .   -   1   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123286939   123287051   .   -   0   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123286180   123286330   .   -   1   Parent=Cotton_A_18927_BGI-A2_v1.0
chr4    GLEAN   CDS 123284514   123285671   .   -   0   Parent=Cotton_A_18927_BGI-A2_v1.0
chr9    GLEAN   mRNA    17802711    17803334    1   +   .   ID=Cotton_A_16149_BGI-A2_v1.0;Name=Cotton_A_16149;source_id=CottonA_GLEAN_10030787;evid_id=Cot023903.1
chr9    GLEAN   CDS 17803146    17803334    .   +   0   Parent=Cotton_A_16149_BGI-A2_v1.0
chr9    GLEAN   CDS 17802984    17803035    .   +   1   Parent=Cotton_A_16149_BGI-A2_v1.0
chr9    GLEAN   CDS 17802711    17802862    .   +   0   Parent=Cotton_A_16149_BGI-A2_v1.0
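
One possible way to relate the two files is by coordinate overlap. The following is only a rough sketch, not a StringTie feature: it assumes that a StringTie transcript corresponds to a reference mRNA when both lie on the same chromosome and strand and their spans overlap. The function names `parse_reference_mrnas` and `map_stringtie_to_reference` are made up for illustration.

```python
import re
from collections import defaultdict


def parse_reference_mrnas(gff_lines):
    """Collect mRNA spans from GFF3 lines as {(chrom, strand): [(start, end, id)]}."""
    spans = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Split on any whitespace; column 9 (attributes) is kept whole.
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "mRNA":
            continue
        match = re.search(r"(?:^|;)ID=([^;]+)", cols[8])
        if match:
            ref_id = match.group(1).strip()
            spans[(cols[0], cols[6])].append((int(cols[3]), int(cols[4]), ref_id))
    return spans


def map_stringtie_to_reference(gtf_lines, ref_spans):
    """Return {StringTie transcript_id: sorted list of overlapping reference IDs}."""
    hits = {}
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split(None, 8)
        if len(cols) < 9 or cols[2] != "transcript":
            continue
        match = re.search(r'transcript_id "([^"]+)"', cols[8])
        if not match:
            continue
        start, end = int(cols[3]), int(cols[4])
        # Assumption: same chromosome, same strand, and any span overlap counts.
        candidates = ref_spans.get((cols[0], cols[6]), [])
        overlapping = [
            ref_id
            for ref_start, ref_end, ref_id in candidates
            if ref_start <= end and start <= ref_end
        ]
        hits[match.group(1)] = sorted(overlapping)
    return hits
```

Both functions take any iterable of lines, so open file handles can be passed directly (for example `parse_reference_mrnas(open("annotation.gff"))`, where the file name is a placeholder). Transcripts with no overlapping reference mRNA map to an empty list, and a transcript that overlaps several reference mRNAs lists all of them, so ambiguous cases still need manual review.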

Thanks in anticipation.


Can you please post the StringTie command you used?

written 9 months ago by Nitin Narwade

Yes, here it is:

stringtie <aligned_reads.bam> [options]*

And here is its manual:

http://ccb.jhu.edu/software/stringtie/index.shtml?t=manual

modified 9 months ago • written 9 months ago by blooming.daisy333