Question: Cuffcompare: Unable To Map Reference Gene Names To Cufflinks Output
joseph.elsherbini (United States) wrote 5.4 years ago:

I ran cufflinks on three bacterial RNA-seq samples and want to use cuffcompare to get the "union" of the transcripts and to map them to the annotated reference genome. I did get the unioned list of transcripts (which I will use with cuffdiff to look for differentially expressed transcripts), but the genes are not annotated. Can anyone suggest what the issue might be?

tl;dr: My cuffcompare output has XLOC_XXXXXX as gene IDs instead of the reference gene names from the annotation file.

The command I tried is:

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf' 

The genome.gff file looks like the following:

head genome.gff > 
##gff-version 3
#!gff-spec-version 1.20
#!processor NCBI annotwriter
##sequence-region NC_002505.1 1 2961149
##species http://www.ncbi.nlm.nih.gov/Taxonomy/Browser/wwwtax.cgi?id=243277
NC_002505.1    RefSeq    region    1    2961149    .    +    .    ID=id0;Dbxref=taxon:243277;Is_circular=true;biotype=El Tor;chromosome=I;gbkey=Src;genome=chromosome;mol_type=genomic DNA;old-name=Vibrio cholerae O1 biovar eltor str. N16961;serotype=O1;strain=N16961
NC_002505.1    RefSeq    gene    235    402    .    -    .    ID=gene0;Name=VC0001;Dbxref=GeneID:2614109;gbkey=Gene;locus_tag=VC0001
...

And the transcripts.gtf files look like:

head 'A2_cuffout/transcripts.gtf' >
gi|15600771|ref|NC_002506.1|    Cufflinks    transcript    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";
...

My combined.gtf output file looks like:

gi|15600771|ref|NC_002506.1|    Cufflinks    exon    10    4310    .    .    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000457"; exon_number "1"; oId "CUFF.1.1"; class_code "."; tss_id "TSS1";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    4500    8484    .    .    .    gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "1"; oId "CUFF.5.1"; class_code "u"; tss_id "TSS6";

So the gene id is XLOC_000001 instead of VC0001 or something similar.

madkitty (Canada) wrote 5.2 years ago:

I see a ' missing in your command, just before EA349_1cuffout/transcripts.gtf:

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf' 
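
To see what the unbalanced quote does, here is a minimal sketch using Python's `shlex` module, which splits a string using POSIX-like quoting rules. It only approximates what a real shell does, and the helper name `split_or_error` is made up for this illustration. The "fixed" command simply restores the opening quote; it is not a claim about what was actually run.

```python
import shlex

# The command exactly as posted in the question (7 single quotes, so one is unpaired).
posted = ("cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
          "EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf' ")

# The same command with the missing opening quote restored.
fixed = ("cuffcompare -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
         "'EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'")


def split_or_error(command):
    """Return the parsed argument list, or the parser's error message as a string."""
    try:
        return shlex.split(command)
    except ValueError as exc:
        return str(exc)


posted_result = split_or_error(posted)
fixed_result = split_or_error(fixed)

print(posted_result)  # an error message about an unclosed quotation
print(fixed_result)   # six separate arguments
```

With the posted text the parser never finds a closing quote, and an interactive shell would normally sit waiting for one instead of starting cuffcompare. So if cuffcompare did run and produced a combined.gtf, the typo may only be in the post and not in the command that was executed.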

Maybe it could not read your reference file properly.
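
One thing that could be checked on the reference side: in the excerpts you posted, the first column of genome.gff reads NC_002505.1, while the first column of transcripts.gtf reads gi|15600771|ref|NC_002506.1|. As far as I understand, cuffcompare can only relate a transcript to a reference gene when both sit on a sequence with the same name, so a naming mismatch like this would leave everything unannotated. The excerpts are only the first lines of each file, so this is a hint and not a diagnosis. Below is a rough sketch for comparing the sequence names of two files; the function names `seqids` and `compare_seqids` are hypothetical, and the demo writes two tiny files built from the lines in your post.

```python
import os
import tempfile


def seqids(path):
    """Collect the distinct values of column 1 (sequence name) in a GFF/GTF file."""
    names = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments and blank lines
            names.add(line.split(None, 1)[0])
    return names


def compare_seqids(reference_path, transcripts_path):
    """Report which sequence names are shared and which occur in only one file."""
    ref = seqids(reference_path)
    trn = seqids(transcripts_path)
    return {
        "shared": ref & trn,
        "reference_only": ref - trn,
        "transcripts_only": trn - ref,
    }


# Demo on two small files made from lines quoted in the question (attributes shortened).
with tempfile.TemporaryDirectory() as tmp:
    gff = os.path.join(tmp, "genome.gff")
    gtf = os.path.join(tmp, "transcripts.gtf")
    with open(gff, "w") as out:
        out.write("##gff-version 3\n")
        out.write("NC_002505.1\tRefSeq\tgene\t235\t402\t.\t-\t.\tID=gene0;Name=VC0001\n")
    with open(gtf, "w") as out:
        out.write('gi|15600771|ref|NC_002506.1|\tCufflinks\ttranscript\t286\t994\t1000\t.\t.\t'
                  'gene_id "CUFF.1"; transcript_id "CUFF.1.1";\n')
    report = compare_seqids(gff, gtf)

print(report)
```

If "shared" comes back empty on the real files, the names in one of the two files would need to be made consistent with the other before cuffcompare can match anything.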
