Question: Cuffcompare: Unable To Map Reference Gene Names To Cufflinks Output
joseph.elsherbini (United States) wrote, 6.3 years ago:

I ran cufflinks on three bacterial RNA-seq samples and want to use cuffcompare to get the "union" of the transcripts and to map them to the annotated reference genome. I did get the unioned list of transcripts (which I will use with cuffdiff to look for differentially expressed transcripts), but the genes are not annotated. Can anyone suggest what the issue might be?

tl;dr: My cuffcompare output has XLOC_XXXXXX as gene ids instead of the reference gene names from the annotation file.

The command I tried is:

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf' 

The genome.gff file looks like the following:

head genome.gff > 
##gff-version 3
#!gff-spec-version 1.20
#!processor NCBI annotwriter
##sequence-region NC_002505.1 1 2961149
NC_002505.1    RefSeq    region    1    2961149    .    +    .    ID=id0;Dbxref=taxon:243277;Is_circular=true;biotype=El Tor;chromosome=I;gbkey=Src;genome=chromosome;mol_type=genomic DNA;old-name=Vibrio cholerae O1 biovar eltor str. N16961;serotype=O1;strain=N16961
NC_002505.1    RefSeq    gene    235    402    .    -    .    ID=gene0;Name=VC0001;Dbxref=GeneID:2614109;gbkey=Gene;locus_tag=VC0001

And the transcripts.gtf files look like:

head 'A2_cuffout/transcripts.gtf' >
gi|15600771|ref|NC_002506.1|    Cufflinks    transcript    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    286    994    1000    .    .    gene_id "CUFF.1"; transcript_id "CUFF.1.1"; exon_number "1"; FPKM "4.5531694820"; frac "1.000000"; conf_lo "2.465004"; conf_hi "4.663521"; cov "19.278874";

My combined.gtf output file looks like:

gi|15600771|ref|NC_002506.1|    Cufflinks    exon    10    4310    .    .    .    gene_id "XLOC_000001"; transcript_id "TCONS_00000457"; exon_number "1"; oId "CUFF.1.1"; class_code "."; tss_id "TSS1";
gi|15600771|ref|NC_002506.1|    Cufflinks    exon    4500    8484    .    .    .    gene_id "XLOC_000002"; transcript_id "TCONS_00000005"; exon_number "1"; oId "CUFF.5.1"; class_code "u"; tss_id "TSS6";

So the gene id is XLOC_000001 instead of VC0001 or something similar.
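
One thing that can be checked directly from the excerpts above is whether the reference and the transcript files name their sequences the same way in column 1: the GFF excerpt shows `NC_002505.1`, while the GTF excerpts show `gi|15600771|ref|NC_002506.1|`. The thread does not confirm that this is the cause, so the following is only a minimal diagnostic sketch in Python. The helper names `seqnames` and `bare_accession` are hypothetical, and the sample lines are abbreviated copies of the ones posted.

```python
def seqnames(lines):
    """Collect the distinct column-1 sequence names from GFF/GTF lines."""
    names = set()
    for line in lines:
        # Skip blank lines and '#' header/comment lines.
        if not line.strip() or line.startswith("#"):
            continue
        # Column 1 is the first whitespace-delimited field.
        names.add(line.split(None, 1)[0])
    return names


def bare_accession(name):
    """Reduce 'gi|<number>|ref|<accession>|' to '<accession>'.

    Names that do not follow that pattern are returned unchanged.
    """
    parts = name.split("|")
    if len(parts) >= 4 and parts[0] == "gi":
        return parts[3]
    return name


# Abbreviated lines taken from the excerpts in the question.
reference_lines = [
    "##gff-version 3",
    "NC_002505.1\tRefSeq\tgene\t235\t402\t.\t-\t.\tID=gene0;Name=VC0001",
]
transcript_lines = [
    'gi|15600771|ref|NC_002506.1|\tCufflinks\ttranscript\t286\t994\t1000\t.\t.\tgene_id "CUFF.1";',
]

ref_names = seqnames(reference_lines)
tx_names = seqnames(transcript_lines)

# Names present in both files exactly as written.
shared = ref_names & tx_names
print("reference:", sorted(ref_names))
print("transcripts:", sorted(tx_names))
print("shared:", sorted(shared))
```

On real data, the same functions can be fed the open file objects for `genome.gff` and each `transcripts.gtf`. Note that the two excerpts also happen to show different accessions (`NC_002505.1` versus `NC_002506.1`), so only a comparison over the full files would say whether the names overlap.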

madkitty wrote, 6.1 years ago:

I see a ' missing in your command: the second transcripts.gtf path, EA349_1cuffout/transcripts.gtf', has a closing quote but no opening one.

cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'

Maybe cuffcompare couldn't read your reference file properly.
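
The quoting can be checked mechanically. Below is a minimal sketch using Python's standard `shlex` module, which follows POSIX shell quoting rules and raises `ValueError` when a quote is left open. The helper name `check_quoting` is hypothetical, and the `fixed` command assumes the intended form simply adds the opening quote before `EA349_1cuffout`.

```python
import shlex


def check_quoting(command):
    """Return the parsed argument list, or None if the quoting is unbalanced."""
    try:
        return shlex.split(command)
    except ValueError:
        # shlex raises ValueError when a quotation is never closed.
        return None


# The command exactly as posted in the question.
posted = (
    "cuffcompare  -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
    "EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'"
)

# Assumed intended form: opening quote added before EA349_1cuffout.
fixed = (
    "cuffcompare -r 'genome.gff' 'A2_cuffout/transcripts.gtf' "
    "'EA349_1cuffout/transcripts.gtf' 'EA349_2cuffout/transcripts.gtf'"
)

print("posted:", check_quoting(posted))
print("fixed:", check_quoting(fixed))
```

With an odd number of quotes, an interactive shell would usually wait for more input instead of running the command, so the missing quote may be a typo in the post rather than in the command that was actually run.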
