I am having trouble getting Cufflinks to report my transcripts using the accession numbers provided by my *.gtf file. The 9th column of my *.gtf file looks like this: geneid "ACYPI006883"; transcriptid "ACYPI006883"; gene_name "ACYPI006883"; so I believe that the *.gtf file is properly formatted.
I have run with both the -g and -G options. With -G, it lists the transcripts with the gene ids, but every FPKM = 0 and nothing else is reported. With -g, it lists the named genes with FPKM = 0 (as with -G), and only the CUFF_ID genes have FPKM values. It's almost as if Cufflinks is telling me that it cannot match any of the assembled transcripts to my *.gtf file.
I have also tried running Cuffcompare using the -r option, thinking that it might be able to link up the accession numbers, but it doesn't.
Any suggestions?
My fasta genome scaffold names are not the same as the first column of my gtf file. Should these be exactly the same?
In my experience, yes. I believe an exact string match is how Cufflinks figures out which annotations belong to which scaffold/replicon.
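If it helps, here is a rough Python sketch for checking whether the names line up. It is not part of Cufflinks; the function names are made up for illustration, and it assumes the scaffold name is the FASTA header text up to the first whitespace (which may not match how your aligner names the references) and that the GTF is tab-delimited with the scaffold name in column 1.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA header lines.

    Assumption: the name is the text after '>' up to the first whitespace.
    """
    names = set()
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.add(fields[0])
    return names


def gtf_seqnames(lines):
    """Collect the first (seqname) column from tab-delimited GTF lines."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comments and blank lines
        names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def compare_names(fasta_lines, gtf_lines):
    """Return (names only in the GTF, names only in the FASTA), each sorted."""
    fa = fasta_names(fasta_lines)
    gtf = gtf_seqnames(gtf_lines)
    return sorted(gtf - fa), sorted(fa - gtf)


# Example usage with hypothetical file names:
# with open("genome.fa") as fa, open("annotation.gtf") as gtf:
#     only_gtf, only_fasta = compare_names(fa, gtf)
#     print("In GTF but not FASTA:", only_gtf[:10])
#     print("In FASTA but not GTF:", only_fasta[:10])
```

The comparison is an exact, case-sensitive string match, so any name that shows up in the first list is one whose annotations would have nothing to attach to.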