To build a transcriptome assembly, I mapped the reads (2x75, strand-specific, organism Candida albicans) with TopHat2 and assembled them with Cufflinks. The command line for the latter is:
cufflinks --library-type fr-firststrand -o folder -g C_alb.gff -p 20 accepted_hits.bam
In the final file genes.fpkm_tracking I have only 713 transcripts, while my GFF contains around 6200 transcripts.
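For reference, the two numbers can be compared with something like the sketch below. It assumes the tracking file has a single header line and that the GFF is tab-delimited with the feature type in column 3. The function names are hypothetical helpers, and which feature type to count ("gene", "mRNA", ...) depends on how the annotation is written.

```shell
# Count data rows (all lines except the header) in a Cufflinks *.fpkm_tracking file.
count_tracking_rows() {
    awk 'NR > 1 { n++ } END { print n + 0 }' "$1"
}

# Count GFF features of a given type (column 3), skipping comment lines.
# The feature type passed as $2 is an assumption about the annotation.
count_gff_features() {
    awk -F'\t' -v type="$2" '!/^#/ && $3 == type { n++ } END { print n + 0 }' "$1"
}
```

With the paths from the command above, this would be called as `count_tracking_rows folder/genes.fpkm_tracking` and `count_gff_features C_alb.gff gene`. The first helper works the same way on isoforms.fpkm_tracking, which Cufflinks writes to the same output folder with one row per transcript.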
According to the manual, option -g "Tells Cufflinks to use the supplied reference annotation (a GFF file) to guide RABT assembly. Reference transcripts will be tiled with faux-reads to provide additional information in assembly. Output will include all reference transcripts as well as any novel genes and isoforms that are assembled." So in theory, genes.fpkm_tracking should contain all transcripts from the GFF plus any new transcripts.
Can somebody explain why I see fewer transcripts than in the GFF?