In the Cuffdiff gene_exp.diff file, I am looking for genes with zero FPKM for one of my samples and some level of expression in the other sample.
If I were to take all of the genes with FPKM > 0 in one sample and exactly zero FPKM in the other, I would hopefully have all of the genes expressed specifically in one of the samples, as required.
However, I would like to be more conservative than this. If, for example, a gene has an FPKM of 300 in one sample and 0.011291 in the other, one might conclude by eye that it is probably specific to the first sample and that the 0.011291 is just the result of a few incorrectly mapped reads. I would like to include such genes in the set of sample-specific genes as well, just in case.
To do this, though, I would have to pick a somewhat arbitrary cut-off. I was thinking of treating anything with FPKM < 0.1 in the "off" sample as not expressed.
How else can I do this, or is it just a decision I will have to make and then report the method being used?
What might be a good cut-off to use?
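
For concreteness, here is a rough sketch of the filter I have in mind, in Python with pandas. It assumes the usual gene_exp.diff columns (gene_id, sample_1, sample_2, value_1, value_2). The function name `sample_specific`, the `min_high` floor for the "on" sample, and the demo rows are made up for illustration, and the 0.1 default is just the cut-off floated above, not a recommended value.

```python
import io
import pandas as pd

def sample_specific(df, low=0.1, min_high=1.0):
    """Keep genes that are 'on' in one sample and (near) zero in the other.

    low      -- FPKM below this counts as not expressed (arbitrary cut-off)
    min_high -- hypothetical floor for the 'on' sample, so that two
                near-zero values do not count as sample-specific
    """
    v1, v2 = df["value_1"], df["value_2"]
    only_1 = (v1 >= min_high) & (v2 < low)
    only_2 = (v2 >= min_high) & (v1 < low)
    hits = df[only_1 | only_2].copy()
    # Record which sample the gene appears to be specific to.
    hits["specific_to"] = df["sample_1"].where(only_1, df["sample_2"]).loc[hits.index]
    return hits

# Made-up rows using a subset of the gene_exp.diff columns.
# A real file would be loaded with: pd.read_csv("gene_exp.diff", sep="\t")
header = ["gene_id", "sample_1", "sample_2", "value_1", "value_2"]
rows = [
    ["g1", "q1", "q2", "300", "0.011291"],  # the example from above
    ["g2", "q1", "q2", "0", "12.5"],        # exactly zero in q1
    ["g3", "q1", "q2", "5.0", "4.0"],       # expressed in both
    ["g4", "q1", "q2", "0.05", "0.02"],     # near zero in both
    ["g5", "q1", "q2", "0.5", "0"],         # weakly 'on' in q1 only
]
text = "\n".join("\t".join(r) for r in [header] + rows)
demo = pd.read_csv(io.StringIO(text), sep="\t")

hits = sample_specific(demo)
print(hits[["gene_id", "value_1", "value_2", "specific_to"]])
```

With these defaults only g1 and g2 are kept. Lowering `min_high` to 0.2 would also pull in g5, which is the kind of sensitivity to the cut-off that I am worried about.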