I'm working with a data set that is missing a lot of data due to quality issues, so many of the transcript FPKM values are scored as 0. This appears to confound the significance matrix, and I end up with thousands of genes marked as significant at alpha=0.05.
What I would like to do is filter the cuffset to exclude genes (rows) whose FPKM is 0 across all samples, or 0 in the query sample.
My current approach is a roundabout way of generating a filtered cuffgeneset. However, the sigMatrix() function is only implemented for cuffset objects, so I cannot generate the matrix from the cuffgeneset.
My strategy is as follows:
#get gene matrix for all
#score for any row where all values are 0, or query samples are 0
> test <- apply(gene.matrix, 1, function(x) all(x[1:5]==0) | x == 0 | x == 0)
#apply to matrix
> test1 <- gene.matrix[!test,]
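To make the intended filter concrete, here is a minimal base R sketch on a toy matrix. The sample names (`s1` to `s5`), the gene names, and `query.col` are hypothetical stand-ins for illustration, not values from my real data.

```r
# Toy FPKM matrix standing in for gene.matrix (rows = genes, columns = samples).
gene.matrix <- matrix(
  c(0, 0, 0, 0, 0,
    3, 1, 0, 2, 5,
    4, 0, 7, 0, 1,
    2, 6, 1, 8, 3),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC", "geneD"),
                  c("s1", "s2", "s3", "s4", "s5"))
)

# Hypothetical name of the query sample column (an assumption for this sketch).
query.col <- "s3"

# TRUE for rows that are zero in every sample, or zero in the query sample.
drop.row <- rowSums(gene.matrix != 0) == 0 | gene.matrix[, query.col] == 0

# Keep the remaining rows; drop = FALSE preserves the matrix shape
# even if only one row survives.
filtered <- gene.matrix[!drop.row, , drop = FALSE]
rownames(filtered)
```

The all-zero test is strictly redundant here, since a row that is zero everywhere is also zero in the query sample, but keeping both conditions mirrors the filter as I described it.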
#get significantly regulated genes
#get common list of gene names that are significant and where value of query is not 0
> test4 <- Reduce(intersect, list(mySigGeneIds,rownames(test1)))
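For reference, the intersection step behaves like this on toy vectors. The gene IDs are made up, and `expressed.ids` is a hypothetical stand-in for the row names of the filtered matrix.

```r
# Made-up IDs: genes called significant, and genes that survived the zero filter.
mySigGeneIds  <- c("geneB", "geneC", "geneD", "geneE")
expressed.ids <- c("geneC", "geneD", "geneF")

# Reduce(intersect, ...) folds intersect() over the list, so it also works
# if further ID lists are added later.
common <- Reduce(intersect, list(mySigGeneIds, expressed.ids))

# With exactly two vectors this is equivalent to a plain intersect() call.
identical(common, intersect(mySigGeneIds, expressed.ids))
```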
#build a gene set of those that are significantly regulated and for which we have a value for query
I'm wondering if you know of a way to apply a filter to the cuffset object to get a subset cuffset instead. Or is there a way to generate a sigmatrix from a cuffgeneset?
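One workaround I can imagine is to rebuild the pairwise counts by hand from a differential-expression table. The sketch below uses a toy data frame; I am assuming the table has `gene_id`, `sample_1`, `sample_2` and `q_value` columns, and that significance can be decided by comparing `q_value` with alpha. Neither assumption is checked against what sigMatrix() does internally, and `keep.ids` is a hypothetical list of genes that passed the zero filter.

```r
# Toy differential-expression table (column names are an assumption).
diff.tbl <- data.frame(
  gene_id  = c("geneA", "geneB", "geneC", "geneC", "geneD", "geneD"),
  sample_1 = c("s1", "s1", "s1", "s2", "s1", "s2"),
  sample_2 = c("s2", "s3", "s2", "s3", "s3", "s3"),
  q_value  = c(0.001, 0.01, 0.02, 0.20, 0.03, 0.04),
  stringsAsFactors = FALSE
)

# Hypothetical IDs of genes that survived the zero filter.
keep.ids <- c("geneC", "geneD")
alpha    <- 0.05

# Keep only tests on retained genes that pass the significance threshold.
kept <- diff.tbl[diff.tbl$gene_id %in% keep.ids & diff.tbl$q_value < alpha, ]

# Cross-tabulate the sample pairs; fixing the factor levels keeps
# empty pairs in the table as zero counts.
samples    <- c("s1", "s2", "s3")
sig.counts <- table(factor(kept$sample_1, levels = samples),
                    factor(kept$sample_2, levels = samples))
sig.counts
```

This only gives the counts as a table, not the plot, so I would still prefer a way to subset the cuffset itself.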
Thanks for your time,