I am working on an RNA-Seq data set from Drosophila melanogaster.
We have 10x3 replicates for 12 different time points, and we would like to cluster the genes according to their expression behaviour over time. For that we use the Mfuzz package, because we want to use its soft clustering option. After several trial runs we decided to use 40 clusters.
I have a few questions about the correct usage of the Mfuzz package.
1. When we run Mfuzz repeatedly on the same data set, we get similar clusters each time, but in a different order (e.g. what is "cluster 1" in the first run comes out as "cluster 23" in the next run on the same data). Is there a way to make sure that cluster 1 always ends up at position 1, or is the order set randomly?
2. As mentioned above, we would like to use the soft clustering option to see which genes can belong to more than one cluster. We tried everything from a low number of clusters to a very high number (over 200). But even with a low number of clusters, no gene appeared more than once in the Mfuzz output (core list). Is there a specific parameter to set for a soft clustering run, or does Mfuzz have its own threshold for deciding whether a gene can be put into several groups?
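To make the two questions concrete, here is a toy sketch in plain Python. It is not Mfuzz code and not our real data: the membership values, the centre profiles, the 0.3 cutoff and all function names are made up for illustration. The first part shows the difference between assigning each gene only to its best cluster and assigning it to every cluster above a cutoff, which is the behaviour we expected from soft clustering. The second part shows the kind of stable numbering we are after, assuming the clusters can simply be sorted by their centre profiles.

```python
# Toy illustration only: made-up numbers, not Mfuzz code or Mfuzz output.

# Membership of each gene in each of 3 clusters (each row sums to 1).
membership = {
    "geneA": [0.80, 0.15, 0.05],
    "geneB": [0.45, 0.40, 0.15],
    "geneC": [0.10, 0.35, 0.55],
}


def hard_assignment(membership):
    """Each gene goes only to its highest-membership cluster (no duplicates)."""
    return {
        gene: max(range(len(values)), key=values.__getitem__)
        for gene, values in membership.items()
    }


def soft_assignment(membership, cutoff):
    """Each gene goes to every cluster whose membership reaches the cutoff."""
    return {
        gene: [k for k, value in enumerate(values) if value >= cutoff]
        for gene, values in membership.items()
    }


def canonical_order(centers):
    """Cluster indices sorted by centre profile, so the same set of centres
    gets the same numbering no matter in which order a run returned them."""
    return sorted(range(len(centers)), key=lambda k: centers[k])


hard = hard_assignment(membership)
soft = soft_assignment(membership, cutoff=0.3)  # 0.3 is an arbitrary choice

# The same three centres (expression at 3 time points) in two different orders,
# standing in for two runs on the same data.
run1 = [(0.0, 1.0, 2.0), (2.0, 1.0, 0.0), (1.0, 1.0, 1.0)]
run2 = [(1.0, 1.0, 1.0), (0.0, 1.0, 2.0), (2.0, 1.0, 0.0)]
ordered1 = [run1[k] for k in canonical_order(run1)]
ordered2 = [run2[k] for k in canonical_order(run2)]

print(hard)
print(soft)
print(ordered1 == ordered2)
```

In this toy case geneB and geneC each land in two clusters under the cutoff but in only one under the best-cluster rule. Sorting by centre profile is only a sketch: in real runs the centres differ slightly between runs, so exact sorting may not give identical numbering.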
Thanks in advance for any hints or help.