I have recently encountered some CyTOF data (human) and I'm trying to determine the best tools available for analysis. I found a couple of review articles, but I wanted to get folks' advice on which tools they are happy with and what works best. I'm open to commercial or free/open source tools. I would prefer something written in R so that I can configure it. One requirement is that the tool needs to handle larger, high-throughput data sets. I reviewed the CATALYST package in Bioconductor, which looks good. Any suggestions folks have are appreciated. Thanks -Rich
It is possible to process the data using custom scripts, as I show here: https://github.com/kevinblighe/cytofNet
Other than that, from what I can see, the majority of people use Cytobank, which was recently sold by its co-founders. So, I don't know what that means for users who wish to use Cytobank in the future, nor what it means for users whose data has been deposited in Cytobank in good faith.
reading in data
Arguably the most important R package that you'll need is flowCore, which can read the FCS files for you. Everything after that really can be done manually.
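As a minimal sketch of that first step, assuming flowCore is installed from Bioconductor, the helper below reads one FCS file and returns the raw expression matrix (cells in rows, channels in columns). The name `read_fcs_matrix` is my own and not part of flowCore. `transformation = FALSE` and `truncate_max_range = FALSE` are set so that the raw values are not altered on import.

```r
# Hypothetical helper (not part of flowCore): read one FCS file into a
# cells x channels matrix of raw intensities.
read_fcs_matrix <- function(path) {
  if (!requireNamespace("flowCore", quietly = TRUE)) {
    stop("flowCore is required: BiocManager::install('flowCore')")
  }
  ff <- flowCore::read.FCS(path,
                           transformation = FALSE,
                           truncate_max_range = FALSE)
  flowCore::exprs(ff)
}

# Usage (the file name is a placeholder):
# x <- read_fcs_matrix("sample1.fcs")
# dim(x)
```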
After reading in the data, you need to normalise it and eliminate junk cells. As you know, CyTOF data is typically transformed with the hyperbolic arc-sine using a cofactor of 5, i.e. asinh(x/5).
    # Set background noise threshold - values below this are set to 0
    BackgroundNoiseThreshold <- 1

    # Euclidean norm threshold - this is the square root of the sum of all the squares
    EuclideanNormThreshold <- 1

    # Choose a transformation function (any mathematical function)
    transFun <- function (x) asinh(x)

    # Set hyperbolic arc-sine factor (NB - asinh(x/5) is recommended for CyTOF and FACS data)
    asinhFactor <- 5

    x <- x[apply(x, 1, FUN=function(x) sqrt(sum(x^2)))>EuclideanNormThreshold,]
    NoiseCorrected <- x
    NoiseCorrected[NoiseCorrected<BackgroundNoiseThreshold] <- 0
    x <- transFun(NoiseCorrected/asinhFactor)
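To see what these steps do to actual numbers, here is a small self-contained sketch that wraps them in a function and runs it on a toy matrix. The function name, argument names, marker names and values are all made up for illustration. The only intended difference is `drop = FALSE`, which keeps the result a matrix even if a single cell survives the filter.

```r
# Hypothetical wrapper around the filtering and transformation steps;
# the function and argument names are illustrative only.
clean_and_transform <- function(x, noise_threshold = 1,
                                norm_threshold = 1, cofactor = 5) {
  # Keep only cells whose Euclidean norm across all markers exceeds the threshold
  keep <- sqrt(rowSums(x^2)) > norm_threshold
  x <- x[keep, , drop = FALSE]
  # Set intensities below the background noise threshold to 0
  x[x < noise_threshold] <- 0
  # Hyperbolic arc-sine transform with the chosen cofactor
  asinh(x / cofactor)
}

# Toy input: 3 cells x 2 markers (made-up values)
toy <- matrix(c(0.2, 10, 50,
                0.3, 0.5, 100),
              ncol = 2, dimnames = list(NULL, c("CD3", "CD19")))
out <- clean_and_transform(toy)
print(out)
```

The first toy cell has a norm below 1 and is dropped as junk. In the second cell, the CD19 value of 0.5 falls below the noise threshold and is set to 0 before the transformation.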
Once you have read in and normalised your data, you can bind together samples that represent the same condition. The world, after that, really is your oyster:
- tSNE (Rtsne)
- UMAP (umap)
- PhenoGraph (Rphenograph)
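A rough sketch of how binding samples and then calling these packages might look is below. The two "samples" are simulated stand-ins for transformed matrices, and the sample names, marker names and parameter values (`perplexity = 30`, `k = 30`) are assumptions for illustration, not recommendations. Each package call only runs if that package is installed.

```r
# Simulated stand-ins for two transformed samples (cells x markers)
set.seed(1)
markers <- c("CD3", "CD4", "CD8")
samples <- list(
  control_1 = matrix(rnorm(300), ncol = 3, dimnames = list(NULL, markers)),
  treated_1 = matrix(rnorm(300, mean = 1), ncol = 3, dimnames = list(NULL, markers))
)

# Bind samples row-wise and record which sample each cell came from
combined <- do.call(rbind, samples)
sample_id <- rep(names(samples), times = vapply(samples, nrow, integer(1)))

# Optional steps: each runs only if the package is available
if (requireNamespace("Rtsne", quietly = TRUE)) {
  tsne <- Rtsne::Rtsne(combined, dims = 2, perplexity = 30,
                       check_duplicates = FALSE)
  tsne_coords <- tsne$Y                    # 2D coordinates, one row per cell
}
if (requireNamespace("umap", quietly = TRUE)) {
  umap_coords <- umap::umap(combined)$layout
}
if (requireNamespace("Rphenograph", quietly = TRUE)) {
  pg <- Rphenograph::Rphenograph(combined, k = 30)
  clusters <- igraph::membership(pg[[2]])  # cluster label per cell
}
```

Keeping `sample_id` alongside `combined` makes it possible to colour the embedding by sample or condition afterwards.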
clustering and heatmaps
You can even utilise Seurat functionality to identify clusters in your data, specifically
Here are some products of my own CyTOF scripts: