Thanks for a great package.
In the tutorial you explain that the AUCell_run command uses, by default, only the top 5% of the genes in the ranking to calculate the scores per gene set, for faster execution, but you recommend checking the distribution of the number of detected genes before using it on "real" data.
In the tutorial this is done with the histogram and the output of the AUCell_buildRankings function.
This is what the quantiles table looks like for the tutorial data:
## Quantiles for the number of genes detected by cell:
## (Non-detected genes are shuffled at the end of the ranking. Keep it in mind when choosing the threshold for calculating the AUC).
##     min      1%      5%     10%     50%    100%
##  193.00  271.08  364.20  447.40  921.00 2056.00
Do I understand correctly that, according to this table, 5% of the cells detect 364.2 genes per cell out of a total of 2056 genes (which would be 100%)?
How do I decide on the threshold for calculating the AUC (the percentage of genes to use) based on these values?
Thanks for the help.