Question

Understanding learn_graph and ncenter in Monocle3

11 months ago

kkbarratt19 ▴ 10

Hi all

I have some questions about how Monocle3's learn_graph function and its ncenter parameter work.

Background: I'm using Monocle3 to calculate trajectories for a dataset that consists of two cohorts: control and treated. Cleaning, dimensionality reduction and clustering were all done in Seurat on the combined data, which was then subset into my two cohorts and transferred to Monocle3. Our biological data and initial Seurat analysis suggest that one of the treated cohort's cell populations is differentiating much more slowly than in the control, and that this is causing delays for downstream lineages.

Ultimately, I'd like to calculate trajectories for both cohorts and then compare the trajectory patterns and pseudotimes. But before I can figure out how to interpret any trajectory differences, I need to understand more about the learn_graph function.

Specifically, I'd like to know:

1. If I use learn_graph without also specifying learn_graph_control=list(ncenter=1000), what is the default ncenter value that is used to calculate the trajectory?
2. What is the ncenter value actually referring to? I saw a suggestion somewhere that it is the number of cells that learn_graph uses to calculate the trajectory, so a lower ncenter number = a less complex trajectory because it uses fewer cells. However, I could not find this suggestion confirmed anywhere.
3. How do we know the optimal ncenter value to use for our data? In my own data, an ncenter value of 650 in my control cohort generates a completely different major branch point compared to using 640 or 660 for the same cohort. There is no current biological data that can tell me which is the most likely trajectory.
4. Since I have two different cohorts I want to compare, is it best practice to use the same ncenter value for both cohorts, or should the ncenter be specific to each cohort (e.g. based on the number of cells per cohort)?

I also have some questions about labelling the branch points once the trajectory is calculated - namely, does a value of 1 mean it is the first branch followed by 2 and so on?

Fingers crossed someone can answer or direct me to the answer in the Monocle3 documentation.

Thanks for your help!

scRNA-seq monocle3 seurat trajectory • 971 views

updated 11 months ago by Kyle ▴ 10 • written 11 months ago by kkbarratt19 ▴ 10

score 0 · Answer 1 · 2023-05-22

Hi,

Not a dev, but:

There is no documentation for this in the source code of monocle3/R/learn_graph.r. Other parameters are described, but the ncenter description is empty.

The default ncenter value is NULL (meaning empty). If no ncenter value is provided, learn_graph counts the clusters and cells in the partition via:

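if(is.null(ncenter)) {
  # Count the clusters and cells in the current partition
  num_clusters_in_partition <-
    length(unique(clusters(cds, reduction_method)[colnames(X_subset)]))
  num_cells_in_partition = ncol(X_subset)
  # Derive the default from those two counts
  curr_ncenter <- cal_ncenter(num_clusters_in_partition, num_cells_in_partition)
  # Never use as many centers as there are cells
  if(is.null(curr_ncenter) || curr_ncenter >= ncol(X_subset)) {
    curr_ncenter <- ncol(X_subset) - 1
  }
}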

Then it calculates the default ncenter from those two counts using the cal_ncenter function:

cal_ncenter <- function(num_cell_communities, ncells,
                        nodes_per_log10_cells=15) {
  round(num_cell_communities * nodes_per_log10_cells * log10(ncells))
}
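To get a feel for what this default looks like, the formula can be evaluated by hand. Below is a minimal sketch in base R; the cluster and cell counts are made up for illustration and are not taken from your data. It also shows that two cohorts with the same number of clusters but different cell counts would get different default ncenter values.

```r
# Same formula as quoted above from the monocle3 source.
cal_ncenter <- function(num_cell_communities, ncells,
                        nodes_per_log10_cells = 15) {
  round(num_cell_communities * nodes_per_log10_cells * log10(ncells))
}

# Hypothetical cohorts: 5 clusters each, but different numbers of cells.
control_ncenter <- cal_ncenter(num_cell_communities = 5, ncells = 10000)
treated_ncenter <- cal_ncenter(num_cell_communities = 5, ncells = 1000)

print(control_ncenter)  # 5 * 15 * log10(10000) = 5 * 15 * 4 = 300
print(treated_ncenter)  # 5 * 15 * log10(1000)  = 5 * 15 * 3 = 225
```

So the default grows linearly with the number of clusters in the partition but only logarithmically with the number of cells.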

Secondly, you're better off checking whether there is a metric for how well the model fits, and then using that metric to choose the fit. Trajectory analysis normally has some sort of loss function, and the lower the value of that function, the better the fit.

Finally, the branch_points are a bit hard to figure out, but I believe the order is determined by the weight of the branch. That is, the first branch of a tree should have the most weight (so it will be labelled as 1, and so on).