Question: Adding p value to Vlnplot in Seurat
0
gravatar for ww22runner
6 months ago by
ww22runner0
ww22runner0 wrote:

Hello everyone,

Could anyone help me understand if there is any way I can add p values to violin plots produced by Seurat? Currently, I am trying to plot several genes across all clusters, broken up by responders vs non-responders and this is my code:

vp_case1 <- function (gene_signature, file_name){
  plot_case1 <- function(signature){
    VlnPlot(pbmc, features = signature,
            pt.size = 0.1, 
            group.by = "Response"
            ) + stat_compare_means(comparisons = pbmc$Response, label = "p.signif")
  }

  map(gene_signature, plot_case1) %>% cowplot::plot_grid(plotlist = .)
  file_name <- paste0(file_name, "_r.png")
  ggsave(file_name, width = 14, height = 8)
}

 vp_case1(gene_sig, "gene_sig")

where gene_sig is a vector of a few genes of interest.

However, I get the following error:

Warning messages:
1: Computation failed in `stat_signif()`:
not enough 'y' observations 
2: Computation failed in `stat_signif()`:
not enough 'y' observations 
.
.
.

How could I do this better?

Thank you!

seurat • 1.4k views
ADD COMMENTlink modified 6 months ago by antonioggsousa2.0k • written 6 months ago by ww22runner0

Have you checked if VlnPlot and stat_compare_means can work together like that?

ADD REPLYlink written 6 months ago by igor12k

Hi igor,

Please correct me if I am wrong but it says VlnPlot returns "a patchworked ggplot object if combine = TRUE; otherwise, a list of ggplot objects" while stat_compare_mean can be used to "add mean comparison p-values to a ggplot" so I thought it should work.

ADD REPLYlink written 6 months ago by ww22runner0
4
gravatar for antonioggsousa
6 months ago by
antonioggsousa2.0k
antonioggsousa2.0k wrote:

Hi,

The problem that you got is related with the fact that you need to provide a list of comparisons to be made to the function stat_compare_means(), and you are providing a character/factor variable. So, if you adjust your code to:

vp_case1 <- function(gene_signature, file_name, test_sign, y_max){
plot_case1 <- function(signature){
    VlnPlot(pbmc, features = signature,
            pt.size = 0.1, 
            group.by = "Response", 
            y.max = y_max, # add the y-axis maximum value - otherwise p-value hidden
    ) + stat_compare_means(comparisons = test_sign, label = "p.signif")
  }
  map(gene_signature, plot_case1) %>% cowplot::plot_grid(plotlist = .)
  file_name <- paste0(file_name, "_r.png")
  ggsave(file_name, width = 14, height = 8)
}

In this adjusted code I added 2 extra parameters:

  • test_sign: that is a list of a vector of characters. So, in your case you have a Response variable that should assume 2 or more levels. Let's say that your Response variable can be CTRL or TRT (control or treatment). Thus test_sign = list(c("CTRL", "TRT")).

  • y_max: by default the function VlnPLot will adjust the y-axis to the maximum expression value per comparison made. What this means is that if you're comparing the gene A between CTRL and TRT and the max. value is 7, the maximum y-axis scale will be 7. When you apply the stat_compare_means() function, the p-value will be added above, but since the maximum y-axis value is 7, you'll not see it. Therefore, you can adjust the y_max to one more unit, let's say 8, and therefore you'll be available to see the p-value. Though as you can think this is not the best option since often you'll have distinct y maximum values, and therefore setting the same y_max will not be the best option. In that case what you can do is just a loop instead a map() giving a y_max a numeric vector. Let me know if you have problems regarding this issue.

Here is an example:

## Follow the Seurat tutorial until UMAP at: https://satijalab.org/seurat/v3.2/pbmc3k_tutorial.html

## create fake 'Reponse' variable
pbmc@meta.data$Response <- rep(c("CTRL", "TRT"), times = c(1500, 2638-1500))

vp_case1 <- function(gene_signature, file_name, test_sign, y_max){
  plot_case1 <- function(signature){
    VlnPlot(pbmc, features = signature,
            pt.size = 0.1, 
           group.by = "Response", 
            y.max = y_max, # add the y-axis maximum value - otherwise p-value hidden
    ) + stat_compare_means(comparisons = test_sign, label = "p.signif")
  }
  purrr::map(gene_signature, plot_case1) %>% cowplot::plot_grid(plotlist = .)
  file_name <- paste0(file_name, "_r.png")
  ggsave(file_name, width = 14, height = 8)
}

gene_sig <- c("NKG7", "PF4")
comparisons <- list(c("CTRL", "TRT"))
vp_case1(gene_signature = gene_sig, file_name = "Desktop/gene_sig", test_sign = comparisons, y_max = 7)

Result:

enter image description here

I hope this answers your question,

António

ADD COMMENTlink written 6 months ago by antonioggsousa2.0k

Code updated below to give you an adjusted y-axis scale with p-value:

## Follow the Seurat tutorial until UMAP at: https://satijalab.org/seurat/v3.2/pbmc3k_tutorial.html

## create fake 'Reponse' variable
pbmc@meta.data$Response <- rep(c("CTRL", "TRT"), times = c(1500, 2638-1500))

vp_case1 <- function(gene_signature, file_name, test_sign){
  plot_case1 <- function(signature, y_max = NULL){
    VlnPlot(pbmc, features = signature,
            pt.size = 0.1, 
            group.by = "Response", 
            y.max = y_max # add the y-axis maximum value - otherwise p-value hidden
    ) + stat_compare_means(comparisons = test_sign, label = "p.signif")
  }
  plot_list <- list()
  y_max_list <- list()
  for (gene in gene_signature) {
    plot_list[[gene]] <- plot_case1(gene)
    y_max_list[[gene]] <- max(plot_list[[gene]]$data[[gene]]) # get the max no. for each gene
    plot_list[[gene]] <- plot_case1(gene, y_max = (y_max_list[[gene]] + 1) )
  }
  cowplot::plot_grid(plotlist = plot_list)
  file_name <- paste0(file_name, "_r.png")
  ggsave(file_name, width = 14, height = 8)
}

gene_sig <- c("MS4A1", "GNLY", "CD3E", "CD14", "FCER1A", "FCGR3A", "LYZ", "PPBP", "CD8A")
comparisons <- list(c("CTRL", "TRT"))
vp_case1(gene_signature = gene_sig, file_name = "Desktop/gene_sig_2", test_sign = comparisons)

Result:

enter image description here

Therefore, you don't need to worry with the y_axis scale.

António

ADD REPLYlink written 6 months ago by antonioggsousa2.0k

Hi Antonio,

Thanks so much, these graphs are exactly what I needed! May I ask if we might be able to show the same information on a split violin plot for each gene for responder vs non-responders for a particular cluster?

Thank you!

ADD REPLYlink modified 6 months ago • written 6 months ago by ww22runner0

Hi ww22runner,

Not sure. Did you try?

Let me know if it worked.

António

ADD REPLYlink written 6 months ago by antonioggsousa2.0k
Please log in to add an answer.

Help
Access

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 2113 users visited in the last hour
_