Question: Plotting the expression values for two genes in a range of cells
Za130 wrote:

Hi,

How can I make a plot like the picture below for my data?

The y axis shows read counts, ranging from 0 to 10000, and the x axis shows the number of cells in that range (I think the cells have been ordered by decreasing expression of one gene compared with the other). For example, about 10 cells express this gene at 10000 read counts. There are two coloured bars, one colour per gene.

``````
> data
      cells  gene1   gene2
1     cell1   1040    138
2     cell2   1378   1444
3     cell3     49     49
4     cell4   1660    502
5     cell5   1920     57
6     cell6     85     52
7     cell7    230    212
8     cell8   5567   2147
9     cell9    124    305
10   cell10    117    167
11   cell11     78    538
12   cell12   1240    298
13   cell13   2374   3656
14   cell14    489   1547
15   cell15   1042    752
16   cell16   4648    181
17   cell17   3109    513
18   cell18    354    645
19   cell19   1106    639
20   cell20   1260    692
21   cell21    727   1249
22   cell22   1510   3997
23   cell23      1   1159
24   cell24     43    147
25   cell25    226    356
26   cell26   3183   1089
27   cell27    397    286
28   cell28   1089    593
29   cell29    497   1055
30   cell30    531     18
31   cell31   1924   1088
32   cell32     58    232
``````
ggplot2 R • 1.5k views
modified 2.3 years ago • written 2.3 years ago by Za130

It is not clear what you would like to see in the plot. Do you want to plot the values as they are, or break the cells down into ranges and then plot those (you used "range" in the original post)? If ranges, how would you sum or average the expression of the genes? Please add more details to the post.

See if this is what you are looking for. Since no grouping information or per-group average/sum was provided, the individual cell values are plotted.

Thanks a lot. I would like to order my 209 cells by decreasing expression of each gene separately. I think you plotted the right thing, but please order the cells by one gene. I think a second plot, with the cells on the x axis ordered by the other gene, is needed as well.

You should paste your tabular data in a GitHub Gist and add the link here, or post your data in a code segment here. I've also changed your post so the image is displayed in your post.

egeulgen1000 wrote:

You can plot such a "lollipop plot" using ggplot2:

``````
# Expects a CSV with the columns Cell, Gene.A and Gene.B
cell_data <- read.csv("~/Downloads/data.csv")

library(reshape2)
library(ggplot2)

# Long format: one row per cell/gene pair, keeping Cell as the identifier
cell_data_molten <- melt(cell_data, id.vars = "Cell")

# Order the cells by their total expression across both genes
cell_data$Total <- cell_data$Gene.A + cell_data$Gene.B
for_levels <- cell_data$Cell[order(cell_data$Total)]

cell_data_molten$Cell <- factor(cell_data_molten$Cell, levels = for_levels)

# One vertical line plus one point per gene, dodged side by side within each cell
g <- ggplot(cell_data_molten, aes(Cell, value))
g <- g + geom_linerange(aes(x = Cell,
                            ymin = 0, ymax = value,
                            color = variable), position = position_dodge(width = 1))
g <- g + geom_point(aes(color = variable), position = position_dodge(width = 1))
g <- g + scale_y_continuous(expand = c(0, 0))
g <- g + theme_bw()
g <- g + theme(axis.text.x = element_text(angle = 90, hjust = 1))
g
``````

Produces the plot below:

Thank you. I am getting an error in two places:

``````
> cell_data_molten$Cell <- factor(cell_data_molten$Cell, levels = for_levels)
Error in `$<-.data.frame`(`*tmp*`, Cell, value = integer(0)) :
  replacement has 0 rows, data has 209

> g <- g + theme(axis.text.x = element_text(angle = 90, hjust = 1))
> g
>

Gene.A Gene.B
cell1   1040    138
cell2   1378   1444
cell3     49     49
cell4   1660    502
cell5   1920     57
cell6     85     52
>
``````

You need a column called `Cell` filled with the row names, i.e. `"cell1", "cell2", ...` should be in a column named `Cell`.
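A minimal sketch of that fix, assuming the cell names are currently stored as row names (as the printed data above suggests). When `Cell` does not exist, `cell_data_molten$Cell` is `NULL`, which is consistent with the "replacement has 0 rows" error. The three rows are taken from the question; the column names follow the answer's code.

```r
# Toy data frame whose cell names live in the row names (values from the question)
cell_data <- data.frame(
  Gene.A = c(1040, 1378, 49),
  Gene.B = c(138, 1444, 49),
  row.names = c("cell1", "cell2", "cell3")
)

# Copy the row names into a regular column named Cell
cell_data$Cell <- rownames(cell_data)

# The ordering step from the answer now has something to work with
cell_data$Total <- cell_data$Gene.A + cell_data$Gene.B
for_levels <- cell_data$Cell[order(cell_data$Total)]
```

Run this before the `melt()` call so that `Cell` is carried into the long-format data as well.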

Alex Reynolds31k wrote:

I can't see the image in your post. That said, perhaps look into a violin plot, where the y-axis is expression, and the x-axis consists of two genes: one "violin" for each gene. See: https://ggplot2.tidyverse.org/reference/geom_violin.html for a reference and example plots.

Sorry, here is the link to the image

I want the cells to be ordered by the expression of one gene, so that I have 10 cells on the x axis with 10000 read counts for one gene, and so on.


It is not immediately obvious how you want to break down the groups of cells.

For instance, your example Google Chart spreadsheet has 209 cells. You ask for a plot that shows expression levels for ten cells? How are you grouping levels of 209 cells into ten-cell groups? What function are you using to do that reduction?

I think if you provide a minimally-reproducible example, it would help others help provide you with code. Even a very-small chart and a sketch of your plot would help.

Thank you. I actually have expression values for 209 cells, but I pasted only about 30 of them here to show the structure of my data. The x axis would show the cells ordered by one gene, with the other gene alongside in a different colour. I think two plots may be needed, one for the decreasing expression of each gene.


``````
library(readxl)
library(dplyr)
library(tidyr)
library(ggplot2)

# df1 is assumed to already hold the data: cell names first, then one column per gene
names(df1) <- c("cells", "geneA", "geneB")

# Long format, sorted by gene and then expression; the factor levels fix the x-axis order
df2 <- df1 %>%
  gather("gene", "expression", -cells) %>%
  arrange(gene, expression) %>%
  mutate(cells = factor(cells, unique(cells)))
df2$gene <- as.factor(df2$gene)

ggplot(df2, aes(x = cells, y = expression)) +
  # size is a fixed value, so it is set outside aes()
  geom_point(aes(color = gene), size = 3, position = position_dodge(width = .6)) +
  geom_linerange(aes(ymin = 0, ymax = expression, color = gene),
                 position = position_dodge(.6)) +
  theme_bw() +
  theme(
    legend.position = "none",
    panel.grid.major = element_blank(),
    panel.grid.minor = element_blank(),
    axis.text.x = element_text(margin = margin(t = 20), size = 12, angle = 45),
    axis.text.y = element_text(size = 18),
    axis.title.y = element_text(size = 18),
    axis.title.x = element_text(size = 18)) +
  scale_color_manual(values = c("darkgreen", "red")) +
  ylab("Expression") +
  xlab("Cells")
``````
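For the "one plot per gene, cells in decreasing order" request, something along these lines might work. It is only a sketch: `cells_by_decreasing()` and `plot_ordered()` are hypothetical helper names, the five rows are copied from the question, and the plotting part is skipped if ggplot2 is not installed.

```r
# A few rows from the question; the full data would have 209 cells
df <- data.frame(
  cells = c("cell1", "cell2", "cell3", "cell4", "cell5"),
  gene1 = c(1040, 1378, 49, 1660, 1920),
  gene2 = c(138, 1444, 49, 502, 57),
  stringsAsFactors = FALSE
)

# Hypothetical helper: cell names sorted by decreasing expression of one gene
cells_by_decreasing <- function(data, gene) {
  data$cells[order(data[[gene]], decreasing = TRUE)]
}

order_gene1 <- cells_by_decreasing(df, "gene1")
order_gene2 <- cells_by_decreasing(df, "gene2")

# Long format: one row per cell/gene pair
long <- data.frame(
  cells = rep(df$cells, times = 2),
  gene = rep(c("gene1", "gene2"), each = nrow(df)),
  expression = c(df$gene1, df$gene2),
  stringsAsFactors = FALSE
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Hypothetical helper: same lollipop plot, x axis following the given cell order
  plot_ordered <- function(cell_order) {
    long$cells <- factor(long$cells, levels = cell_order)
    ggplot(long, aes(x = cells, y = expression, colour = gene)) +
      geom_linerange(aes(ymin = 0, ymax = expression),
                     position = position_dodge(width = 0.6)) +
      geom_point(position = position_dodge(width = 0.6)) +
      theme_bw()
  }
  p_gene1 <- plot_ordered(order_gene1)  # cells ordered by gene1
  p_gene2 <- plot_ordered(order_gene2)  # cells ordered by gene2
}
```

Printing `p_gene1` and `p_gene2` gives the two plots; both genes appear in each, only the x-axis order differs.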

It might be worth plotting log10(gene1/gene2) for each cell. That might show the differences more clearly.

That would be the log fold change (base 10).

Sorry, how could I show the expression of both genes side by side? I mean something like each gene on its own axis, because here the compressed expression of one gene makes it hard to compare their expression levels.
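A rough sketch of that per-cell ratio, using four rows from the question. The pseudocount of 1 is my assumption, not something suggested in the thread; it is only there so that a zero count in either gene does not produce an infinite value.

```r
# Four cells from the question
df <- data.frame(
  cells = c("cell1", "cell2", "cell3", "cell23"),
  gene1 = c(1040, 1378, 49, 1),
  gene2 = c(138, 1444, 49, 1159),
  stringsAsFactors = FALSE
)

# Assumed pseudocount to guard against zero counts
pseudocount <- 1

# Positive values: gene1 higher; negative values: gene2 higher
df$log10_ratio <- log10((df$gene1 + pseudocount) / (df$gene2 + pseudocount))

# Sort the cells from most gene1-dominated to most gene2-dominated
df <- df[order(df$log10_ratio, decreasing = TRUE), ]
```

The sorted `log10_ratio` column can then be plotted against `cells` (with the factor levels set to the sorted order) as a single series instead of two.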

breaks <- as.vector(c(1, 2, 5) %o% 10^(-1:1))

geom_point() + scale_y_log10(breaks = breaks)
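To put those two lines in context, here is a small self-contained sketch with a log10 y axis. The breaks are extended to `10^(0:4)` on the assumption that read counts run from 1 to a few tens of thousands, as in the question. Only points are drawn, since a line starting at `ymin = 0` cannot be placed on a log scale, and cells with a count of 0 would not be shown at a finite position either.

```r
# A few cells from the question, in long format (one row per cell/gene pair)
long <- data.frame(
  cells = rep(c("cell1", "cell5", "cell8", "cell23"), times = 2),
  gene = rep(c("gene1", "gene2"), each = 4),
  expression = c(1040, 1920, 5567, 1, 138, 57, 2147, 1159),
  stringsAsFactors = FALSE
)

# 1-2-5 breaks per decade, from 1 up to 50000 (assumed range for read counts)
breaks <- as.vector(c(1, 2, 5) %o% 10^(0:4))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(long, aes(x = cells, y = expression, colour = gene)) +
    geom_point(position = position_dodge(width = 0.6)) +
    scale_y_log10(breaks = breaks) +
    theme_bw()
}
```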

Yeah thank you very much

Could you please share the code you used to make this plot? If I have 200 cells on the x axis, I would like the x axis split into 20 parts of 10 cells each, as in the example image, instead of showing the names of the cells. For instance, cells 0-10 have read counts of 10000, cells 10-20 have read counts < 10000, and so on.
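One possible way to get groups of 10 cells, sketched with the gene1 counts of the first 20 cells from the question. Using the mean as the per-group summary is an assumption on my part, since the thread never settles which function to use.

```r
# gene1 counts for cell1..cell20 from the question
gene1 <- c(1040, 1378, 49, 1660, 1920, 85, 230, 5567, 124, 117,
           78, 1240, 2374, 489, 1042, 4648, 3109, 354, 1106, 1260)

bin_size <- 10

# Rank the cells by decreasing expression, then cut the ranking into bins of 10
ranked <- sort(gene1, decreasing = TRUE)
bin <- ceiling(seq_along(ranked) / bin_size)

# Labels such as "1-10", "11-20" for the x axis
bin_labels <- paste0((unique(bin) - 1) * bin_size + 1, "-", unique(bin) * bin_size)

# Assumed summary: mean expression per bin
bin_means <- tapply(ranked, bin, mean)

binned <- data.frame(
  rank_range = factor(bin_labels, levels = bin_labels),
  mean_expression = as.vector(bin_means)
)
```

`binned` then has one row per group of 10 cells and can be plotted with `rank_range` on the x axis; repeating the same steps for gene2 gives the second series.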


Updated the image with code.