Question: How can I reproduce such a plot?
10 months ago, A3.7k wrote:

Hi,

I have a list of differentially expressed genes (DEGs) from single-cell RNA-seq between two clusters of cells. I also have a list of differentially expressed proteins (DEPs) from proteomics. I want to classify the DEGs, the DEPs, and their overlap into functional groups, something like the picture below, but I don't know how. I know how to classify them individually, but I need one picture that shows them all together: for example, GO terms for the DEGs, GO terms for the DEPs, and GO terms for their overlap.

[image]

Any idea?

ADD COMMENTlink modified 10 months ago by SMK1.9k • written 10 months ago by A3.7k

Divide it into its components:

  1. stacked bar-plot, rotated horizontally
  2. Venn diagram
ADD REPLYlink written 10 months ago by Kevin Blighe56k

Thank you. Supposing there are 100 DEGs, 200 DEPs and 70 in the overlap, they will be classified into many different terms, so how do I select which terms to plot?

ADD REPLYlink written 10 months ago by A3.7k

I'm not sure what the message behind that plot is. What do you want to show?

ADD REPLYlink written 10 months ago by WouterDeCoster43k

The relationship between the transcriptome and proteome data

ADD REPLYlink written 10 months ago by A3.7k
10 months ago, SMK1.9k wrote:

Hi F,

Both parts can be reproduced using ggplot2 and VennDiagram::draw.pairwise.venn:

library(tidyverse)
library(VennDiagram)
library(GO.db)

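# Grab some example data from E. coli (UniProt, organism 83333)
# (legacy UniProt query URL; current UniProt serves this via rest.uniprot.org)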
gene2go <- read_tsv("https://www.uniprot.org/uniprot/?query=organism:83333&format=tab&columns=id,go-id")
colnames(gene2go) <- c("Gene", "GO")
DECs <- gene2go[sample(nrow(gene2go), 500),]
DEPs <- gene2go[sample(nrow(gene2go), 500),]

# Calculate sets
sets <- calculate.overlap(x = list("DECs" = DECs$Gene,
                                   "DEPs" = DEPs$Gene))
Overlap <- sets$a3
DECs_only <- setdiff(sets$a1, Overlap)
DEPs_only <- setdiff(sets$a2, Overlap)
df_sets <- rbind(
  data.frame(Type = rep("Overlap", length(Overlap)), Gene = Overlap),
  data.frame(Type = rep("DECs_only", length(DECs_only)), Gene = DECs_only),
  data.frame(Type = rep("DEPs_only", length(DEPs_only)), Gene = DEPs_only)
)

# Combine with GO data and flatten GO
df_sets_go <- left_join(df_sets, gene2go, by = "Gene") %>% separate_rows(., "GO", sep = "; ")
df_sets_go$Description <- Term(df_sets_go$GO)
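# Relabel the groups for the legend (explicit levels/labels so names cannot be mismatched)
df_sets_go$Type <- factor(df_sets_go$Type,
                          levels = c("DECs_only", "DEPs_only", "Overlap"),
                          labels = c("DECs", "DEPs", "Overlap"))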

# Only look at top 20 GO terms
GO_top20 <- names(tail(sort(table(df_sets_go$GO)), 20))

# Barplot
ggplot(filter(df_sets_go, GO %in% GO_top20), aes(str_to_sentence(Description))) +
  geom_bar(color = "black", aes(fill = Type)) +
  coord_flip() +
  theme_bw() +
  scale_fill_manual(values = c(
    "DECs" = "black",
    "DEPs" = "white",
    "Overlap" = "grey"
  )) +
  scale_y_continuous(expand = c(0, 0)) +
  xlab("") +
  ylab("Number of DECs or DEPs") +
  theme(legend.position = "top",
        legend.title = element_blank())

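# Venn diagram for the whole sets (not only the genes in GO barplot)
# area1/area2 are the full set sizes, i.e. they include the overlap
grid::grid.newpage()
draw.pairwise.venn(
  area1 = length(DECs_only) + length(Overlap),
  area2 = length(DEPs_only) + length(Overlap),
  cross.area = length(Overlap),
  category = c("DECs", "DEPs")
)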

[image: barplot]

[image: Venn diagram]

Hope it helps.

ADD COMMENTlink modified 10 months ago • written 10 months ago by SMK1.9k

Thank you, this looks great, but how do I provide gene2go?

ADD REPLYlink written 10 months ago by A3.7k

Start from a table that looks like this:

> head(as.data.frame(DECs))
    Gene                                                                                             GO
1 P0A7S9                         GO:0000049; GO:0003735; GO:0005829; GO:0006412; GO:0019843; GO:0022627
2 P0AFW0 GO:0001000; GO:0001073; GO:0001124; GO:0003677; GO:0005829; GO:0008494; GO:0031564; GO:0045727
3 P76000                                                                                     GO:0019867
4 P0A953                                                 GO:0004315; GO:0005829; GO:0006633; GO:0008610
5 Q9JMT8                                                                         GO:0003677; GO:0006355
6 P0AE34                                     GO:0005886; GO:0005887; GO:0022857; GO:0055052; GO:0097638

Please step through the code and check what each data frame looks like.

ADD REPLYlink modified 10 months ago • written 10 months ago by SMK1.9k
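
If the annotation is available as one row per gene/term pair rather than one row per gene, it can be collapsed into the two-column layout shown above. The following is only a minimal base-R sketch: `annot_long` and `gene2go_toy` are hypothetical names, the gene IDs are made up, and the GO assignments are purely illustrative.

```r
# Hypothetical long-format annotation: one row per gene/GO-term pair
annot_long <- data.frame(
  Gene = c("geneA", "geneA", "geneB", "geneC", "geneC", "geneC"),
  GO   = c("GO:0003677", "GO:0006355", "GO:0005829",
           "GO:0003677", "GO:0005829", "GO:0006412"),
  stringsAsFactors = FALSE
)

# Collapse to one row per gene, with GO IDs joined by "; "
# (the separator that separate_rows() splits on in the answer's code)
gene2go_toy <- aggregate(GO ~ Gene, data = annot_long,
                         FUN = function(x) paste(unique(x), collapse = "; "))
print(gene2go_toy)
```

The resulting data frame has the same `Gene` and `GO` columns as `gene2go`, so it should be usable in the `left_join()` step in the same way.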

Sorry, I meant: which tool did you use to produce the source of gene2go? Which functional annotation tool?

Also, I am getting this error:

> sets <- calculate.overlap(x = list("DECs" = DECs$Gene,
+                                    "DEPs" = DEPs$Gene))
Error in calculate.overlap(x = list(DECs = DECs$Gene, DEPs = DEPs$Gene)) : 
  could not find function "calculate.overlap"

Which package gives this function?

ADD REPLYlink modified 10 months ago • written 10 months ago by A3.7k

You can use InterProScan.

ADD REPLYlink written 10 months ago by SMK1.9k

Did you install and load VennDiagram? https://www.rdocumentation.org/packages/VennDiagram/versions/1.6.20/topics/calculate.overlap

ADD REPLYlink modified 10 months ago • written 10 months ago by SMK1.9k
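
`calculate.overlap` is exported by VennDiagram, so a "could not find function" error usually means the package is not attached in the current session. As a cross-check that needs no extra packages, the same three groups can be derived with base R set functions. This is a sketch with made-up IDs; the variable names are illustrative and not part of the answer's code.

```r
# Toy ID vectors standing in for DECs$Gene and DEPs$Gene (made-up values)
decs_ids <- c("g1", "g2", "g3", "g4")
deps_ids <- c("g3", "g4", "g5")

overlap_ids   <- intersect(decs_ids, deps_ids)  # present in both lists
decs_only_ids <- setdiff(decs_ids, deps_ids)    # only in the first list
deps_only_ids <- setdiff(deps_ids, decs_ids)    # only in the second list

# Sizes of the three groups
sapply(list(DECs_only = decs_only_ids,
            DEPs_only = deps_only_ids,
            Overlap   = overlap_ids), length)
```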

Thank you so much. I don't have a list of protein sequences; rather, I have a list of protein IDs that I have converted to gene symbols. I also have a list of genes from single-cell RNA-seq. The goal is to see the relationship between the proteomics and single-cell RNA-seq data, for example how many GO terms or pathways are shared by both data sets.

ADD REPLYlink written 10 months ago by A3.7k
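
One way to quantify how many GO terms recur in both data sets is to compare the sets of terms annotated to each gene list. Below is a minimal base-R sketch, assuming both lists have already been mapped to GO IDs in the "; "-separated format used above. The data frames `rna_go` and `prot_go`, the helper `go_terms`, and all annotations are invented for illustration.

```r
# Invented annotations in the same "Gene / GO" layout as gene2go
rna_go <- data.frame(
  Gene = c("G1", "G2", "G3"),
  GO   = c("GO:0003677; GO:0006355", "GO:0005829", "GO:0006412; GO:0005829"),
  stringsAsFactors = FALSE
)
prot_go <- data.frame(
  Gene = c("G2", "G4"),
  GO   = c("GO:0005829", "GO:0006412; GO:0016020"),
  stringsAsFactors = FALSE
)

# Split the "; "-separated strings into a vector of unique GO IDs
go_terms <- function(df) unique(unlist(strsplit(df$GO, "; ", fixed = TRUE)))

rna_terms  <- go_terms(rna_go)
prot_terms <- go_terms(prot_go)

# GO terms annotated in both data sets
shared_terms <- intersect(rna_terms, prot_terms)

# Jaccard index: shared terms divided by all terms seen in either list
jaccard <- length(shared_terms) / length(union(rna_terms, prot_terms))

shared_terms
jaccard
```

Counting shared terms this way only measures annotation overlap; whether a term is enriched in either list is a separate question that would need an enrichment test.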