DESeq2 MA plot label subset of genes
17 months ago
kstangline ▴ 50

Hello,

I'm new to R and I'm trying to make an MA plot from my DESeq2 results using ggplot2.

I have figured out how to make an MA plot using the following code:

plot_poly <-
  all_counts.poly.results %>%
  as.data.frame() %>%
  ggplot(aes(log2(baseMean), log2FoldChange)) +
  geom_point(aes(color = pvalue < 0.05), cex = 0.1) +
  labs(title = "Poly Torin Treated vs Untreated")


all_counts.poly.results contains the EnsGeneIDs I'm interested in labeling on the graph, but there are too many to label them all, so I want to filter them against an Excel file with specific EnsGeneIDs.

For example, I was thinking about setting it up like this with dplyr, but I'm not sure if this is correct.

# contains EnsGeneIDs I want to be plotted

filtered_all_counts.poly.results <- all_counts.poly.results %>%
  filter(all_counts.poly.results$EnsGeneIDs %in% EnsGeneIDs) # filter only specific EnsGeneIDs

Then use these filtered EnsGeneIDs as labels on the MA plot I made above?

RNA-Seq deseq R

Have you tried running that and it didn't work? I can't tell what exactly you're asking. Do you want to only plot those genes, or do you want to plot all but only label those ones?

Side note: you can just put the bare column name in the filter call without the df$, although I would recommend changing the name of the EnsGeneIDs object so it differs from the column name. Also, if EnsGeneIDs reads in as a dataframe (rather than a vector) you may need to say %in% EnsGeneIDs$V1 (if that's the column's name) or convert it to a vector.

Thanks for the reply! I want to plot all the genes, and I only want to label a few (about 10 genes out of the thousands that ggplot2 plots), which is why I wanted to filter my all_counts.poly.results dataframe with the Excel file I read in.

17 months ago
loughrae ▴ 90

To label specific points, you can add a new column to all_counts.poly.results where you test whether the gene is in the list: if so, the value is the gene name (or whatever you want the label to be), and if not, it's empty:

[convert poly to df and IDs to vector if needed]

all_counts.poly.results$mark <- ifelse(all_counts.poly.results$EnsGeneIDs %in% IDs, all_counts.poly.results$EnsGeneIDs, '')
ggplot(all_counts.poly.results, aes(...)) + geom_point(...) + geom_text(aes(label = mark))


You could also try putting NA instead of '' in the ifelse().
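For anyone who wants to try the label column on its own before plotting, here is a minimal sketch on toy data. The data frame `res_df`, the ID table `ids_df`, and its column `V1` are made up for illustration; only the column name EnsGeneIDs and the vector name IDs come from the thread. It also shows the data-frame-versus-vector point from the comment above: pulling the ID column out as a character vector before using %in%.

```r
# Toy stand-in for the DESeq2 results, already converted to a data frame.
# Values are invented; only the EnsGeneIDs column name follows the thread.
res_df <- data.frame(
  EnsGeneIDs     = c("ENSG01", "ENSG02", "ENSG03", "ENSG04"),
  baseMean       = c(10, 200, 3000, 45),
  log2FoldChange = c(-1.2, 0.3, 2.1, -0.1),
  stringsAsFactors = FALSE
)

# A spreadsheet usually arrives as a data frame, so extract the ID column
# as a plain character vector first (V1 is an assumed column name).
ids_df <- data.frame(V1 = c("ENSG02", "ENSG03"), stringsAsFactors = FALSE)
IDs <- ids_df$V1

# Label column: the gene ID for genes of interest, "" otherwise.
res_df$mark <- ifelse(res_df$EnsGeneIDs %in% IDs, res_df$EnsGeneIDs, "")

# Same idea with NA instead of "" for the unlabeled genes.
res_df$mark_na <- ifelse(res_df$EnsGeneIDs %in% IDs, res_df$EnsGeneIDs, NA_character_)

print(res_df[, c("EnsGeneIDs", "mark", "mark_na")])
```

Either column can then be passed as label = mark in the plotting call. With NA labels, ggplot2 will usually warn about removed rows unless na.rm = TRUE is set in the text layer.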

9 months ago

Just to add a little more depth to the answer by @loughrae.

You can also use case_when from dplyr to create the new variable in a more readable fashion:

library(dplyr)
# assign the result back so the new mark column is kept
all_counts.poly.results <- all_counts.poly.results %>%
  mutate(mark = case_when(
    EnsGeneIDs %in% IDs ~ as.character(EnsGeneIDs),
    TRUE ~ NA_character_))


For plotting, we can use the ggrepel package so that the labels do not overlap and the plot is easier to read.

library(ggplot2)
library(ggrepel)
ggplot(all_counts.poly.results, aes(..., label = mark)) +
  geom_point(...) +
  geom_label_repel(box.padding = 0.5, max.overlaps = Inf)

You can play a bit with the options of ggrepel until you get what you want.
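Putting the pieces from this thread together, a runnable sketch might look like the following. The data here are simulated, and the names `res_df`, `ma_plot`, and the three IDs are hypothetical; in practice you would start from as.data.frame() of your own results and the IDs read from your Excel file. It assumes a ggrepel version recent enough to have the max.overlaps argument.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

set.seed(1)  # simulated data only, not real DESeq2 output
res_df <- data.frame(
  EnsGeneIDs     = sprintf("ENSG%05d", 1:500),
  baseMean       = rexp(500, rate = 1 / 200) + 1,
  log2FoldChange = rnorm(500),
  pvalue         = runif(500)
)
IDs <- c("ENSG00010", "ENSG00250", "ENSG00499")  # hypothetical genes of interest

# Label only the genes of interest; every other gene gets NA.
res_df <- res_df %>%
  mutate(mark = case_when(
    EnsGeneIDs %in% IDs ~ EnsGeneIDs,
    TRUE ~ NA_character_
  ))

# All genes are drawn as points; only rows with a non-NA mark get a label.
ma_plot <- ggplot(res_df, aes(log2(baseMean), log2FoldChange, label = mark)) +
  geom_point(aes(color = pvalue < 0.05), size = 0.5) +
  geom_label_repel(box.padding = 0.5, max.overlaps = Inf, na.rm = TRUE) +
  labs(title = "Poly Torin Treated vs Untreated")

print(ma_plot)
```

Setting na.rm = TRUE in the label layer is only there to silence the warning about rows with missing labels; the plotted points are unaffected.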