Set point colour based on threshold in scatter plot
3
1
Entering edit mode
3.9 years ago
fi1d18 ★ 4.1k

Hi,

I have a set of samples and log10 total read counts of them in a data frame like below

> head(a)
    log10_total_counts
A1            6.468503
A10           6.565213
A11           6.752139
A12           5.078598
A2            6.277342
A3            6.473411
>

enter image description here

For instance red dots have less than 1000000 counts and black dots have more thsn 1000000 counts.

How could I repreduce this plot please?

R plot • 9.2k views
ADD COMMENT
7
Entering edit mode
3.9 years ago
seidel 9.9k

Something like the following would work with base R functions:

# make some data
a <- data.frame(sample=paste0("A", 1:50), log10_total_counts=rnorm(50, 6))

# get x points
x <- 1:length(a$log10_total_counts)

# create plot, drop x axis
plot(x, a$log10_total_counts, ylim=c(0,8), pch=19, xaxt="n", xlab="sample")
# add dashed line and X-axis labels with rotation
abline(h=6, lty="dashed")
axis(1,1:50, labels=a$sample, las=3)

# select points to color differently
LessThan6 <- a$log10_total_counts < 6
points(x[LessThan6], a$log10_total_counts[LessThan6], pch=19, col="red")
ADD COMMENT
7
Entering edit mode
3.9 years ago

Use ggplot2.

Something is missing in your example; the labels at the bottom do not match you data frame, and it's unclear how your sample dataframe would count up to values near 1000000. You might want to precompute/add a QC vector based on that to your dataframe; basically have your dataframe in a state that's ready to be plotted with minimal calculations.

Here's an example; I generate a dataframe with normally distributed values centered on 6.0 (seemed close to what your example had). I plot a dotted line at 6; and anything below 6 becomes red.

Example ggplot call with geom_points and two colours.

For the style, I directly took the function theme_Publication() from this link.

Here's the code:

library("ggplot2")

# Themes for style
theme_Publication <- function(base_size=14, base_family="helvetica") { ... }
scale_fill_Publication <- function(...){ ... }
scale_colour_Publication <- function(...){ ... }

# Function to plot points as described
plotReads <- 
  function(DF) {
    ggplot(DF) + 
      theme_Publication() + # Style it
      geom_point(mapping = aes(x = as.factor(1:nrow(DF)), # Sequentially
                               y = log10_total_counts, # Y is the count values
                               color = QC)) + # Color it by the QC vector in the dataframe
      scale_color_manual(values = c("PASS" = "black", "FAIL" = "red")) + # Tell it what colours to use.
      geom_hline(yintercept = 6.0, linetype="dotted") + # Add the dotted line at 6.0
      xlab("Sample") + # Label the X
      ylab("log10(Total Reads)") + # Label the Y
      ylim(c(-0.2, 8.2)) # Extend the Y axis to be between this range.
  }

DF = data.frame( 
  log10_total_counts = rnorm(30, mean = 6, sd = 0.5), # Randomized data
  QC = as.factor(c( rep("FAIL",15), rep("PASS", 15) )) # A vector that is just half FAIL and half PASS
)
plotReads(DF)

DF$QC <- as.factor(ifelse(DF$log10_total_counts < 6.0, "FAIL", "PASS")) # Now actually add some logic to what should be PASS or FAIL.
plotReads(DF)
ADD COMMENT
1
Entering edit mode

None of the code uses anything from gridExtra - is it being loaded for code within the _Publication functions?

ADD REPLY
0
Entering edit mode

You're right, it's not needed here. Edited, and thanks!

ADD REPLY
0
Entering edit mode

Note: I cleaned up a discussion that sprang off an unwarranted comment of mine. Apologies!

ADD REPLY
4
Entering edit mode
3.9 years ago

another plot in ggplot2 with simulated data:

df=data.frame(genes=paste0("gene_", seq_along(1:75)), reads=round(rnorm(75, 110,2),0))
library(ggplot2)

ggplot(df,aes(genes, reads, color = ifelse( reads < 110, "Fail", "Pass")))+
    geom_point() +
  scale_y_continuous(limits = c(0, 200))+
  scale_color_manual(name="QC", values = c("red","darkgreen"))+
  geom_hline(yintercept = 110, color="red")+
  theme_bw()+
  xlab("Sample")+
  labs(y=expression("log"[10]("Total reads")))+
  theme(axis.text.x = element_text(angle = 45,hjust = 1),
        legend.position = "bottom")

Rplot01

ADD COMMENT
2
Entering edit mode

BIOSTARS I am so lucky that I can ask for your help and obtain such a nice solutions THANK YOU

ADD REPLY

Login before adding your answer.

Traffic: 2697 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6