Question: How to adjust and align timepoints on the x-axis in ggplot2
0
10 months ago, mohammedtoufiq91 (110) wrote:

Hi,

I am working on a line plot using the ggplot2 library. I notice that the data points are not aligned correctly on the x-axis (different timepoints). Below are the code that I ran in R and an image of the line plot. As the image shows, the data point from T5 is aligned with T6, and T20 is not aligned either. Please let me know how to fix this issue.

Note: Some data points are indeed missing in the middle as they are not present in the dataframe.

str(B1_Patient_Module_ID_sorted)
'data.frame':   5016 obs. of  4 variables:
 $ Genes     : Factor w/ 264 levels "ABHD5","ACOT4",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ Timepoints: num  1 1 1 1 1 1 1 1 1 1 ...
 $ value     : num  -2.05 -8.36 -2.06 -3.84 -6.59 ...
 $ X20       : Factor w/ 66 levels "M10.1","M10.2",..: 53 59 53 53 44 6 29 12 29 19 ...
  ..- attr(*, "names")= chr  "ABHD5" "ACOT4" "ACTN4" "ACTR10" ...


pdf("B1_Module_v3.pdf", 7, 6)
for (i in seq(1, length(unique(B1_Patient_Module_ID_sorted$X20)), 1)) {
  print(ggplot(B1_Patient_Module_ID_sorted[B1_Patient_Module_ID_sorted$X20 %in% levels(B1_Patient_Module_ID_sorted$X20)[i:(i)], ], 
               aes(x =  Timepoints , y = value , group = Genes)) + 
          geom_point() + 
          geom_line(alpha = 1 , aes(col = Genes)) + 
          facet_wrap(~ X20) +
          scale_y_continuous(name = "-Delta Ct")+
          scale_x_discrete(name = "Timepoints", limits=c("1"= "T1", "2" = "T2",  "3" = "T3",  "5"= "T5", "6" = "T6", "7" = "T7", "8"= "T8", "9" = "T9",  "10" = "T10", "11"= "T11", "12" = "T12",  "13" = "T13", "14"= "T14", "15" = "T15", "16" = "T16", "17"= "T17", "18" = "T18", "19" = "T19", "20" = "T20"))+
          theme_classic()+
          theme(legend.position = "right") +
          theme(plot.title = element_text(lineheight=.8,size =14,face = "bold"),
                axis.text.x = element_text(colour="black",size=4.5,angle=0,hjust=0.5,vjust=0.5,face="plain"),
                axis.text.y = element_text(colour="black",size=4,angle=0,hjust=0,vjust=0.5,face="plain"),  
                axis.title.x = element_text(colour="black",size=15,angle=0,hjust=.5,vjust=0,face="plain"),
                axis.title.y = element_text(colour="black",size=14,angle=90,hjust=.5,vjust=.5,face="plain"),
                strip.background = element_blank(),
                legend.position = "right"))
}
dev.off()

Thank you,

Toufiq

ADD COMMENTlink modified 10 months ago by zx87549.7k • written 10 months ago by mohammedtoufiq91110
1

Hello mohammedtoufiq91!

It appears that your post has been cross-posted to another site: https://support.bioconductor.org/p/126702/

This is typically not recommended as it runs the risk of annoying people in both communities.

ADD REPLYlink written 10 months ago by ATpoint40k

@ ATpoint,

Apologies for the confusion. This will not be repeated going forward.

ADD REPLYlink written 10 months ago by mohammedtoufiq91110

Can you post a snippet of the data used to generate that graph? For such an issue a reproducible example is required to properly troubleshoot.

ADD REPLYlink written 10 months ago by Mark800
3
10 months ago, zx8754 (9.7k, London) wrote:

We need to build a separate data set for the dashed lines and then add it to the plot; see the example below, based on your data. I subsetted it to one facet for simplicity.

library(ggplot2)
library(dplyr)

# example data for one facet
Test_v1_ID$Timepoints <- factor(as.numeric(Test_v1_ID$Timepoints), levels = 1:20, labels = paste0("T", 1:20))
Test_v1_ID$value <- as.numeric(Test_v1_ID$value)
d <- Test_v1_ID[ Test_v1_ID$X12 == "M10.1", ]

# data for dashed lines
dash <- d %>% 
  arrange(Genes, Timepoints) %>% 
  group_by(Genes) %>% 
  mutate(x1 = if_else( is.na(value), lag(Timepoints), factor(NA)),
         x2 = if_else( is.na(value), lead(Timepoints), factor(NA)),
         y1 = if_else( is.na(value), lag(value), NA_real_),
         y2 = if_else( is.na(value), lead(value), NA_real_)) %>% 
  filter(!is.na(y1) & !is.na(y2))

# plot as before, and add dashed lines as segments
ggplot(d, aes(x = Timepoints, y = value, group = Genes, col = Genes)) + 
  geom_point() + 
  geom_line() +
  geom_segment(aes(x = x1, xend = x2, 
                   y = y1, yend = y2 ), data = dash, linetype = "dashed")
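
The dashed-line data above holds one row per gap: the last observed point before a missing value and the first observed point after it. For readers without dplyr, the same idea can be sketched in base R. This is only an illustration under assumptions: `make_dash` is a hypothetical helper name, the toy values are made up, and the column names follow the example data in this thread.

```r
# Hypothetical helper: for every NA that sits between two observed values of the
# same gene, return the endpoints of the segment that bridges the gap.
make_dash <- function(d) {
  d <- d[order(d$Genes, d$Timepoints), ]
  pieces <- lapply(split(d, d$Genes, drop = TRUE), function(g) {
    n <- nrow(g)
    gap <- which(is.na(g$value))
    gap <- gap[gap > 1 & gap < n]  # a gap needs a neighbour on both sides
    gap <- gap[!is.na(g$value[gap - 1]) & !is.na(g$value[gap + 1])]
    data.frame(Genes = g$Genes[gap],
               x1 = g$Timepoints[gap - 1], x2 = g$Timepoints[gap + 1],
               y1 = g$value[gap - 1], y2 = g$value[gap + 1])
  })
  do.call(rbind, pieces)
}

# Toy data (made-up values): Gene_A has a bridgeable gap at T10, Gene_B does not
toy <- data.frame(
  Genes = rep(c("Gene_A", "Gene_B"), each = 3),
  Timepoints = factor(rep(c("T9", "T10", "T11"), times = 2),
                      levels = paste0("T", 1:20)),
  value = c(-0.90, NA, -1.73, NA, NA, -2.00)
)

dash_toy <- make_dash(toy)
print(dash_toy)
```

The result can be passed to geom_segment() via the data argument in the same way as the dplyr version.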
ADD COMMENTlink modified 10 months ago • written 10 months ago by zx87549.7k

Hi @zx8754,

I tried a test run with your R code to understand it further, but I keep getting syntax errors. Am I missing something here?

dash <- d %>% 
  arrange(Genes, Timepoints) %>% 
  group_by(Genes) %>% 
  mutate(x1 = if_elseis.na(value), lag(Timepoints), factor(NA)),
x2 = if_elseis.na(value), lead(Timepoints), factor(NA)),
y1 = if_elseis.na(value), lag(value), NA_real_),
y2 = if_elseis.na(value), lead(value), NA_real_)) %>% 
  filter(!is.na(y1) & !is.na(y2))


dash <- d %>% 
+   arrange(Genes, Timepoints) %>% 
+   group_by(Genes) %>% 
+   mutate(x1 = if_elseis.na(value), lag(Timepoints), factor(NA)),
Error: unexpected ',' in:
"  group_by(Genes) %>% 
  mutate(x1 = if_elseis.na(value), lag(Timepoints), factor(NA)),"
> x2 = if_elseis.na(value), lead(Timepoints), factor(NA)),
Error: unexpected ',' in "x2 = if_elseis.na(value),"
> y1 = if_elseis.na(value), lag(value), NA_real_),
Error: unexpected ',' in "y1 = if_elseis.na(value),"
> y2 = if_elseis.na(value), lead(value), NA_real_)) %>% 
Error: unexpected ',' in "y2 = if_elseis.na(value),"
>   filter(!is.na(y1) & !is.na(y2))
Error in filter(!is.na(y1) & !is.na(y2)) : object 'y1' not found
> 




str(d)
'data.frame':   44 obs. of  4 variables:
 $ Genes     : Factor w/ 8 levels "Gene_A","Gene_B",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ Timepoints: Factor w/ 20 levels "T1","T2","T3",..: 1 2 8 9 10 11 12 16 17 18 ...
 $ value     : num  -1.556 -3.085 -0.721 -0.901 NA ...
 $ X12       : Factor w/ 2 levels "M10.1","M10.2": 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "names")= chr  "Gene_A" "Gene_A" "Gene_A" "Gene_A" ...
ADD REPLYlink modified 10 months ago • written 10 months ago by mohammedtoufiq91110
1

Try again.

(The Biostars website sometimes doesn't render code properly and some parentheses disappear. It should be OK now.)

ADD REPLYlink written 10 months ago by zx87549.7k

@ zx8754 ,

Excellent. Looks great! Thank you.

In my case, do I have to specify each ID manually, as below? I have about 100 of them.

d <- Test_v1_ID[ Test_v1_ID$X12 == "M10.1", "M10.2", "M10.3", "M10.4", "M10.9", ......... ]

Or is there an easier method?

ADD REPLYlink modified 10 months ago • written 10 months ago by mohammedtoufiq91110
1

We will need to use loops.

  • Please avoid asking new questions in the comments, if it is a new question then post a question.
  • Try searching the webs for simple R problems.
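
As a rough sketch of the loop idea, the data can be split by module ID so that no ID has to be typed by hand, and each piece is then plotted on its own page. The toy data frame, its values, and the output file name below are assumptions made for illustration; only the column names follow the example data in this thread.

```r
# Toy stand-in for Test_v1_ID (made-up values)
toy <- data.frame(
  Genes = rep(c("Gene_A", "Gene_B"), each = 3),
  Timepoints = factor(rep(c("T1", "T2", "T3"), times = 2),
                      levels = paste0("T", 1:3)),
  value = c(-1.6, -3.1, -0.7, -2.1, -0.8, -1.1),
  X12 = rep(c("M10.1", "M10.2"), each = 3)
)

# One data frame per module ID
by_module <- split(toy, toy$X12, drop = TRUE)
print(names(by_module))

# One PDF page per module (skipped if ggplot2 is not installed)
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  out_file <- file.path(tempdir(), "modules_sketch.pdf")  # hypothetical file name
  pdf(out_file, 7, 6)
  for (id in names(by_module)) {
    print(ggplot(by_module[[id]],
                 aes(x = Timepoints, y = value, group = Genes, col = Genes)) +
            geom_point() +
            geom_line() +
            ggtitle(id))
  }
  dev.off()
}
```

The dashed-line data would be computed inside the loop from each subset before plotting.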
ADD REPLYlink modified 10 months ago • written 10 months ago by zx87549.7k

@ zx8754 ,

Thank you very much for the assistance.

ADD REPLYlink written 10 months ago by mohammedtoufiq91110
1
10 months ago, zx8754 (9.7k, London) wrote:

Please provide example data.

Just guessing: try converting your timepoints into a factor, something like:

B1_Patient_Module_ID_sorted$Timepoints <- factor(B1_Patient_Module_ID_sorted$Timepoints,
                                                 levels = 1:20,
                                                 labels = paste0("T", 1:20))

Then we do not need to define the scale_x_discrete(...).
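
A likely explanation of the original misalignment, offered as a reading of the question's code rather than a confirmed diagnosis: Timepoints was numeric, while the limits passed to scale_x_discrete() listed 19 labels with T4 left out, so the fifth label (T6) sat at position 5, which is where the numeric value 5 is drawn. The small sketch below, with made-up values, shows that after conversion to a factor each observation carries its own label.

```r
# Toy data (made-up values): timepoint 4 is absent, as in the question
df <- data.frame(Timepoints = c(1, 2, 3, 5, 6, 20),
                 value = c(-2.1, -1.8, -2.4, -1.9, -2.2, -1.5))

# Declare all 20 levels so that label and position are tied together
df$Timepoints <- factor(df$Timepoints, levels = 1:20,
                        labels = paste0("T", 1:20))

# Each observation now carries its own label; 5 is "T5", not the fifth tick
print(as.character(df$Timepoints))
print(as.integer(df$Timepoints))

# Optional plot, skipped if ggplot2 is not installed
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(x = Timepoints, y = value, group = 1)) +
    geom_point() +
    geom_line() +
    scale_x_discrete(drop = FALSE)  # keep empty slots for unobserved timepoints
}
```

By default ggplot2 drops unused factor levels from a discrete axis; drop = FALSE is only needed if the unobserved timepoints should keep an empty slot.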

ADD COMMENTlink modified 10 months ago • written 10 months ago by zx87549.7k

@ zx8754,

Excellent. This fixed the issue.

Another question: as the figure shows, there is no data point at timepoint T10 because the values for that timepoint are missing. Is there a way to represent such gaps with a dotted line, since there are many such line plots, or should they be removed for a cleaner representation? For instance, in ComplexHeatmap or pheatmap we can set na_col = "grey" for missing values.

ADD REPLYlink written 10 months ago by mohammedtoufiq91110
2

Yes, it is possible.

Please provide example data.

ADD REPLYlink written 10 months ago by zx87549.7k

@ zx8754 ,

thank you. Here is the data.

dput(Test_v1_ID)
structure(list(Genes = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 
3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 4L, 
4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 5L, 
6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 6L, 7L, 7L, 7L, 7L, 7L, 
7L, 7L, 7L, 7L, 7L, 7L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 8L, 
8L), .Label = c("Gene_A", "Gene_B", "Gene_C", "Gene_D", "Gene_D.1", 
"Gene_E", "Gene_F", "Gene_G"), class = "factor"), Timepoints = c("1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19", "1", 
"2", "8", "9", "10", "11", "12", "16", "17", "18", "19"), value = c("-1.55598", 
"-3.08452", "-0.720558", "-0.901471", NA, "-1.73362", "-1.27953", 
"0.147734", "-0.31916", "-0.48834", "-1.70071", NA, NA, NA, NA, 
NA, NA, NA, NA, "-17.0921", NA, NA, "-1.34066", "-3.05158", "-0.359577", 
"-0.921044", NA, "-1.71127", "-0.954832", "-0.44804", "-0.58607", 
"0.151555", "-0.656842", "-4.6299", "-5.97264", "-4.11533", "-4.24868", 
NA, "-4.26154", "-3.52369", "-2.58611", "-2.98512", "-2.37213", 
"-3.57149", "-2.05066", "-0.657222", "-1.40576", "-2.29293", 
"-0.509917", "-1.68802", NA, "-1.85783", "-1.9242", NA, "-2.33469", 
"-8.35787", "-9.52402", "-9.55285", "-9.5344", "-9.23144", "-9.94065", 
NA, "-8.96788", "-9.01785", "-9.17554", "-9.90749", "-2.06287", 
"-0.846725", "-1.08125", "-1.7152", "-2.01096", "-2.07493", NA, 
"-1.41699", "-1.471", NA, "-1.67149", "-3.83545", "-1.19723", 
"-1.78817", "-1.78302", NA, "-1.11688", NA, "-1.88749", "-2.20363", 
NA, "-1.79198"), X12 = structure(c(Gene_A = 1L, Gene_A = 1L, 
Gene_A = 1L, Gene_A = 1L, Gene_A = 1L, Gene_A = 1L, Gene_A = 1L, 
Gene_A = 1L, Gene_A = 1L, Gene_A = 1L, Gene_A = 1L, Gene_B = 1L, 
Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, 
Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, Gene_B = 1L, 
Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, 
Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, Gene_C = 1L, 
Gene_C = 1L, Gene_D = 1L, Gene_D = 1L, Gene_D = 1L, Gene_D = 1L, 
Gene_D = 1L, Gene_D = 1L, Gene_D = 1L, Gene_D = 1L, Gene_D = 1L, 
Gene_D = 1L, Gene_D = 1L, Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_D.1 = 2L, 
Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_D.1 = 2L, 
Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_D.1 = 2L, Gene_E = 2L, Gene_E = 2L, 
Gene_E = 2L, Gene_E = 2L, Gene_E = 2L, Gene_E = 2L, Gene_E = 2L, 
Gene_E = 2L, Gene_E = 2L, Gene_E = 2L, Gene_E = 2L, Gene_F = 2L, 
Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, 
Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, Gene_F = 2L, 
Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, 
Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, Gene_G = 2L, 
Gene_G = 2L), .Label = c("M10.1", "M10.2"), class = "factor")), row.names = c(NA, 
-88L), class = "data.frame")
ADD REPLYlink modified 10 months ago • written 10 months ago by mohammedtoufiq91110