R ggrepel for overlapping text in the whole plot
11 months ago
camillab. ▴ 160

Hi!

I am trying to make a scatterplot with ggplot2. I have split my points into 4 groups based on p-value and log2FC range. I used geom_text_repel to avoid overlapping text, and it works within each group, but labels from different groups still overlap.

Here is my code (apologies if your eyes are bleeding):

#scatterplot
library(dplyr)    # provides %>% and filter()
library(ggplot2)
library(ggrepel)

# significant p-value but smaller log2FC, highlighted separately
dfup <- text %>%  filter(log2FC >= 1.5 & p_value <= 0.05) #ok
pvalup<- text %>%  filter(p_value <= 0.05 &  log2FC > 0 & log2FC <= 1.4)
pvaldown<- text %>%  filter(p_value <= 0.05 & log2FC >= -1.4 & log2FC < 0)
dfdown <- text %>%  filter(log2FC <= -1.5 & p_value <= 0.05) #ok


ggplot(data = text, aes(x= log2FC, y= -1*log10(p_value))) + 
  geom_point() + 
  geom_point(data= pvaldown, aes(x= log2FC, y= -1*log10(p_value)), color = "dodger blue")+
  geom_point(data= pvalup, aes(x= log2FC, y= -1*log10(p_value)), color = "magenta")+
  geom_point(data= dfup, aes(x= log2FC, y= -1*log10(p_value)), color = "red")+
  geom_point(data= dfdown, aes(x= log2FC, y= -1*log10(p_value)), color = "blue")+
  geom_vline(xintercept = 1.5,linetype = "dashed", color = "dim gray")+ 
  geom_vline(xintercept = -1.5,linetype = "dashed", color = "dim gray")+
  geom_hline(yintercept = -1*log10(0.05),linetype = "dashed", color = "dim gray") +
  geom_hline(yintercept = -1*log10(0.01),linetype = "dotted", color = "gray")+ 
  # annotate() draws each threshold label once; geom_text() with constant aes() would redraw it for every row of the data
  annotate("text", x = -10, y = 2.0, label = "0.01", color = "gray", size = 2.5, vjust = -0.5)+
  geom_hline(yintercept = -1*log10(0.001),linetype = "dotted", color = "gray") +
  annotate("text", x = -10, y = 3.0, label = "0.001", color = "gray", size = 2.5, vjust = -0.5)+
  geom_text_repel(data=dfdown, aes(label=`Associated.Gene.Name`), color = "blue")+
  geom_text_repel(data=pvalup, aes(label=`Associated.Gene.Name`), color = "magenta")+
  geom_text_repel(data=pvaldown, aes(label=`Associated.Gene.Name`), color = "dodger blue")+
  geom_text_repel(data=dfup, aes(label=`Associated.Gene.Name`), color = "red")+
  theme_bw(base_size = 12)+ #size of the labelling +
  theme( title = element_text(hjust = 0.5), axis.title = element_text(color = "black"),panel.grid.major = element_blank(),
         panel.grid.minor = element_blank())


What can I do to stop the labels from different groups overlapping?

Thank you!

Camilla


If you are using multiple geom_text_repel layers in one plot, you can try setting a specific direction for each set of labels so they don't collide. Check this Stack Overflow post to see if it answers your question.
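
As a rough illustration of the per-layer idea in this comment, here is a minimal sketch. The data frame `toy` and its columns are made up for the example, and the choice of `direction`, `nudge_x` and `xlim` values is only an assumption about what might suit a volcano-style plot. Each layer still cannot see the other layer's labels; the constraints just keep the two sets in different regions.

```r
library(ggplot2)
library(ggrepel)

# Made-up data: ten points on the left of zero, ten on the right.
set.seed(42)
toy <- data.frame(
  x = c(runif(10, -4, -1.5), runif(10, 1.5, 4)),
  y = runif(20, 1.5, 4),
  label = paste0("gene", 1:20)
)
down <- toy[toy$x < 0, ]
up   <- toy[toy$x > 0, ]

p <- ggplot(toy, aes(x, y)) +
  geom_point() +
  # Left-hand labels: pushed further left, repelled vertically only,
  # and not allowed to cross x = -1.5.
  geom_text_repel(data = down, aes(label = label), colour = "blue",
                  direction = "y", nudge_x = -0.5, xlim = c(NA, -1.5)) +
  # Right-hand labels: mirrored settings.
  geom_text_repel(data = up, aes(label = label), colour = "red",
                  direction = "y", nudge_x = 0.5, xlim = c(1.5, NA))

print(p)
```

This only helps when the label sets occupy clearly separate parts of the plot; groups that sit close together can still collide.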


Could you share a small example of your dataset? The output of dput(head(your_df)) would make it easy to reproduce your issue.
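
For anyone unfamiliar with dput(), a small sketch of what it does, using a made-up stand-in for `your_df` (the gene names and p-values are taken from the table in this thread, the log2FC values are invented for illustration):

```r
# Hypothetical stand-in for the real data frame.
your_df <- data.frame(
  Associated.Gene.Name = c("TSPAN6", "DPM1", "SCYL3"),
  log2FC  = c(0.8, -1.7, 2.1),          # invented values
  p_value = c(0.455, 0.362, 0.00154),
  stringsAsFactors = FALSE
)

# dput() prints R code that rebuilds the object, so it can be pasted into a post.
dput(head(your_df))

# Round trip: evaluating the printed code gives the data frame back.
printed <- paste(capture.output(dput(head(your_df))), collapse = "\n")
rebuilt <- eval(parse(text = printed))
```

Pasting the dput() output lets others recreate the exact object, including column types, which a printed tibble does not.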


Sure. Rows are the genes and columns are the samples.

# A tibble: 6 × 12
  ID        Descr…¹ Assoc…² 2-3dp…³ `8dpg` `14dpg` `18dpg` hu1_u…⁴ hu2_u…⁵ hu3_u…⁶ p_value
  <chr>     <chr>   <chr>     <dbl>  <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
1 ENSG0000… tetras… TSPAN6  1.31e-5   6.94 4.39e-6   2.75    0.536    1.18   0.827 0.455  
2 ENSG0000… dolich… DPM1    1.97e+2  33.2  3.73e+1  29.6    24.1     22.9   30.0   0.362  
3 ENSG0000… SCY1-l… SCYL3   6.08e+0   4.86 5.73e+0   6.24    2.02     3.30   2.84  0.00154
4 ENSG0000… chromo… C1ORF1… 7.01e+0   2.35 2.99e+0   3.76    1.30     1.19   1.52  0.0801 
5 ENSG0000… feline… FGR     6.37e-1   2.81 1.17e+0   1.54    5.08     1.27   0.403 0.613  
6 ENSG0000… comple… CFH     5.13e-4   2.12 4.67e-2   0.753  17.8      2.45   3.82  0.139  
# … with 1 more variable: log2FC <dbl>, and abbreviated variable names ¹​Description,
#   ²​Associated.Gene.Name, ³​`2-3dpg`, ⁴​hu1_untreated, ⁵​hu2_untreated, ⁶​hu3_untreated
# ℹ Use `colnames()` to see all variable names
11 months ago
Trivas ★ 1.7k

I'd recommend combining all of your data into one data frame/tibble, then adding a column that denotes each row's category. You can then use color = category within the aes() call of geom_point. This will also fix your geom_text_repel issue, since a single call to that function will be able to see all of the text labels and not just the ones within each separate call.

For example:

# significant p-value but smaller log2FC, highlighted separately
dfup <- text %>%  filter(log2FC >= 1.5 & p_value <= 0.05) #ok
pvalup<- text %>%  filter(p_value <= 0.05 &  log2FC > 0 & log2FC <= 1.4)
pvaldown<- text %>%  filter(p_value <= 0.05 & log2FC >= -1.4 & log2FC < 0)
dfdown <- text %>%  filter(log2FC <= -1.5 & p_value <= 0.05) #ok

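could become:

text <- text %>% mutate(category = case_when(log2FC >= 1.5 & p_value <= 0.05 ~ "dfup",
                                             p_value <= 0.05 & log2FC > 0 & log2FC <= 1.4 ~ "pvalup",
                                             p_value <= 0.05 & log2FC >= -1.4 & log2FC < 0 ~ "pvaldown",
                                             log2FC <= -1.5 & p_value <= 0.05 ~ "dfdown"))

Rows that match none of the conditions get NA as their category.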
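
Put together, the approach in this answer might look roughly like the sketch below. It runs on simulated data (`toy`) rather than the poster's `text` tibble. The extra "other" level and the `category_colours` vector are assumptions added for the example. The thresholds and colours are copied from the question.

```r
library(dplyr)
library(ggplot2)
library(ggrepel)

# Simulated stand-in for the poster's data (values are made up).
set.seed(1)
toy <- tibble(
  Associated.Gene.Name = paste0("gene", 1:200),
  log2FC  = rnorm(200, sd = 2),
  p_value = runif(200)
)

# One category column replaces the four filtered data frames.
toy <- toy %>%
  mutate(category = case_when(
    p_value <= 0.05 & log2FC >= 1.5                 ~ "dfup",
    p_value <= 0.05 & log2FC > 0 & log2FC <= 1.4    ~ "pvalup",
    p_value <= 0.05 & log2FC >= -1.4 & log2FC < 0   ~ "pvaldown",
    p_value <= 0.05 & log2FC <= -1.5                ~ "dfdown",
    TRUE                                            ~ "other"   # assumed catch-all level
  ))

category_colours <- c(dfup = "red", pvalup = "magenta",
                      pvaldown = "dodgerblue", dfdown = "blue",
                      other = "black")

p <- ggplot(toy, aes(x = log2FC, y = -log10(p_value), colour = category)) +
  geom_point() +
  # A single repel layer sees every label, so labels from different
  # categories are repelled from each other as well.
  geom_text_repel(
    data = filter(toy, category != "other"),
    aes(label = Associated.Gene.Name),
    show.legend = FALSE
  ) +
  scale_colour_manual(values = category_colours) +
  theme_bw(base_size = 12)

print(p)
```

With many labels, ggrepel may still drop some of them; its `max.overlaps` argument controls how aggressively that happens.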
