Dear All,
I would like to identify differentially expressed genes between two conditions within one cluster, but I am not sure whether I have assigned the identities correctly. To compare second arthritis against first arthritis, should I use
ident.1 = "First arthritis_Activated CD4",
ident.2 = "Second arthritis_Activated CD4"
or
ident.1 = "Second arthritis_Activated CD4",
ident.2 = "First arthritis_Activated CD4"?
Kindly help review my code below.
# Identify differentially expressed genes across conditions
library(Seurat)
# Store the current cluster labels, then build a combined "condition_celltype" label
all.annotated.subsets$celltypes <- Idents(all.annotated.subsets)
all.annotated.subsets$arthritis_celltype <- paste(all.annotated.subsets$type, Idents(all.annotated.subsets), sep = "_")
Idents(all.annotated.subsets) <- "arthritis_celltype"
# Calculate differentially expressed genes between first and second arthritis
prepsct <- PrepSCTFindMarkers(all.annotated.subsets)
DEG.activatedCD4 <- FindMarkers(prepsct, assay = "SCT", ident.1 = "First arthritis_Activated CD4",
ident.2 = "Second arthritis_Activated CD4", min.pct = 0.25, logfc.threshold = 0.25, verbose = FALSE)
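For reference on the direction question: the Seurat documentation for FindMarkers states that a positive avg_log2FC means the gene is more highly expressed in the first group (ident.1). The snippet below is only a simplified sketch of that sign convention in base R, using made-up mean expression values; it is not how Seurat computes the statistic internally.

```r
# Hypothetical mean expression of one gene in each group (made-up numbers).
mean_ident1 <- 4.0  # group passed as ident.1
mean_ident2 <- 1.0  # group passed as ident.2

# Log2 fold change of ident.1 relative to ident.2:
# positive when the gene is higher in ident.1.
log2fc <- log2(mean_ident1 / mean_ident2)

# Swapping the two groups only flips the sign.
log2fc_swapped <- log2(mean_ident2 / mean_ident1)

print(c(log2fc, log2fc_swapped))
```

Under this convention, a "second vs. first" comparison in which positive values mean "higher in second arthritis" would put the second arthritis label in ident.1.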
Next, I draw the volcano plot using the following code:
# Order genes by adjusted p-value
volcano <- DEG.activatedCD4[order(DEG.activatedCD4$p_val_adj), ]
# create data.frame with a significance flag
library(dplyr)
results <- as.data.frame(mutate(as.data.frame(volcano),
cut_off = ifelse(volcano$p_val_adj < 0.05, "significant", "insignificant")),
row.names = rownames(volcano))
## ggplot plus ggrepel
library(ggplot2)
library(ggrepel)
options(ggrepel.max.overlaps = Inf)
# Set the legend order on the data frame that is actually plotted
results$cut_off <- factor(results$cut_off, levels = c("significant", "insignificant"))
# plot
ggplot(results, aes(avg_log2FC, -log10(p_val_adj))) +
theme_minimal() +
geom_point(aes(col = cut_off)) +
# Named values keep the colour mapping independent of the level order
scale_color_manual(values = c(insignificant = "black", significant = "red")) +
theme(plot.title = element_text(hjust = 0.5, size = 16, face = "bold"),
axis.title = element_text(face = "bold", size = 14),
axis.text = element_text(face = "bold", size = 10),
legend.title = element_text(size = 11, face = "bold"),
legend.text = element_text(size = 9)) +
# Label the 30 genes with the smallest adjusted p-values
geom_text_repel(data = results[1:30, ],
aes(label = rownames(results[1:30, ]))) +
ggtitle("Second vs. first arthritis (Activated CD4)")
I got a strange volcano plot: most of the genes cluster at the bottom (please see the attached figure; I circled the region in blue). Is there a way to improve it? Thank you for taking the time to review.
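One possible reason for points piling up at the bottom, which cannot be confirmed without the figure: FindMarkers reports p_val_adj as a Bonferroni-corrected p-value, and a Bonferroni-corrected value is capped at 1, so every gene that reaches the cap has -log10(p_val_adj) = 0. The sketch below illustrates this with made-up raw p-values and a hypothetical background of 20,000 genes, using only base R.

```r
# Made-up raw p-values for five genes (illustrative only).
p_val <- c(1e-12, 1e-7, 1e-4, 0.01, 0.5)

# Hypothetical number of genes used for the correction.
n_genes <- 20000

# Bonferroni correction multiplies each p-value by n_genes and caps it at 1.
p_val_adj <- p.adjust(p_val, method = "bonferroni", n = n_genes)

# y-axis value used in the volcano plot.
y <- -log10(p_val_adj)

print(data.frame(p_val, p_val_adj, y))
```

In this toy example the last three genes all land at y = 0. If that is what is happening in the real data, one option that is sometimes used is to plot -log10(p_val) on the y-axis while still colouring points by p_val_adj; whether that suits this analysis is a judgment call.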
Kind Regards,