Generate a bar plot in R
5.4 years ago
libya.tahani ▴ 20

Hi, how can I generate a bar plot in R like the profile graph in GEO2R, showing a gene's expression values across samples (comparing two cell lines)?

My code, for example (I plan to use this data, which I loaded from GEO DataSets):

library(Biobase)
library(GEOquery)
library(limma)

# load series and platform data from GEO

gset <- getGEO("GSE7700", GSEMatrix =TRUE, AnnotGPL=TRUE)
if (length(gset) > 1) idx <- grep("GPL570", attr(gset, "names")) else idx <- 1
gset <- gset[[idx]]

# make proper column names to match toptable 
fvarLabels(gset) <- make.names(fvarLabels(gset))

# group names for all samples
gsms <- "1111000000"
sml <- strsplit(gsms, "")[[1]]   # one group label per sample

# log2 transform
ex <- exprs(gset)
qx <- as.numeric(quantile(ex, c(0., 0.25, 0.5, 0.75, 0.99, 1.0), na.rm=TRUE))
LogC <- (qx[5] > 100) ||
          (qx[6]-qx[1] > 50 && qx[2] > 0) ||
          (qx[2] > 0 && qx[2] < 1 && qx[4] > 1 && qx[4] < 2)
if (LogC) { ex[which(ex <= 0)] <- NaN
  exprs(gset) <- log2(ex) }

# set up the data and proceed with analysis
sml <- paste("G", sml, sep="")    # set group names
fl <- as.factor(sml)
gset$description <- fl
design <- model.matrix(~ description + 0, gset)
colnames(design) <- levels(fl)
fit <- lmFit(gset, design)
cont.matrix <- makeContrasts(G1-G0, levels=design)
fit2 <- contrasts.fit(fit, cont.matrix)
fit2 <- eBayes(fit2, 0.01)
tT <- topTable(fit2, adjust="fdr", sort.by="B", number=250)

tT <- subset(tT, select=c("ID","adj.P.Val","P.Value","t","B","logFC","Gene.symbol","Gene.title"))
write.table(tT, file=stdout(), row.names=F, sep="\t")

I want to generate a bar plot for each individual gene, like the image I attached here (Profile Graph).

R gene • 3.2k views

Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.

5.4 years ago

Starting with a data frame called d.

Make a column called sample that assigns each row/sample to a sample name GSM1234, etc.

Make a column called treatment, which is a categorical factor that assigns each row's sample to normal or cancer.

Make a column called expression, which contains the expression value (log-transformed, etc.) for the row/sample (or for its gene, if that's the correct interpretation of what you're plotting).
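
As a rough sketch of the three columns described above, the helper below builds `d` for a single probe from a probes-by-samples matrix such as `exprs(gset)`. The function name `make_plot_df`, the `G0`/`G1` labels, and the toy matrix are assumptions made for illustration; they are not part of GEO2R or of the question's data.

```r
# Hypothetical helper: build the long-format data frame `d` for one probe.
# `ex` is a probes x samples matrix (e.g. exprs(gset)); `gsms` is the group
# string from the question (e.g. "1111000000"), one character per sample.
make_plot_df <- function(ex, probe_id, gsms,
                         labels = c("0" = "G0", "1" = "G1")) {
  stopifnot(probe_id %in% rownames(ex), nchar(gsms) == ncol(ex))
  grp <- strsplit(gsms, "")[[1]]
  data.frame(
    # keep the samples in their original column order on the x axis
    sample     = factor(colnames(ex), levels = colnames(ex)),
    treatment  = factor(unname(labels[grp]), levels = unname(labels)),
    expression = as.numeric(ex[probe_id, ])
  )
}

# Toy matrix standing in for exprs(gset); the values are made up.
toy <- matrix(c(5.1, 5.3, 7.9, 8.2,
                2.0, 2.1, 2.2, 1.9),
              nrow = 2, byrow = TRUE,
              dimnames = list(c("probe_A", "probe_B"),
                              c("GSM1", "GSM2", "GSM3", "GSM4")))
d <- make_plot_df(toy, "probe_A", "1100")
```

With the real data, the call would look something like `make_plot_df(exprs(gset), "<probe ID from tT$ID>", gsms)`, repeated once per gene of interest.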

#!/usr/bin/env Rscript

library(ggplot2)

# set up `d` per your code
# add `sample`, `treatment`, and `expression` columns per this answer

out_fn <- 'figure.pdf'

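pdf(out_fn, onefile=FALSE, width=7, height=5)
p <- ggplot(d, aes(group=sample)) + 
  geom_bar(aes(x=sample, y=expression, fill=treatment), width=0.75, position="dodge2", stat="identity") + 
  # free x scales so each panel only shows the samples in its own treatment group
  facet_grid(~treatment, scales="free_x", space="free_x") +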
  scale_colour_brewer(palette="Set1") + 
  scale_fill_brewer(palette="Set1") +
  ggtitle('Gene expression vs Sample') +
  xlab('Samples') +
  ylab('Expression') +
  theme(plot.title = element_text(size = 12, hjust = 0.5)) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
print(p)
dev.off()

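If you're using Rscript, you're better off using ggsave(out_fn, plot=p, device='pdf') than pdf(); print(p); dev.off(); IMO. Note that ggsave() takes the file name as its first argument, so the plot has to be passed as plot=p.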
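
A minimal sketch of the `ggsave()` route, assuming ggplot2 is installed. The toy data frame and the temporary output path are placeholders for illustration, not the poster's data.

```r
library(ggplot2)

# Toy data frame with the three columns the answer describes; values are made up.
d <- data.frame(
  sample     = c("GSM1", "GSM2", "GSM3", "GSM4"),
  treatment  = factor(c("normal", "normal", "cancer", "cancer")),
  expression = c(5.1, 5.3, 7.9, 8.2)
)

p <- ggplot(d, aes(x = sample, y = expression, fill = treatment)) +
  geom_col()

# ggsave() takes the file name first; the plot goes in the `plot` argument.
# It opens and closes the graphics device itself, so no dev.off() is needed.
out_fn <- file.path(tempdir(), "figure.pdf")
ggsave(out_fn, plot = p, device = "pdf", width = 7, height = 5)
```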


I'm editing the answer to change gene to sample (or cell line), which on a second read seems more appropriate for answering the question.
