Question: DEseq2 Shade PCA plot by gene counts
gravatar for paulranum11
5 months ago by
paulranum1160 wrote:

Hi All,

I am working with DEseq2 to do some RNA-Seq analysis and would like to find a solution for displaying the expression level of individual genes on the PCA plot. This functionality exists built into other packages like Seurat (using "FeaturePlot"). Does anyone know of a way to implement this with DEseq2's plotPCA.

Here is the simple command i am using to generate the PCA plot.

plotPCA(vsd, intgroup=c("Treatment"))

The following is a link to a DEseq2 walkthrough for reference:

heatmap deseq2 pca ggplot2 • 434 views
ADD COMMENTlink modified 5 months ago by Kevin Blighe56k • written 5 months ago by paulranum1160
gravatar for Kevin Blighe
5 months ago by
Kevin Blighe56k
Kevin Blighe56k wrote:


The DESeq2 PCA function will [I believe] convert a continuous variable to categorical, which is likely not what you want. This is fine, as DESeq2 is primarily about normalising RNA-seq data and conducting differential expression.

In any case, you can do this with my own package, PCAtools, which should fit nicely with your DESeq2 variance-stabilised expression values and metadata. Here is a reproducible example, taken from my vignette: PCAtools: everything Principal Component Analysis:

1, load breast cancer data from GEO


# load series and platform data from GEO
gset <- getGEO('GSE2990', GSEMatrix = TRUE, getGPL = FALSE)
x <- exprs(gset[[1]])

# remove Affymetrix control probes
x <- x[-grep('^AFFX', rownames(x)),]

# extract information of interest from the phenotype data (pdata)
idx <- which(colnames(pData(gset[[1]])) %in%
  c('age:ch1', 'distant rfs:ch1', 'er:ch1',
    'ggi:ch1', 'grade:ch1', 'size:ch1',
    'time rfs:ch1'))

metadata <- data.frame(pData(gset[[1]])[,idx],
  row.names = rownames(pData(gset[[1]])))

# tidy column names
colnames(metadata) <- c('Age', 'Distant.RFS', 'ER', 'GGI', 'Grade',
  'Size', 'Time.RFS')

# prepare certain phenotypes
metadata$Age <- as.numeric(gsub('^KJ', NA, metadata$Age))
metadata$Distant.RFS <- factor(metadata$Distant.RFS, levels=c(0,1))
metadata$ER <- factor(gsub('\\?', NA, metadata$ER), levels=c(0,1))
metadata$ER <- factor(ifelse(metadata$ER == 1, 'ER+', 'ER-'), levels = c('ER-', 'ER+'))
metadata$GGI <- as.numeric(metadata$GGI)
metadata$Grade <- factor(gsub('\\?', NA, metadata$Grade), levels=c(1,2,3))
metadata$Grade <- gsub(1, 'Grade 1', gsub(2, 'Grade 2', gsub(3, 'Grade 3', metadata$Grade)))
metadata$Grade <- factor(metadata$Grade, levels = c('Grade 1', 'Grade 2', 'Grade 3'))
metadata$Size <- as.numeric(metadata$Size)
metadata$Time.RFS <- as.numeric(gsub('^KJX|^KJ', NA, metadata$Time.RFS))

# remove samples from the pdata that have any NA value
discard <- apply(metadata, 1, function(x)
metadata <- metadata[!discard,]

# filter the expression data to match the samples in our pdata
x <- x[,which(colnames(x) %in% rownames(metadata))]

# check that sample names match exactly between pdata and expression data 
all(colnames(x) == rownames(metadata))

2, add ESR1 gene expression to the metadata

metadata$ESR1 <- x["205225_at",]

         Age Distant.RFS  ER       GGI   Grade Size Time.RFS      ESR1
GSM65752  40           0 ER-  2.480050 Grade 3  1.2     2280  5.893152
GSM65753  46           0 ER+ -0.633592 Grade 1  1.3     2675 11.419733
GSM65755  41           1 ER+  1.043950 Grade 3  3.3      182 10.922990
GSM65757  34           0 ER+  1.059190 Grade 2  1.6     3952  9.629512
GSM65758  46           1 ER+ -1.233060 Grade 2  2.1     1824 11.367321
GSM65760  57           1 ER+  0.679034 Grade 3  2.2      699 10.161146

3, run PCA

p <- pca(x, metadata = metadata, removeVar = 0.1)

Here, your x object would be assay(vsd)

4, generate bi-plot with continuous scale

    lab = NULL,
    colby = 'ESR1',
    shape = 'ER',
    hline = 0, vline = 0,
    legendPosition = 'right') +

scale_colour_gradient(low = "gold", high = "red2")


Good validation in its own right, as those with higher ESR1 gene expression are also ER-positive by IHC / histology...


ADD COMMENTlink modified 5 months ago • written 5 months ago by Kevin Blighe56k
Please log in to add an answer.


Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Powered by Biostar version 2.3.0
Traffic: 1465 users visited in the last hour