How to get PCA3 and PCA4 from Deseq2?
5
0
Entering edit mode
5.7 years ago
mkh ▴ 60

Hi All,

I am plotting PCA via pcaplot in Deseq2. I am interested to look at other pca plots like pca3 vs. pca4 as well. How could I have this information in Deseq2?

Thanks,

RNA-seq • 9.5k views
ADD COMMENT
4
Entering edit mode
5.7 years ago
h.mon 35k

Straight from plotPCA code:

dds <- rlog( dds, blind = T )
rv <- rowVars(assay(dds))
# select the ntop genes by variance
select <- order(rv, decreasing=TRUE)[seq_len(min(ntop, length(rv)))]
# perform a PCA on the data in assay(x) for the selected genes
pca <- prcomp(t(assay(dds)[select,]))

Now you can access more components:

PC1=pca$x[,1]
PC2=pca$x[,2]
PC3=pca$x[,3]

See the pcaPlot source for more ideas.

ADD COMMENT
1
Entering edit mode
5.7 years ago

Here is my version of plotPCA

## Modified plotPCA from DESeq2 package. Shows the Names of the Samples (the first col of SampleTable), and uses ggrepel pkg to plot them conveniently.
# @SA 10.02.2017 
library(genefilter)
library(ggplot2)
library(ggrepel)

plotPCA.san <- function (object, intgroup = "condition", ntop = 500, returnData = FALSE) 
{
  rv <- rowVars(assay(object))
  select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, 
                                                     length(rv)))]
  pca <- prcomp(t(assay(object)[select, ]))
  percentVar <- pca$sdev^2/sum(pca$sdev^2)
  if (!all(intgroup %in% names(colData(object)))) {
    stop("the argument 'intgroup' should specify columns of colData(dds)")
  }
  intgroup.df <- as.data.frame(colData(object)[, intgroup, drop = FALSE])
  group <- if (length(intgroup) > 1) {
    factor(apply(intgroup.df, 1, paste, collapse = " : "))
  }
  else {
    colData(object)[[intgroup]]
  }

  ## Select the PCAs and percentVar that you like instead of 1 and 2
  d <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group, 
                  intgroup.df, name = colData(rld)[,1])
  if (returnData) {
    attr(d, "percentVar") <- percentVar[1:2]
    return(d)
  }
    ggplot(data = d, aes_string(x = "PC1", y = "PC2", color = "group", label = "name")) + geom_point(size = 3) + xlab(paste0("PC1: ", round(percentVar[1] * 100), "% variance")) + ylab(paste0("PC2: ", round(percentVar[2] * 100), "% variance")) + coord_fixed() + geom_text_repel(size=3) 

}

Select the PCs and percentVars you like by changing the argument below and also later in the ggplot (x="PC3" etc..)

  ## Select the PCAs and percentVar that you like instead of 1 and 2
  d <- data.frame(PC1 = pca$x[, 1], PC2 = pca$x[, 2], group = group, 
                  intgroup.df, name = colData(rld)[,1])
  if (returnData) {
    attr(d, "percentVar") <- percentVar[1:2]
    return(d)
  }
ADD COMMENT
1
Entering edit mode
5.7 years ago

Yet more possibilities via base R functions: A: PCA plot from read count matrix from RNA-Seq

DESeq2's PCA functionality automatically filters out a bunch of your transcripts based on low variance (biased / supervised). The code to which I have linked you does not (unbiased / unsupervised).

Kevin

ADD COMMENT
1
Entering edit mode
5.7 years ago

May be you can store plotPCA output (with option returnData=TRUE) in an object and plot PCs independently with qplot.

ADD COMMENT
0
Entering edit mode

plotPCA( ..., returnData=TRUE) returns just the first two PCs, hence the need to use prcomp - as is done internally by plotPCA - or some other function to perform the PCA.

ADD REPLY
0
Entering edit mode

you are correct @ h.mon. It stores only pc1 and 2 (https://rdrr.io/bioc/DESeq2/src/R/plots.R) or getMethod("plotPCA","DESeqTransform")

ADD REPLY
0
Entering edit mode
5.7 years ago
mkh ▴ 60

Thanks, everyone. For those who are interested in plotting the PCA component 3, 4 and ... the following lines could help too.

library("DESeq2")
library(ggfortify)

project.pca <- prcomp(t(assay(dds)[select,]))

pdf("PCA34.pdf")

autoplot(prcomp(t(assay(dds)[select,])), x=3, y=4, data=sampleTable ,colour = 'genotype')

dev.off()

pdf("PCA56.pdf")

autoplot(prcomp(t(assay(dds)[select,])), x=5, y=6, data=sampleTable ,colour = 'genotype')

dev.off()

you can also add shape to the plot.

autoplot(prcomp(t(assay(dds)[select,])), data=sampleTable ,colour = 'genotype', shape='diet', size=3)

You can look at here for further discussion.

Have fun!

ADD COMMENT
0
Entering edit mode

Since there are multiple answers posted for your original question you should take the time to test them and accept all answers that are correct. You are able to accept more than one answer.
Upvote|Bookmark|Accept

ADD REPLY

Login before adding your answer.

Traffic: 2758 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6