Loop through columns to generate PCA from DESeq2 data
2.1 years ago

I'd like to generate a PCA of my bulk RNAseq data, coloured by each of my variables in the DESeq2 object "vsd". My current code looks like this (to generate a single plot):

pcaData <- plotPCA(vsd, intgroup=c("Age", "BlastRate"), returnData=TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))
ggplot(pcaData, aes(PC1, PC2, color=Age, shape=BlastRate)) +
  geom_point(size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) +
  geom_text(aes(label=name),hjust=-.2, vjust=0) +
  ggtitle("Principal Component Analysis")

Can anyone suggest a method to loop through and swap "Age" with the other variable columns of vsd?

>head(colData(vsd),1)
DataFrame with 1 row and 14 columns
        LibSize LibDiversity PercMapped      Age SpermStatus   SpConc    SpMot Subject.Group PairedSample FertRate
       <factor>     <factor>   <factor> <factor> <character> <factor> <factor>      <factor>     <factor> <factor>
sRNA_1      Low         High       High    42-46         unk      unk      unk     Male-Male            2      Low
       BlastRate RNABatch LibPrepBatch sizeFactor
        <factor> <factor>     <factor>  <numeric>
sRNA_1       Low        1            6   0.929408
pca ggplot2 deseq2 loop • 1.0k views
2.1 years ago
Basti ★ 2.0k

If I understand correctly, you might try:

pcaData <- plotPCA(vsd, intgroup=c("Age", "BlastRate"), returnData=TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))

for (k in colnames(pcaData)){
ggplot(pcaData, aes_string("PC1", "PC2", color=k, shape=BlastRate)) +
  geom_point(size=3) +
  xlab(paste0("PC1: ",percentVar[1],"% variance")) +
  ylab(paste0("PC2: ",percentVar[2],"% variance")) +
  geom_text(aes(label=name),hjust=-.2, vjust=0) +
  ggtitle("Principal Component Analysis")
}

You can replace colnames(pcaData) with a vector of the variables you want to test.


I tried your suggestion and got the error below. I'm not sure why it appears when running your loop but not when I run my original code above. Note that I'm working in RStudio, so each of these code blocks is a chunk.

Error in aes_string("PC1", "PC2", color = k, shape = BlastRate) : 
object 'BlastRate' not found

Note that intgroup in the first line needs to contain every column of colData that you might want to plot. So you may as well put them all in; there's no reason not to.
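As a rough sketch of that advice, assuming DESeq2 is installed: the simulated data set from makeExampleDESeqDataSet() stands in for the poster's vsd, and the batch column is made up purely for illustration. sizeFactor is left out here only because it is a numeric bookkeeping column rather than a sample annotation.

```r
library(DESeq2)

# Simulated stand-in for the poster's object (12 samples, 'condition' column)
dds <- makeExampleDESeqDataSet(betaSD = 1)
dds$batch <- factor(rep(c("a", "b"), times = 6))  # hypothetical extra variable
vsd <- vst(dds, nsub = 500)

# Pass every annotation column to intgroup so each one is carried into pcaData
vars <- setdiff(colnames(colData(vsd)), "sizeFactor")
pcaData <- plotPCA(vsd, intgroup = vars, returnData = TRUE)
percentVar <- round(100 * attr(pcaData, "percentVar"))

head(pcaData)
```

With the real object, vars would pick up Age, BlastRate, FertRate and the rest, so any of them can be used for colouring afterwards.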

Also note the difference between aes() and aes_string(): the former takes bare column names, the latter takes quoted strings.
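To make that difference concrete, here is a small sketch on a made-up data frame (toy and its values are illustrative only, not the poster's data). It also shows the .data pronoun, which newer ggplot2 releases recommend in place of the deprecated aes_string().

```r
library(ggplot2)

# Made-up stand-in for pcaData
toy <- data.frame(
  PC1 = c(-2, 1, 3),
  PC2 = c(0.5, -1, 2),
  Age = factor(c("42-46", "30-34", "42-46"))
)

# aes() takes bare column names
p_bare <- ggplot(toy, aes(PC1, PC2, color = Age)) + geom_point()

# aes_string() takes strings, so a variable holding a column name works
# (deprecated in recent ggplot2, where it emits a warning)
k <- "Age"
p_string <- ggplot(toy, aes_string("PC1", "PC2", color = k)) + geom_point()

# Same idea with aes() and the .data pronoun
p_data <- ggplot(toy, aes(PC1, PC2, color = .data[[k]])) + geom_point()
```

Mixing the two styles, as in aes_string(..., shape = BlastRate), makes R look for an object called BlastRate in the workspace, which is where the "object 'BlastRate' not found" error comes from.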


Yes, sorry: BlastRate should be quoted ("BlastRate"), as swbarnes2 noticed.
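Putting the corrections together, one possible version is sketched below. It is not the only way to do it. The pcaData and percentVar values are made-up placeholders so the snippet runs on its own, and plot_pca_by is a hypothetical helper name. Two things differ from the loop above: the colour variable is looked up with .data[[k]], and each plot is explicitly printed, since ggplot objects are not auto-printed inside a for loop.

```r
library(ggplot2)

# Made-up stand-in for plotPCA(vsd, intgroup = ..., returnData = TRUE)
pcaData <- data.frame(
  PC1 = c(-4, -1, 2, 5),
  PC2 = c(1, -2, 3, -1),
  Age = factor(c("42-46", "42-46", "30-34", "30-34")),
  BlastRate = factor(c("Low", "High", "Low", "High")),
  name = paste0("sRNA_", 1:4)
)
percentVar <- c(60, 20)  # placeholder percentages

# Hypothetical helper: one PCA plot coloured by the column named in k
plot_pca_by <- function(k) {
  ggplot(pcaData, aes(PC1, PC2, color = .data[[k]], shape = BlastRate)) +
    geom_point(size = 3) +
    geom_text(aes(label = name), hjust = -0.2, vjust = 0) +
    labs(
      x = paste0("PC1: ", percentVar[1], "% variance"),
      y = paste0("PC2: ", percentVar[2], "% variance"),
      color = k,
      title = "Principal Component Analysis"
    )
}

# Columns to colour by; extend with any other intgroup columns
vars <- c("Age", "BlastRate")
plots <- setNames(lapply(vars, plot_pca_by), vars)

# Plots must be printed explicitly when produced in a loop
for (p in plots) print(p)
```

Building the plots through a function rather than directly in the for loop keeps each plot tied to its own value of k, so plots stored in the list still show the right variable when drawn later.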
