You should transform your data to a log-like scale. If you're analysing in DESeq2, look at the `vst` or `rlog` methods. Alternatively, if you're using `voom` from `limma`, then your data should already be good to go. Have a look at the tximport package if you're confused about these different input metrics.
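If it helps to see what "log-like scale" means in practice, here is a minimal base-R sketch. It is not `vst` or `rlog`, just a cruder log2 counts-per-million stand-in with a pseudocount of 1. The count matrix is simulated, and the names `counts`, `cpm` and `log_cpm` are made up for the illustration.

```r
# Simulate a small raw count matrix (hypothetical data: 200 genes x 6 samples)
set.seed(1)
counts <- matrix(rnbinom(200 * 6, mu = 100, size = 1), nrow = 200,
                 dimnames = list(paste0("gene", 1:200), paste0("SAM", 1:6)))

# Scale each sample to counts per million so library size does not dominate
cpm <- t(t(counts) / colSums(counts)) * 1e6

# A pseudocount of 1 keeps zero counts finite after taking the log
log_cpm <- log2(cpm + 1)

# Raw counts are heavily right-skewed; the log scale compresses the range
range(counts)
range(log_cpm)
```

For a real analysis, the variance-stabilising methods mentioned above are usually the better choice, since a plain log with a pseudocount tends to exaggerate noise in low-count genes.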

Once your data is on the correct scale, here's a nice bit of code to produce a PCA plot. Note that I'm using dummy data in this case.

```
library(tidyverse) # CRAN - install.packages("tidyverse")
library(ggrepel)   # CRAN - install.packages("ggrepel")

# Generate some fake data
set.seed(73)
mat.row <- 1000
mat.col <- 15
data.pheno <- data.frame(SampleID = paste0("SAM", 1:mat.col),
                         SampleType = rep(c("A", "B", "C"), times = mat.col / 3),
                         stringsAsFactors = FALSE)
foo <- rnorm(mat.row * mat.col, mean = 300) %>%
  log2 %>%
  matrix(., ncol = mat.col) %>%
  `colnames<-`(data.pheno$SampleID)

# Generate PCA data & proportion of variability
# prcomp() expects samples in rows, hence the t()
pca <- foo %>% t %>% prcomp
d <- pca$x %>%
  as.data.frame %>%
  rownames_to_column("SampleID") %>%      # move row names into a SampleID column
  left_join(data.pheno, by = "SampleID")  # attach the phenotype data
pcv <- round(pca$sdev^2 / sum(pca$sdev^2) * 100, 2)

# Make a pretty picture
plot.pca <- ggplot(d, aes(PC1, PC2, colour = SampleType)) +
  geom_point() +
  xlab(label = paste0("PC1 (", pcv[1], "%)")) +
  ylab(label = paste0("PC2 (", pcv[2], "%)")) +
  theme_bw() +
  geom_label_repel(aes(label = SampleType), show.legend = FALSE) +
  theme(axis.title.x = element_text(size = 15),
        axis.title.y = element_text(size = 15)) +
  labs(title = "My Fake PCA",
       subtitle = "With some random data",
       caption = "Coloured by my random phenotype")
print(plot.pca)
```
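As a quick sanity check on the percentages in the axis labels, the manual `sdev^2 / sum(sdev^2)` calculation should agree with what `summary()` reports for a `prcomp` object. This is a small base-R sketch on the same kind of fake data; the names `prop_manual` and `prop_summary` are just for illustration.

```r
# Same style of fake data as above: 1000 features x 15 samples
set.seed(73)
foo <- matrix(log2(rnorm(1000 * 15, mean = 300)), ncol = 15)

# prcomp() expects samples in rows; it centres by default but does not scale
pca <- prcomp(t(foo))

# Manual proportion of variance explained by each component
prop_manual <- pca$sdev^2 / sum(pca$sdev^2)

# The same quantity as reported by summary()
prop_summary <- summary(pca)$importance["Proportion of Variance", ]

# Largest disagreement; summary() rounds its values, so expect only tiny differences
max(abs(prop_manual - prop_summary))
```

With purely random data like this, no component should stand out, so the variance ends up spread fairly evenly across the PCs. Real expression data will normally put much more of it into the first few.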

I think you can also do this with the scater package (`library(scater)`). Its data visualisation vignette covers generating PCA plots, along with many other things:

https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette-dataviz.html#generating-pca-plots
