Question: Is it possible to make a PCA plot for samples using TPM expression values in R?
rimgubaev80 wrote:

I have a table with TPM expression values for several samples (10-12), and I want to create a PCA plot to estimate the similarity of replicates within each condition. If this is possible, could you please suggest some pipelines or commands for R?

rna-seq tpm pca • 420 views
modified 6 months ago by andrew.j.skelton735.5k • written 6 months ago by rimgubaev80

I think you can do it like this:

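library(scater)

# raw_counts is a placeholder name: your genes x samples matrix of raw values
example_sce <- SingleCellExperiment(
    assays = list(counts = raw_counts))

cpm(example_sce) <- calculateCPM(example_sce)

# Log-normalise the counts; newer scater releases use logNormCounts()
# in place of normalize(), and expect runPCA() before plotPCA()
example_sce <- normalize(example_sce)

plotPCA(example_sce)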

The scater vignette shows many more options:

https://bioconductor.org/packages/devel/bioc/vignettes/scater/inst/doc/vignette-dataviz.html#generating-pca-plots

written 6 months ago by jivarajivaraj40
andrew.j.skelton735.5k wrote:

You should transform your data to a log-like scale. If you're analysing in DESeq2, look at the vst or rlog methods; alternatively, if you're using limma voom, your data should be good to go. Have a look at the tximport package if you're confused about these different input metrics.
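
For a table that only holds TPM values, as in the question, one common way to get onto a log-like scale is log2(TPM + 1). The following is a minimal sketch rather than a definitive recipe: the toy matrix, the object names (tpm, tpm.expressed, log.tpm), the file name in the comment and the pseudocount of 1 are all assumptions made for illustration.

```r
# Hypothetical toy TPM matrix (genes x samples); swap in your own table, e.g.
# tpm <- as.matrix(read.delim("tpm_table.tsv", row.names = 1))  # assumed file name
set.seed(1)
tpm <- matrix(rexp(200 * 6, rate = 0.01), ncol = 6,
              dimnames = list(paste0("gene", 1:200), paste0("SAM", 1:6)))

# Drop genes with no signal in any sample
tpm.expressed <- tpm[rowSums(tpm) > 0, , drop = FALSE]

# Log-like scale; the pseudocount of 1 (an assumption) keeps zeros finite
log.tpm <- log2(tpm.expressed + 1)
```

A matrix like log.tpm (genes in rows, samples in columns) could then stand in for the dummy matrix foo in the PCA code that follows.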

When you've got your data in the correct scale, here's a nice bit of code to produce a PCA - note I'm using dummy data in this case.

library(tidyverse) #CRAN - install.packages("tidyverse")
library(ggrepel)   #CRAN - install.packages("ggrepel")

# Generate some fake data
set.seed(73)
mat.row      <- 1000
mat.col      <- 15
data.pheno   <- data.frame(SampleID   = paste0("SAM", 1:mat.col),
                           SampleType = rep(c("A","B","C"), times = mat.col / 3),
                           stringsAsFactors = FALSE)
foo          <- rnorm(mat.row * mat.col, mean = 300) %>% 
                log2 %>% 
                matrix(., ncol = mat.col) %>% 
                `colnames<-`(data.pheno$SampleID)

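# Generate PCA data & proportion of variance explained by each component
pca          <- foo %>% t %>% prcomp   # prcomp expects samples in rows
d            <- pca$x %>% as.data.frame %>%
                rownames_to_column("SampleID") %>%   # replaces deprecated add_rownames()
                left_join(data.pheno, by = "SampleID")
pcv          <- round((pca$sdev)^2 / sum(pca$sdev^2)*100, 2)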

# Make a pretty Picture
plot.pca    <- ggplot(d, aes(PC1,PC2,colour = SampleType)) +
               geom_point() +
               xlab(label=paste0("PC1 (", pcv[1], "%)")) +
               ylab(label=paste0("PC2 (", pcv[2], "%)")) +
               theme_bw() +
               geom_label_repel(aes(label = SampleType), show.legend = FALSE) +
               theme(axis.title.x = element_text(size=15),
                     axis.title.y = element_text(size=15)) +
               labs(title    = "My Fake PCA",
                    subtitle = "With some random data",
                    caption  = "Coloured by my random phenotype")
print(plot.pca)

PCA Plot

modified 6 months ago • written 6 months ago by andrew.j.skelton735.5k

Very nice!

written 6 months ago by Kevin Blighe39k