Question: How to create a PCA Plot of Proteomics Data in R?
ishackm100 wrote:

Hi all,

I hope you're well.

I have the following dataset:

  QE1_Jo_Exp1_AOCS1_R QE1_Jo_Exp1_G33 QE1_Jo_Exp1_G33_R QE1_Jo_Exp1_G164
1           1027.9600       1434.3834         1774.4618         892.7630
2           1075.0975       1692.0633         1014.8056         537.9152
3           1031.2545       1377.9725         1181.1430        3983.6936
4           3257.5661       3433.5130         3644.4593         933.2016
5            535.0528        839.5253          523.3276        3708.1248
6           6259.4604      23886.0483         9353.2122       29776.4997

I used the following script to carry out PCA analysis:

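library(matrixStats)  # rowVars() comes from matrixStats (genefilter also provides one)

a <- myPr

# Per-protein variance across samples
rv <- rowVars(as.matrix(a))

# Keep the ntop most variable proteins (at most 12596)
ntop <- 12596
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# PCA with samples as rows
pca1 <- prcomp(t(a[select, ]))

# Sample scores on every component, without and with the sample names
scores <- data.frame(pca1$x[, 1:ncol(pca1$rotation)])
scores.df <- data.frame(colnames(a), pca1$x[, 1:ncol(pca1$rotation)])

pca1
summary(pca1)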

The result is:

> summary(pca1)
Importance of components:
                             PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8       PC9      PC10      PC11
Standard deviation     5.441e+05 1.097e+05 6.167e+04 1.720e+04 1.293e+04 9.306e+03 8.535e+03 7.128e+03 4.357e+03 3.526e+03 2.666e+03
Proportion of Variance 9.470e-01 3.853e-02 1.217e-02 9.500e-04 5.300e-04 2.800e-04 2.300e-04 1.600e-04 6.000e-05 4.000e-05 2.000e-05
Cumulative Proportion  9.470e-01 9.855e-01 9.977e-01 9.987e-01 9.992e-01 9.995e-01 9.997e-01 9.999e-01 9.999e-01 1.000e+00 1.000e+00
                            PC12      PC13      PC14      PC15     PC16      PC17      PC18      PC19      PC20      PC21      PC22
Standard deviation     2.349e+03 1.851e-06 2.318e-07 2.279e-07 2.25e-07 2.173e-07 2.151e-07 2.148e-07 2.081e-07 2.065e-07 2.002e-07
Proportion of Variance 2.000e-05 0.000e+00 0.000e+00 0.000e+00 0.00e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.00e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00
                            PC23      PC24
Standard deviation     1.942e-07 4.214e-11
Proportion of Variance 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00
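
For reference, the "Proportion of Variance" row in a summary like this is each squared standard deviation divided by their total. A minimal sketch on simulated data (the matrix `toy` and its values are made up for illustration and are not the data from this post):

```r
# Toy stand-in for an intensity matrix (rows = proteins, columns = samples);
# the values are simulated, not the poster's data.
set.seed(1)
toy <- matrix(rexp(200 * 6, rate = 1 / 1000), nrow = 200,
              dimnames = list(NULL, paste0("S", 1:6)))

pca_toy <- prcomp(t(toy))            # samples as rows, as in the script above

# The variance of each component is its squared standard deviation
var_explained <- pca_toy$sdev^2 / sum(pca_toy$sdev^2)
cum_explained <- cumsum(var_explained)

# These correspond to the "Proportion of Variance" and
# "Cumulative Proportion" rows printed by summary()
round(rbind(var_explained, cum_explained), 4)
```

A first component that carries almost all of the variance, as in the output above, is often a hint that a few very large values dominate, although that cannot be confirmed from the summary alone.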

But when I do:

biplot(pca1)

I get the following:

[Image: the current PCA plot]

[Image: the desired PCA plot]

This is my first time doing a PCA plot so any help will be greatly appreciated.

Many Thanks,

Ishack

pca R proteomics • 1.1k views
modified 18 months ago • written 18 months ago by ishackm100

Please see "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).

written 18 months ago by RamRS30k

Are your plots from the same dataset? How do you know that such a plot is possible with your dataset? They're both PC1 vs PC2 plots, maybe the nature of your data prevents the plot from being like #2?

written 18 months ago by RamRS30k

Hi RamRS,

Thank you for your quick response.

The second plot is from a different dataset, but I would like the plot for the first dataset to look similar to the second PCA plot, please.

written 18 months ago by ishackm100

Check @Kevin's PCAtools package: "PCA plot from read count matrix from RNA-Seq". While this refers to RNA-seq, the principle should be the same.

modified 18 months ago • written 18 months ago by genomax91k

Hi genomax, thanks for the link.

I have created the following plot from this code:

library(factoextra)

fviz_eig(pca1)

fviz_pca_ind(pca1,
             col.ind = "cos2", # Color by the quality of representation
             gradient.cols = c("#00AFBB", "#E7B800", "#FC4E07"),
             repel = TRUE      # Avoid text overlapping
             )

[Image: the new PCA plot]

Are Dim 1 and Dim 2 the same as PC1 and PC2?
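
As far as I understand, for a `prcomp` object factoextra plots the sample scores stored in `pca1$x` and only relabels the axes as "Dim". A small sketch to check this, using the built-in `USArrests` data purely as a stand-in (the names `pca_demo`, `pc_scores` and `dim_coords` are illustrative, and the comparison only runs if factoextra is installed):

```r
# Built-in data set, used only for illustration
pca_demo <- prcomp(USArrests, scale. = TRUE)

# Sample coordinates on the first two principal components
pc_scores <- pca_demo$x[, 1:2]

# If factoextra is available, compare with the coordinates it uses for Dim1/Dim2
if (requireNamespace("factoextra", quietly = TRUE)) {
  dim_coords <- factoextra::get_pca_ind(pca_demo)$coord[, 1:2]
  print(all.equal(unname(pc_scores), unname(dim_coords)))
}
```

If the comparison reports equality, the Dim 1 and Dim 2 axes are the PC1 and PC2 scores under a different label.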

modified 18 months ago • written 18 months ago by ishackm100

As genomax says, you can just use my code from my other thread: "PCA plot from read count matrix from RNA-Seq".

Also, PCAtools (https://bioconductor.org/packages/release/bioc/html/PCAtools.html) can be used; it was just released with Bioconductor 3.9.

written 18 months ago by Kevin Blighe66k
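
For readers who just want a per-sample PC1 vs PC2 scatter without extra packages, here is a rough base-R sketch. It is not the code from the linked thread. The data are simulated, the two groups "A" and "B" are hypothetical, and the log2 transform and scaling are my own assumptions about what is commonly done with intensity data, so adjust them to your experiment.

```r
# Simulated intensities: 300 proteins x 8 samples in two hypothetical groups
set.seed(42)
group <- rep(c("A", "B"), each = 4)
mat <- matrix(rlnorm(300 * 8, meanlog = 7, sdlog = 1), nrow = 300,
              dimnames = list(paste0("prot", 1:300), paste0("S", 1:8)))
mat[1:30, group == "B"] <- mat[1:30, group == "B"] * 4  # shift some proteins in group B

# Assumption: log2 transform and unit-variance scaling before PCA
log_mat <- log2(mat + 1)
pca <- prcomp(t(log_mat), center = TRUE, scale. = TRUE)

# Percentage of variance per component, for the axis labels
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One point per sample, coloured by group and labelled with the sample name
plot(pca$x[, 1], pca$x[, 2],
     col = ifelse(group == "A", "steelblue", "firebrick"), pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"),
     main = "Samples on PC1 vs PC2")
text(pca$x[, 1], pca$x[, 2], labels = rownames(pca$x), pos = 3, cex = 0.8)
legend("topright", legend = c("A", "B"),
       col = c("steelblue", "firebrick"), pch = 19)
```

Unlike `biplot()`, this draws only the samples and leaves out the variable loadings, which keeps the plot readable when there are thousands of proteins.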