How to create a PCA Plot of Proteomics Data in R?
5.0 years ago
ishackm ▴ 110

Hi all,

I hope you're well

I have the following dataset:

  QE1_Jo_Exp1_AOCS1_R QE1_Jo_Exp1_G33 QE1_Jo_Exp1_G33_R QE1_Jo_Exp1_G164
1           1027.9600       1434.3834         1774.4618         892.7630
2           1075.0975       1692.0633         1014.8056         537.9152
3           1031.2545       1377.9725         1181.1430        3983.6936
4           3257.5661       3433.5130         3644.4593         933.2016
5            535.0528        839.5253          523.3276        3708.1248
6           6259.4604      23886.0483         9353.2122       29776.4997


I used the following script to carry out PCA analysis:

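library(matrixStats)  # rowVars() is assumed to come from matrixStats

a <- myPr

# Variance of each protein (row) across samples
rv <- rowVars(as.matrix(a))

# Keep at most the 12596 most variable rows
ntop <- 12596
select <- order(rv, decreasing = TRUE)[seq_len(min(ntop, length(rv)))]

# prcomp() expects samples in rows, so transpose the matrix
pca1 <- prcomp(t(a[select, ]))

# Sample scores on all components, without and with sample names
scores <- data.frame(pca1$x[, 1:ncol(pca1$rotation)])
scores.df <- data.frame(colnames(a), pca1$x[, 1:ncol(pca1$rotation)])

pca1
summary(pca1)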


The result is:

> summary(pca1)
Importance of components:
                             PC1       PC2       PC3       PC4       PC5       PC6       PC7       PC8       PC9      PC10      PC11
Standard deviation     5.441e+05 1.097e+05 6.167e+04 1.720e+04 1.293e+04 9.306e+03 8.535e+03 7.128e+03 4.357e+03 3.526e+03 2.666e+03
Proportion of Variance 9.470e-01 3.853e-02 1.217e-02 9.500e-04 5.300e-04 2.800e-04 2.300e-04 1.600e-04 6.000e-05 4.000e-05 2.000e-05
Cumulative Proportion  9.470e-01 9.855e-01 9.977e-01 9.987e-01 9.992e-01 9.995e-01 9.997e-01 9.999e-01 9.999e-01 1.000e+00 1.000e+00
                            PC12      PC13      PC14      PC15     PC16      PC17      PC18      PC19      PC20      PC21      PC22
Standard deviation     2.349e+03 1.851e-06 2.318e-07 2.279e-07 2.25e-07 2.173e-07 2.151e-07 2.148e-07 2.081e-07 2.065e-07 2.002e-07
Proportion of Variance 2.000e-05 0.000e+00 0.000e+00 0.000e+00 0.00e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.00e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00 1.000e+00
                            PC23      PC24
Standard deviation     1.942e-07 4.214e-11
Proportion of Variance 0.000e+00 0.000e+00
Cumulative Proportion  1.000e+00 1.000e+00


But when I do:

biplot(pca1)


I get the following

the current PCA plot

the desired PCA plot

This is my first time doing a PCA plot so any help will be greatly appreciated.

Many Thanks,

Ishack

r proteomics pca • 4.8k views

Please see "How to add images to a Biostars post" to add your images properly. You need the direct link to the image, not the link to the webpage that has the image embedded (which is what you have used here).


Are your plots from the same dataset? How do you know that such a plot is possible with your dataset? They're both PC1 vs PC2 plots; maybe the nature of your data prevents the plot from looking like #2?


Hi RamRS,

Thank you for your quick response.

The second plot is from a different dataset, but I would like the first dataset to have a plot similar to the second PCA plot, please.


Check @Kevin's PCAtools package: "PCA plot from read count matrix from RNA-Seq". While this refers to RNA-seq, the principle should be the same.


Hi genomax, thanks for the link

I have created the following plot from this code:

library(factoextra)

fviz_eig(pca1)

fviz_pca_ind(pca1,
             col.ind = "cos2", # Color by the quality of representation
             repel = TRUE      # Avoid text overlapping
)


The new PCA plot

Are Dim 1 and Dim 2 the same as PC1 and PC2?
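
On the Dim 1 / Dim 2 question: for a prcomp object, factoextra's "Dim1" and "Dim2" axes are the first and second principal components, and the percentages on the axes are the share of variance each one explains. A minimal way to check this with base R alone is sketched below. The matrix is simulated and only a hypothetical stand-in for the real intensity data; the names mat, pca, and pct are made up for the illustration.

```r
set.seed(1)

# Hypothetical stand-in for the intensity matrix: 200 proteins x 6 samples
mat <- matrix(rlnorm(200 * 6, meanlog = 7, sdlog = 1), nrow = 200,
              dimnames = list(NULL, paste0("S", 1:6)))

# Samples must be in rows for prcomp(), hence the transpose
pca <- prcomp(t(mat))

# Percentage of variance explained by each component;
# the first two values are what a PC1 vs PC2 plot shows on its axes
pct <- 100 * pca$sdev^2 / sum(pca$sdev^2)
print(round(pct[1:2], 1))

# Sample coordinates on PC1 and PC2 (the points drawn in the plot)
print(pca$x[, 1:2])
```

If the percentages printed here match the ones on the "Dim1" and "Dim2" axis labels for your own pca1 object, the two namings refer to the same components.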


As genomax says, you can just use my code from my other thread: PCA plot from read count matrix from RNA-Seq
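
For readers who just want the general shape of a PC1 vs PC2 sample plot, a minimal base-R sketch follows. It is not the code from the linked thread. The simulated matrix, the two group labels, and the colours are all hypothetical, and the log2 transform is an assumption: raw intensities often let a few very abundant proteins dominate PC1, which may be part of why PC1 explains about 95% of the variance in the summary above.

```r
set.seed(42)

# Hypothetical data: 500 proteins (rows) x 8 samples (columns), two groups
grp <- rep(c("control", "treated"), each = 4)
mat <- matrix(rlnorm(500 * 8, meanlog = 7, sdlog = 1.5), nrow = 500,
              dimnames = list(paste0("prot", 1:500), paste0(grp, "_", 1:4)))
# Make the first 50 proteins more abundant in the treated group
mat[1:50, grp == "treated"] <- mat[1:50, grp == "treated"] * 4

# Assumption: log2-transform intensities before PCA
logmat <- log2(mat + 1)

# PCA with samples in rows
pca <- prcomp(t(logmat), center = TRUE, scale. = FALSE)
pct <- round(100 * pca$sdev^2 / sum(pca$sdev^2), 1)

# One row per sample: name, group, and scores on the first two components
scores <- data.frame(sample = colnames(mat), group = grp, pca$x[, 1:2])

# Scatter plot of samples, coloured by group, with variance on the axes
cols <- c(control = "steelblue", treated = "firebrick")
plot(scores$PC1, scores$PC2, col = cols[scores$group], pch = 19,
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"),
     main = "PCA of log2 intensities (simulated)")
text(scores$PC1, scores$PC2, labels = scores$sample, pos = 3, cex = 0.7)
legend("topright", legend = names(cols), col = cols, pch = 19)
```

To try this on real data, replace mat with your own matrix (proteins in rows, samples in columns) and grp with your own sample annotation.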

Also PCAtools (https://bioconductor.org/packages/release/bioc/html/PCAtools.html) can be used - this was just released with Bioconductor 3.9