DESeq2 output and robust PCA (PcaGrid and PcaHubert)
3 months ago
Emma • 0

Hello there,

I followed the DESeq2 file to get the list of differentially expressed genes for a dataset I am working on. The PCA plot I got was not enough to show which of the replicates is an outlier.

A little background on the project: it is about a rare disease, and we have longitudinal data collected at 8 time points. The RNA-seq data have two treatment conditions (treatment and negative control), with 3 replicates in each group.

I found an article titled "Robust principal component analysis for accurate outlier sample detection in RNA-Seq data" (link: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03608-0) that uses something called an rPCA outlier map, which plots the replicates rather than every gene, as in my graph below. Their Figures 1 and 2 have 8 graphs. In their explanation the authors wrote:

Fig. 1 Comparing the performance of cPCA and rPCA on the simulated data. (a) cPCA plot of the simulated baseline data with two treatment groups and 3 biological replicates each. The first principal component captured the variation of the baseline samples between the two groups. (b) cPCA plot of the simulated baseline data plus outlierL1; the first principal component was attracted by outlierL1. (c) cPCA plot of the simulated baseline data plus outlierH1; the first principal component was attracted by outlierH1. (d-f) Outlier maps of the simulated baseline plus outlierL1 data set using (d) cPCA, (e) PcaGrid and (f) PcaHubert. (g-i) Outlier maps of the simulated baseline plus outlierH1 data set using (g) cPCA, (h) PcaGrid and (i) PcaHubert. OutlierL1: simulated sample L-1 of the low "outlierness" group. OutlierH1: simulated sample H-1 of the high "outlierness" group. Sample 5: the 5th sample of the baseline data set.

Fig. 2 Comparing the performance of cPCA and rPCA on the composite real RNA-Seq data of the human cerebellum dataset (a-d) and the mouse dorsal root ganglion (DRG) neuron dataset (e-h). cPCA plot (a) and outlier maps for the human cerebellum dataset using cPCA (b), PcaGrid (c) and PcaHubert (d). cPCA plot (e) and outlier maps for the mouse dorsal root ganglion (DRG) neuron dataset using cPCA (f), PcaGrid (g) and PcaHubert (h).

When I run the classic rlog PCA, I get three replicates for each condition, but when I run the PcaGrid or PcaHubert function to get a PCA plot or outlier map, every gene in the list (26K) is plotted on the graph. Please advise on how to perform this analysis. Thank you!

The code I am following for the classic PCA:

rld <- rlog(dds)
plotPCA(rld) + geom_text(aes(label=name), vjust=0.5)

The code for PcaHubert and PcaGrid:

I tried to adapt the code from the rrcov manual, but it does not mention how to use DESeq2 output: https://mirror.csclub.uwaterloo.ca/CRAN/web/packages/rrcov/rrcov.pdf
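
A possible reason for seeing one point per gene: rrcov's PcaGrid and PcaHubert treat each row of the input matrix as an observation, while a DESeq2 expression matrix has genes in rows and samples in columns. The sketch below illustrates transposing the matrix before calling the rrcov functions. It is only a minimal illustration under stated assumptions: the simulated matrix `mat`, the sample names, the artificially noisy sample and the choice `k = 2` are all made up here and stand in for the real rlog data (for example `assay(rld)`).

```r
library(rrcov)  # provides PcaGrid() and PcaHubert()

# Hypothetical stand-in for the rlog matrix: genes in rows, samples in columns.
# With real data one would use something like: mat <- SummarizedExperiment::assay(rld)
set.seed(1)
n_genes <- 500
samples <- c(paste0("ctrl_", 1:3), paste0("trt_", 1:3))
mat <- matrix(rnorm(n_genes * length(samples), mean = 8, sd = 1),
              nrow = n_genes,
              dimnames = list(paste0("gene", seq_len(n_genes)), samples))

# Make one sample noisier, purely for illustration (an assumption, not real data).
mat[, "trt_3"] <- mat[, "trt_3"] + rnorm(n_genes, mean = 0, sd = 3)

# rrcov expects observations in rows, so transpose to samples x genes.
x <- t(mat)

# Robust PCA with two components (k = 2 is an illustrative choice).
pc_grid <- PcaGrid(x, k = 2)
pc_hub  <- PcaHubert(x, k = 2)

# The default plot of a fitted rrcov PCA object is the outlier map,
# now with one point per sample rather than one per gene.
plot(pc_grid)
plot(pc_hub)

# The 'flag' slot marks regular observations; samples not flagged as regular
# are the candidate outliers.
outliers_grid <- rownames(x)[!as.logical(pc_grid@flag)]
outliers_hub  <- rownames(x)[!as.logical(pc_hub@flag)]
print(outliers_grid)
print(outliers_hub)
```

If the transposed matrix has the expected shape (6 rows, one per sample), the outlier maps should show 6 points. Whether a given sample is flagged depends on the data and on the cutoffs rrcov applies, so the flags are best read alongside the plots.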

rPca Deseq2 RNA-seq cPca • 407 views

enter image description here
