How can PCA (principal component analysis) show that the majority of changes is contributed by PC1 (gene and environment)?
3.0 years ago
huangjinyu • 0

I would like to perform an analysis similar to the one in this Nature article and reach an analogous conclusion. The paper ran ATAC-seq on different tissues and then performed PCA on the ATAC-seq results, concluding that the majority of changes is contributed by cooperative effects of tissue damage and mutant Kras early on in tumorigenesis (PC1: 56%), rather than by the later transition from early neoplasia to PDAC (PC2: 16%).

This conclusion is really shocking! As far as I know, we generally cannot assign an actual meaning to the calculated principal components, so how can it be said that PC1 represents the specific cooperative effects of tissue damage and mutant Kras? How should I understand and replicate the PCA in this study? PC1 explains more variance, so the change is said to occur mainly in the early stage of the disease: how do we reach this conclusion?

the article:

https://europepmc.org/article/pmc/pmc8482641


ATAC-seq PCA • 1.0k views
3.0 years ago
LChart 5.1k

We could file this under over-interpretation of PCA, see: https://www.nature.com/articles/s41598-022-14395-4. Statements such as this should be backed up by specific metrics such as number (or intensity) of shared differentially accessible regions, and adjusted for sequencing coverage, with appropriate statistical tests.

Note especially that there are 14 injury/KRAS/both samples and only 4 PDAC samples; so injury/KRAS explaining "more variance" is completely unsurprising since the size of the population is far larger.
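To illustrate the group-size point, here is a toy simulation (not the paper's data). It assumes two equally strong, non-overlapping changes, an "early" one and a "late" one, and varies only how many samples carry each. The function names, peak counts, and effect size are made up for the sketch. The variance across samples along an axis scales with p(1 - p), where p is the fraction of samples carrying the change, so the sample counts alone shift how much variance each change appears to explain.

```python
import numpy as np

def simulate(n_normal, n_early, n_late, n_peaks=1000, effect=3.0, seed=0):
    """Toy samples-x-peaks matrix with two equally strong, disjoint changes."""
    rng = np.random.default_rng(seed)
    n = n_normal + n_early + n_late
    X = rng.normal(size=(n, n_peaks))               # background noise
    X[n_normal:n_normal + n_early, :200] += effect  # "early" change, peaks 0-199
    X[n_normal + n_early:, 200:400] += effect       # "late" change, peaks 200-399
    return X

def explained_variance_ratio(X):
    """Fraction of total variance captured by each principal component."""
    Xc = X - X.mean(axis=0)                  # center every peak
    s = np.linalg.svd(Xc, compute_uv=False)  # singular values of centered data
    return s ** 2 / np.sum(s ** 2)

def axis_variance(X, peaks):
    """Variance across samples of the mean signal over one peak set."""
    return X[:, peaks].mean(axis=1).var()

early_peaks, late_peaks = slice(0, 200), slice(200, 400)

# Same per-sample effect in both designs; only the group sizes differ.
many_early = simulate(n_normal=3, n_early=14, n_late=4)
many_late = simulate(n_normal=3, n_early=4, n_late=14)

for name, X in [("14 early / 4 late", many_early), ("4 early / 14 late", many_late)]:
    evr = explained_variance_ratio(X)
    print(name)
    print("  variance along early axis:", round(axis_variance(X, early_peaks), 3))
    print("  variance along late axis: ", round(axis_variance(X, late_peaks), 3))
    print("  PC1, PC2 explained variance ratio:", np.round(evr[:2], 3))
```

In this sketch the axis with more variance flips when the group sizes are swapped, even though the simulated effect per sample is identical. Because the two group indicators are correlated, the PCs themselves can mix both changes, which is one more reason not to read a biological label directly off PC1.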

My read of the PC1 and PC2 is that (1) PC1 identifies a set of changes that appear to be highly stratified between PDAC and normal, and partly shared with KRAS/injury; and (2) PC2 identifies a set of changes that seem to be unique to KRAS/Injury, and that are either not present or self-canceling in PDAC and normal.

However, it is the case that the majority of peaks they identify in injury+KRAS are also shared with PDAC (panel c) which demonstrates some similarity between the conditions. It would be nice to expand panel (e) to also include peaks called in PDAC to identify the PDAC-specific peaks.
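A minimal sketch of the kind of overlap metric meant here, assuming peaks from every condition have already been merged into one consensus peak set so that each peak has an ID. The peak IDs and set sizes below are invented for illustration, and the hypergeometric test treats peaks as independent draws from that consensus set, which is a simplification. It does not adjust for sequencing coverage.

```python
from math import comb

def hypergeom_sf(k, N, K, n):
    """P(overlap >= k) when n peaks are drawn at random from N, K of them marked."""
    upper = min(K, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical consensus peak IDs (not taken from the paper).
universe = set(range(1000))
injury_kras = set(range(0, 300))
pdac = set(range(100, 450))

shared = injury_kras & pdac          # peaks open in both conditions
pdac_specific = pdac - injury_kras   # peaks seen only in PDAC
frac_shared = len(shared) / len(injury_kras)

# Chance of seeing at least this much overlap if the two sets were unrelated.
p_value = hypergeom_sf(len(shared), len(universe), len(injury_kras), len(pdac))

print("shared peaks:", len(shared))
print("PDAC-specific peaks:", len(pdac_specific))
print("fraction of injury+KRAS peaks shared with PDAC:", round(frac_shared, 3))
print("hypergeometric p-value:", p_value)
```

With real data, the sets would come from the called peaks (or differentially accessible regions) of each condition, and the size of the PDAC-specific set is the number that speaks to how much changes in the later transition.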


“Note especially that there are 14 injury/KRAS/both samples and only 4 PDAC samples; so injury/KRAS explaining "more variance" is completely unsurprising since the size of the population is far larger.”

The "more variance" refers to the change from normal (3 samples) to injury+KRAS (6 samples), and for normal/injury(3)/KRAS(5)/both(3)/PDAC, the group sizes are close.

When we compute a PC, it is a latent metric with no concrete meaning. Generally, we cannot say that PC1 is some specific thing, so the wording is a little bit improper here.

I guess this is an over-interpretation of PCA, as you say. But it is a Nature article, so I doubted myself from the beginning.

Thanks for your response!
