Question: P-hacking or not in pan-cancer analysis
21 months ago, Wenhu_Cao wrote:

Hi guys,

I have read a few pan-cancer analysis papers, really big papers from CNS, and I am confused about the way they handle the data.

Normally the data come from TCGA or similar databases. These data were obviously collected without any scientific hypothesis beforehand, just dumped from large batches of sequencing runs and arrays (surely with careful selection, QC, normalization, etc.). What those papers normally do is first look for statistical differences across all samples, cancer types, genes, etc. Then they 'zoom in' and compare different subsets of samples, cancer types, genes, or other things of interest, in order to find more subtle statistical differences and more interesting phenomena. Finally, they make up a story about it.

My question is: isn't that a violation of the assumptions behind statistical tests? Aren't these comparisons multiple comparisons? Should we really analyze data after we have seen them, without any scientific hypothesis in advance?
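
To make the multiple-comparisons worry concrete, here is a minimal sketch of what I mean. It is purely my own illustration and not taken from any of the papers: it assumes 1000 independent tests in which the null hypothesis is true every time, so each p-value is just a uniform random number. The function names, the number of tests, and the 0.05 threshold are all made up for the example. It counts how many tests look "significant" without correction, with a Bonferroni correction, and with the Benjamini-Hochberg FDR procedure.

```python
import random


def benjamini_hochberg(pvalues, alpha=0.05):
    """Number of hypotheses rejected by the Benjamini-Hochberg step-up procedure."""
    m = len(pvalues)
    ranked = sorted(pvalues)
    n_reject = 0
    # Find the largest rank k whose p-value is at most alpha * k / m;
    # all hypotheses up to that rank are rejected.
    for k, p in enumerate(ranked, start=1):
        if p <= alpha * k / m:
            n_reject = k
    return n_reject


def simulate_null_tests(n_tests=1000, alpha=0.05, seed=0):
    """Simulate n_tests comparisons where no real effect exists (hypothetical setup)."""
    rng = random.Random(seed)
    # Under a true null hypothesis, a valid p-value is uniform on [0, 1].
    pvalues = [rng.random() for _ in range(n_tests)]
    return {
        "uncorrected": sum(p < alpha for p in pvalues),
        "bonferroni": sum(p < alpha / n_tests for p in pvalues),
        "bh_fdr": benjamini_hochberg(pvalues, alpha),
    }


counts = simulate_null_tests()
print(counts)
```

With no real effect anywhere, roughly 5% of the uncorrected tests should still fall below 0.05 by chance, while the corrected counts should be much smaller. Real pan-cancer comparisons are of course not independent, so this is only a rough picture of the problem.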



Could you post links to these published manuscripts?

written 21 months ago by Kevin Blighe