Question: P-hacking or not in pan-cancer analysis
10 months ago, Wenhu_Cao wrote:

Hi guys,

I have read a few pan-cancer analysis papers, really big ones from CNS journals, and I am confused about what they are doing with the data.

Normally, the data come from TCGA or similar databases. These data were obviously collected without any scientific hypothesis beforehand; they are simply the output of large batches of sequencing runs and arrays (with careful selection, QC, normalization, etc., of course). What those papers typically do is first look for statistical differences across all samples, cancer types, genes, etc. Then they 'zoom in' and compare different subsets of samples, cancer types, genes, or other things of interest, in order to find subtler statistical differences and more interesting phenomena. Finally, they build a story around the results.

My question is: doesn't that violate the assumptions of the statistical tests? Aren't those comparisons multiple comparisons? Should we really analyze data after we have seen them, without any scientific hypothesis in advance?

Confused...
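
To make the multiple-comparisons worry concrete, here is a minimal sketch (not taken from any of the papers in question). It assumes every subgroup comparison is an independent test with no real effect, so each p-value is uniform on [0, 1]. The function names and the choice of 20 comparisons are hypothetical and for illustration only. The sketch shows how often at least one comparison comes out "significant" by chance, and how a Benjamini-Hochberg correction would treat a list of p-values.

```python
import random


def family_wise_error_rate(n_tests, alpha=0.05):
    """Chance of at least one false positive among n_tests independent null tests."""
    return 1 - (1 - alpha) ** n_tests


def simulate_null_subgroups(n_tests, n_repeats=2000, alpha=0.05, seed=0):
    """Fraction of simulated 'studies' with at least one p < alpha.

    Assumption: under the null, each p-value is uniform on [0, 1].
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_repeats):
        p_values = [rng.random() for _ in range(n_tests)]
        if min(p_values) < alpha:
            hits += 1
    return hits / n_repeats


def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected by the Benjamini-Hochberg procedure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    return sorted(order[:cutoff])


if __name__ == "__main__":
    n = 20  # hypothetical number of subgroup comparisons
    print("analytic  :", round(family_wise_error_rate(n), 3))
    print("simulated :", round(simulate_null_subgroups(n), 3))
    print("BH rejects:", benjamini_hochberg([0.001, 0.008, 0.039, 0.041, 0.9]))
```

With 20 uncorrected null comparisons, the analytic chance of at least one p < 0.05 is about 64%. Real subgroup comparisons on the same samples are correlated, so this independence assumption is only a rough guide.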

Tags: pan-cancer, genome, p-value

Could you post links to these published manuscripts?

Comment written 10 months ago by Kevin Blighe