Question: p-value with chi-square test
7 months ago, Biologist wrote:

In Table 1 of this research paper ("Association of RAD51-AS1 expression with clinicopathological features of EOC patients"), the p-values are calculated with the Chi-square test.

 Age   Low-RAD51-AS1  High-RAD51-AS1 P-value
 <50    25 (38.5)      17 (26.6)       0.149
 ≥50    40 (61.5)      47 (73.4)

For the variable Age, the reported p-value is 0.149.

But when I calculated it myself, I got a different value.

# x = Low-RAD51-AS1 counts, y = High-RAD51-AS1 counts; rows are <50 and ≥50
data <- data.frame(x = c(25, 40), y = c(17, 47))
chisq.test(data, correct = TRUE)

    Pearson's Chi-squared test with Yates' continuity correction

data:  data
X-squared = 1.5728, df = 1, p-value = 0.2098

This is not limited to Age: the other variables also give p-values that differ from those in the paper.

What could be the reason for these different p-values? Did I do anything wrong?

7 months ago, Kevin Blighe wrote:

Just switch off the continuity correction.

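# df is not defined in the original post; assumed here to hold the Age counts from the table
df <- data.frame(Low = c(25, 40), High = c(17, 47))
chisq.test(df[,c("High", "Low")], correct=FALSE)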

    Pearson's Chi-squared test

data:  df[,c("High", "Low")]
X-squared = 2.0794, df = 1, p-value = 0.1493
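
To see both variants side by side, here is a minimal sketch that rebuilds the Age rows as a matrix and runs `chisq.test` with and without the correction. The object name `age_tab` and the dimnames are illustrative labels of my own, not taken from the paper.

```r
# Counts from the Age rows of Table 1; object name and dimnames are illustrative
age_tab <- matrix(c(25, 40, 17, 47), nrow = 2,
                  dimnames = list(Age = c("<50", ">=50"),
                                  RAD51_AS1 = c("Low", "High")))

with_yates    <- chisq.test(age_tab, correct = TRUE)   # R's default for 2x2 tables
without_yates <- chisq.test(age_tab, correct = FALSE)  # plain Pearson statistic

# Compare the two p-values against the 0.149 reported in the paper
round(c(yates = with_yates$p.value, pearson = without_yates$p.value), 4)
```

Only the uncorrected p-value should agree with the paper's 0.149, which suggests (but does not prove) that the authors ran the test without the continuity correction.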



Thank you, Kevin. I would also like to know: is the calculation wrong with correct=TRUE? When should it be TRUE?

Reply written 7 months ago by Biologist

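My background is not pure statistics; it's biology and computer science. That said, bioinformatics overlaps with statistics, and many bioinformaticians, myself included, understand a fair number of statistical methods.

Whilst I cannot give a complete definition of continuity correction, I am aware that it is used for loosely similar reasons to P value adjustment in expression studies, that is, to prevent overestimation of statistical significance. The Pearson Chi-squared test approximates the distribution of a statistic computed from discrete counts with the continuous chi-squared distribution, and that approximation can be poor when the counts are small. Yates' continuity correction attempts to adjust for this by subtracting 0.5 from each |observed - expected| difference before squaring, which makes the test more conservative. So correct=TRUE is not a wrong calculation; it is a more cautious version of the same test, and in R it only applies to 2x2 tables. The uncorrected result matches the paper's 0.149, so the authors presumably did not apply it.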
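
As a rough sketch of what the correction does numerically: for 2x2 tables, R's `chisq.test` documentation says that one half is subtracted from each |observed - expected| difference before squaring, and never more than the difference itself. The hand calculation below applies that to the Age counts. The variable names are my own, and it assumes every |O - E| exceeds 0.5, which holds for these counts.

```r
# Observed counts from the Age rows (rows: <50, >=50; columns: Low, High)
observed <- matrix(c(25, 40, 17, 47), nrow = 2)

# Expected counts under independence: row total * column total / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Plain Pearson statistic
x2_pearson <- sum((observed - expected)^2 / expected)

# Yates: shrink each |O - E| by 0.5 before squaring
x2_yates <- sum((abs(observed - expected) - 0.5)^2 / expected)

# Upper-tail p-values from the chi-squared distribution with 1 degree of freedom
p_pearson <- pchisq(x2_pearson, df = 1, lower.tail = FALSE)
p_yates   <- pchisq(x2_yates,   df = 1, lower.tail = FALSE)
```

The two statistics should reproduce the X-squared values printed by `chisq.test` earlier in this thread (2.0794 without the correction, 1.5728 with it). The corrected statistic is always the smaller of the two, so its p-value is the larger one.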

If you want to delve further into it, I suggest posting on StackExchange.

Reply written 7 months ago by Kevin Blighe