I'm new to the field, and I've been reading some PLOS articles. I keep coming across the term p<0.05, but I don't know what it means. Can you give me a keyword or something to help me understand what that p is about?
During the cranberry period, 6 subjects had 7 UTI, compared with 16 subjects and 21 UTI in the placebo period (P<0.05 for both number of subjects and incidence).
From Wiki: http://en.wikipedia.org/wiki/P-value
"the p-value is the probability of obtaining a test statistic at least as extreme as the one that was actually observed"
The purpose of a p-value is to determine if a value you observed is significant. Usually your p-value comes from an observation that you assume is governed by events whose probabilities can be represented by a control distribution. Usually your hope is that your observed value differs significantly from the mean of the control distribution.
Maybe it's best explained with an example:
Suppose you were trying to prove that a coin you had was weighted towards flipping heads 70% of the time. We know that an un-weighted coin gives heads or tails with equal probability, so we need to show that your coin gives heads more often than tails when flipped some number of times, X. An un-weighted coin will produce (number of times flipped)(probability of heads) = (100)(0.5) = 50 heads in 100 flips on average. Your coin will produce (100)(0.7) = 70 heads in 100 flips on average. A p-value is a statistic that answers the question: "How much does my observed value (in this case 70 heads) need to differ from a control value (in this case 50) before I can conclude that my coin is weighted?"
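The coin example can be checked numerically with an exact binomial test. This is a minimal sketch using only the Python standard library; the 70-heads-in-100-flips numbers come from the example above, and the helper function name is just illustrative:

```python
from math import comb

def binomial_p_value(heads, flips, p_null=0.5):
    """One-sided p-value: probability of observing at least `heads`
    successes in `flips` tosses, assuming the null hypothesis that
    each toss lands heads with probability `p_null`."""
    return sum(comb(flips, k) * p_null**k * (1 - p_null)**(flips - k)
               for k in range(heads, flips + 1))

# Probability a fair coin gives 70 or more heads in 100 flips.
p = binomial_p_value(70, 100)
print(p)  # far below 0.05, so 70/100 heads is very unlikely under a fair coin
```

Since this p-value is well under 0.05, we would reject the null hypothesis that the coin is fair.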
I have reopened this question for the following reason: I have just finished interviewing 5 bioinformatics candidates selected from 25 applicants. I prepared 10 questions for each of them; one question was, in essence: what does a p-value mean?
To my great shock, not a single candidate was able to answer, or was even on the right track! The better ones believed that p-values are some sort of similarity measure. Yet most of them had multiple publications in selective journals (using p-values, of course).
Current science education does a horrible job of teaching researchers basic statistics. "P-value abuse" is rampant in peer-reviewed publications. When a simple concept is misunderstood by so many, the problem lies with the education, not the individuals.
So I welcome this question; I think it is relevant and important. For everyone who thinks this is too easy or obvious: walk around your lab and ask a few collaborators what a p-value is. You'll be surprised (or saddened).
Given a test of some effect where p = 0.05 according to some test statistic t, you would expect to see a test statistic greater than t (or equivalently, a p-value < 0.05) 5% of the time on data generated under the null hypothesis.
By accepting p = 0.05 for a single test, you're accepting that there's a 5% chance that the effect or difference is due to random variation, and that there may not be an actual "effect" at all.
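A quick way to see this is by simulation. The sketch below is a hypothetical illustration (not from the answer above): it runs many z-tests on data generated under the null hypothesis and counts how often p < 0.05, which by construction should happen about 5% of the time:

```python
import math
import random

def z_test_p_value(sample):
    """Two-sided p-value for H0: mean = 0, given samples drawn from a
    normal distribution with known standard deviation 1, so z = mean * sqrt(n)."""
    n = len(sample)
    z = (sum(sample) / n) * math.sqrt(n)
    # Standard normal CDF via the error function; p = 2 * (1 - Phi(|z|))
    return 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

random.seed(42)
trials = 20_000
# Every dataset is pure noise (the null is true), yet some tests "reject".
false_positives = sum(
    z_test_p_value([random.gauss(0, 1) for _ in range(30)]) < 0.05
    for _ in range(trials)
)
print(false_positives / trials)  # close to 0.05
```

The fraction of "significant" results hovers around 5% even though no real effect exists in any of the simulated datasets.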
A consequence of this is that, if the magic number for publication is p <= 0.05, then the expected number of publications that have erroneously rejected a true null hypothesis is:
p_average * N_publications
so, given a fixed p-value cutoff, we can expect the number of falsely rejected null hypotheses to grow as more papers are published.
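As a back-of-the-envelope illustration (the publication counts here are hypothetical, not from the answer): under the worst-case assumption that every published test had a true null hypothesis and used a cutoff of 0.05, the expected number of false rejections scales linearly with the number of publications:

```python
def expected_false_rejections(alpha, n_publications):
    """Expected number of falsely rejected null hypotheses, assuming every
    tested null is actually true and each test uses cutoff `alpha`."""
    return alpha * n_publications

for n in (100, 1_000, 10_000):
    print(n, expected_false_rejections(0.05, n))  # 5.0, 50.0, 500.0
```

In practice not every tested null is true, so this is an upper bound, but it shows why a fixed cutoff plus a growing literature guarantees a growing absolute number of false positives.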