When you learn data analysis, you hear a ton about p-values. You develop an understanding of them as you study more and more statistics, but giving a clear, quick, easy-to-understand definition of the concept is a challenge, as this great piece from the blog FiveThirtyEight argues. Watch the pros struggle to boil it down to one easy sentence:
This is an explanation that you will have to learn in order to communicate better with non-statisticians.
In stats class, you learned that a p-value is the probability of getting a result at least as extreme as an observed result if the null hypothesis is correct. For example, imagine that I wanted to test the theory that getting a graduate degree increased your earnings:
I begin with the assumption that there is no relationship. Such a move is intended to be scientifically conservative, in the sense that I assume the theory is wrong and wait for data to disprove that assumption. This approach is used instead of one that assumes that the theory is right and believes in it until disproven. So the null hypothesis is no effect: zero difference in earnings. We proceed by collecting earnings data on two comparable populations that differ by grad school achievement. If we get a result that generates a p-value of 0.02, then we estimate that results at least this extreme will occur 2% of the time if there is in fact no effect in reality. In other words, a p-value of 0.02 is a sign that either our experiment drew an extreme, unrepresentative sample, or the null hypothesis isn’t true.
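To make the grad school example concrete, here is a minimal sketch of one way to estimate such a p-value: a one-sided permutation test, written in Python. This is my own illustration, not a required method (a t-test would be a common alternative). The earnings figures are simulated, and the group sizes, means, and the helper name `permutation_p_value` are all assumptions made up for the example. The idea is to shuffle the group labels many times, which mimics a world where grad school has no effect, and count how often the shuffled gap in average earnings is at least as large as the gap we actually observed.

```python
import random


def mean(values):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=5000, seed=0):
    """Estimate a one-sided p-value for mean(group_a) - mean(group_b).

    Under the null hypothesis of no effect, the group labels are
    interchangeable, so we shuffle them and count how often the shuffled
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical, simulated earnings (not real data).
data_rng = random.Random(42)
grad_degree = [data_rng.gauss(70_000, 15_000) for _ in range(50)]
no_grad_degree = [data_rng.gauss(60_000, 15_000) for _ in range(50)]

p = permutation_p_value(grad_degree, no_grad_degree)
print(f"Estimated one-sided p-value: {p:.4f}")
```

The test is one-sided because the theory says a graduate degree increases earnings. If the theory were only that earnings differ, you would compare absolute differences instead.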
So how should we explain p-values in the standard context of a hypothesis test whose null assumes no effect? My attempt is as follows:
A p-value is the estimated probability of getting these results if there is in fact no real relationship.
Of course, it is an assumption-laden estimate (as is true of any estimation procedure), but that is the basic thrust. Or at least my attempt to transmit it as simply as possible.
Photo Credit. “This image is taken from Page 82 of Mathematical contributions to the theory of evolution.” by Medical Heritage Library, Inc. is licensed under CC BY-NC-SA 2.0