Wednesday, December 3, 2008

Holes in Medical Statistics Part 1: Understanding the P value through Poker

Is there a misunderstanding about the use and interpretation of P values and hypothesis testing? An excellent set of articles written by SN Goodman makes this case. This is the first in a series of blog entries to help illustrate the points in these papers.

To help clarify the concept of the P value, I will start with a contrived poker example:

Let's say you're in Las Vegas hoping for a little enjoyment and to win some cash. You enter one of the seedier downtown casinos and find a new game. It's called "Huslyall". The dealer is rather unkempt, wears an eyepatch and has a hook replacing his left hand.

"Here's the game", he says. "Poker. 52 card deck. The bet is $100. I shuffle and lay down 5 cards. If I get a straight flush, I win and take your $100. Anything less than that, you win and I give you $100."

You are suspicious, but interested. "This will be easy money if this is a normal deck," you think. "On the other hand, if the deck is fixed, I'm out $100." Against your better judgment, you decide to play the game (after all, it seems less risky than the stock market).

The dealer shuffles the deck (which is amazing given the hook) and lays down 5 cards. He then arranges them in order: 6,7,8,9,10 - all hearts.

Question: What is the probability of drawing an ordinary (non-royal) straight flush from an ordinary deck of cards? Answer: 36 of the 2,598,960 possible five-card hands, so prob = 0.0000139.
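For readers who want to check that number, here is a minimal counting sketch in Python. It assumes the usual poker convention that an ace can sit at either end of a straight (so A-2-3-4-5 counts), and that the royal flush is tallied separately from the other straight flushes. The variable names are mine, chosen for illustration.

```python
from math import comb

# Total number of distinct 5-card hands from a 52-card deck.
total_hands = comb(52, 5)  # 2,598,960

# Assumption: an ace may be low or high, so each suit has 10 possible
# five-card runs (ace-low through ten-high-to-ace). The top run is the
# royal flush; the other 9 are "ordinary" straight flushes.
runs_per_suit = 10
suits = 4
royal_flushes = suits * 1
ordinary_straight_flushes = suits * (runs_per_suit - 1)

p_ordinary = ordinary_straight_flushes / total_hands
p_royal = royal_flushes / total_hands

print(f"ordinary straight flush: {p_ordinary:.7f}")  # 0.0000139
print(f"royal flush:             {p_royal:.8f}")     # 0.00000154
```

In other words, roughly one deal in 72,000 from a fair deck produces an ordinary straight flush.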

Question: What is the P value for this particular occurrence?

Answer: First, the definition of the P value:
"The P value is defined as the probability, under the assumption of no effect or no difference (the null hypothesis), of obtaining a result equal to or more extreme than what was actually observed."

An important point about the P value is that one assumes that the null hypothesis is correct. The P value is then the probability, under that assumption, of the observed effect or a more extreme one.

Getting back to our poker example, we assume first that the dealer's deck is a normal deck. We then add the probability of obtaining this ordinary straight flush (prob = 0.0000139) to the probability of the only more extreme result, a royal flush (prob = 0.00000154). The P value in this case is:
P = 0.0000139 + 0.00000154 ≈ 0.0000154.
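The same "observed or more extreme" sum can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the null hypothesis is a fair, well-shuffled 52-card deck, "more extreme" means a higher-ranked poker hand, and the helper name p_value and the dictionary of hand counts are hypothetical names of my own. Only the two top hand categories are listed, because nothing below a straight flush enters this particular tail.

```python
from fractions import Fraction
from math import comb

TOTAL_HANDS = comb(52, 5)  # all 5-card hands under the null (fair deck)

# Hand categories ordered from less extreme to more extreme, with the
# number of 5-card hands in each. Lower-ranked hands are omitted because
# they do not contribute to the tail for this observation.
hand_counts = {
    "ordinary straight flush": 36,
    "royal flush": 4,
}

def p_value(observed, ordered_counts, total=TOTAL_HANDS):
    """Probability, under the null, of the observed category or any
    category listed after it (i.e. anything at least as extreme)."""
    names = list(ordered_counts)
    start = names.index(observed)
    tail = sum(ordered_counts[name] for name in names[start:])
    return Fraction(tail, total)

p = p_value("ordinary straight flush", hand_counts)
print(p, float(p))
```

Working with exact counts gives 40 / 2,598,960, which is about 0.0000154 and agrees with the sum above to rounding.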

So this means that we reject the null hypothesis, right? Wrong.

Key Point: As stated previously, in order to calculate the P value we actually accept the null hypothesis. We then calculate the probability of the observed effect or a more extreme effect. You cannot reject the null hypothesis that you just accepted.

So when do we get to reject the null hypothesis? Through hypothesis testing, which, as we will see, is a completely separate statistical idea. It has, however, become entangled with the P value, and that entanglement has brought us to our current state of confusion. More on this to come.

Should you knock the dealer in the jaw and take back your money? That's up to you...but that hook looks awfully sharp. Better consider discussing your case with casino management first.
