
Comments


Excellent. I'm trying to put together some questions for the econometrics comprehensive exam.

I'll see if the doctors have any more for you. :)

Anyhow, you can imagine my delight when I calculated there's an x% chance that I'd be buying a whole different set of clothes, etc.

2/52, right?

I get 1/100

No, 1/51.

1/51

It's a good Bayesian stats problem. It should work out to about 2%.

The scenarios are:
is male, shows male - 49%
is male, shows mother as female 1%
is female, shows actual female - 49%
is female, shows mother female - 1%

So the probability of being in the 2nd case, given that the test shows female, is 1% / (1% + 49% + 1%) = 1.96%

Given that the test shows female, there are two scenarios
1) sample from baby (98%)
2) sample from mother (2%)

So, probability that baby is male is

.98*(0) + .02*(.5) = .01 1/100

Or am I missing something?

1/51 seems right, which is as strong as i'll go without breaking out my old texts.

[P(male) - P(male|male_on_test)xP(male_on_test)]/P(female_on_test)

2 in 100 tests will sample the mother instead of the baby, and those always read female. Half of those babies (1 in 100) will be males, half females.

98 in 100 tests will sample the baby and give a true result, half male and half female.

Of the 51 test results that show "female", 1 in 51 will actually be a male child.

d'accord


".98*(0) + .02*(.5) = .01 1/100"

I suck at statistics and am trying to figure out why exactly this is wrong. Bear with me through the conceptual work, please?

It seems the .98 is erroneous here. While 98% of the tests are accurate, once we are given the information that the test reads female, this percentage SHOULD be different, because the test is biased and will show female more times than it will show male. Specifically, a test that shows male is ALWAYS accurate, so the chance a female-positive test will be inaccurate is high enough to "create" that 98% accuracy rating that we have in the first place.

In 200 pregnancies, we have 100 boys and 100 girls.

Of the 100 girls, we get 100 "girl" results.

Of the 100 boys, we get 2 "girl" results.

So we get 102 "girl" results, but 2 are actually boys.

So the answer is 2/102 or 1/51.
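The counting argument above can be checked with exact fractions. This is only a sketch: the even 100/100 split and the 2% rate of sampling the mother are taken from the thread, and the variable names are made up for illustration.

```python
from fractions import Fraction

boys, girls = 100, 100                    # 200 pregnancies, split evenly (assumed)
mother_sample_rate = Fraction(2, 100)     # share of tests that read the mother's DNA

girl_results_from_girls = girls                      # a girl always reads "girl"
girl_results_from_boys = boys * mother_sample_rate   # boys misread as "girl"

p_boy_given_girl_result = girl_results_from_boys / (
    girl_results_from_girls + girl_results_from_boys
)
print(p_boy_given_girl_result)  # 1/51
```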

Great question! Here's a finance version for Stephen's exam:

One in a thousand people have the skills to beat the market every year with a probability of 100%. The rest, who have no skill, will beat the market 50% of the time. A fund manager beats the market for 10 successive years. What are the odds the manager has skill?
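One possible way to work the finance version is the odds form of Bayes' rule: posterior odds = prior odds times likelihood ratio. The sketch below assumes the manager is drawn at random from the population and that an unskilled manager's years are independent coin flips; neither assumption is stated in the question, and the variable names are illustrative.

```python
from fractions import Fraction

prior_skill = Fraction(1, 1000)        # one in a thousand has skill
years = 10                             # successive market-beating years
p_beat_if_skilled = Fraction(1)        # skilled managers always beat the market
p_beat_if_unskilled = Fraction(1, 2)   # unskilled managers beat it half the time

# Assumption: years are independent, so per-year likelihoods multiply.
prior_odds = prior_skill / (1 - prior_skill)                          # 1/999
likelihood_ratio = (p_beat_if_skilled / p_beat_if_unskilled) ** years  # 1024
posterior_odds = prior_odds * likelihood_ratio
posterior_prob = posterior_odds / (1 + posterior_odds)

print(posterior_odds)   # 1024/999
print(posterior_prob)   # 1024/2023
```

Under those assumptions the odds come out slightly better than even, roughly a 50.6% chance of skill.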


P(male|female_result) = P(female_result|male)P(male) / P(female_result) (Bayes' Theorem)
= 0.02*0.5 / P(female_result)

and by the Law of Total Probability:

P(female_result) = P(female_result|female)P(female) + P(female_result|male)P(male)
= 1.0*0.5 + 0.02*0.5 = 0.51

so P(male|female_result) = 0.02*0.5 / 0.51 = 0.0196 ~ 2%.

Now, the question actually asked the *odds*, so P(male|female_result)/P(female|female_result) = 0.0196/(1-0.0196) = 0.02, or nearly the same answer (since P(female|female_result) is almost 1).
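The step from probability to odds can be done exactly rather than in rounded decimals. A minimal sketch, with a helper name (`odds_from_probability`) that is invented here for illustration:

```python
from fractions import Fraction

def odds_from_probability(p):
    """Convert a probability p into odds in favour, p / (1 - p)."""
    return p / (1 - p)

# Posterior from Bayes' theorem: 0.02*0.5 / 0.51, kept as exact fractions.
p_male_given_female_result = Fraction(1, 100) / Fraction(51, 100)
odds_male = odds_from_probability(p_male_given_female_result)

print(p_male_given_female_result, odds_male)  # 1/51 1/50
```

So the probability is 1/51 and the odds are 1/50 (or 1:50), which is why the two figures are so easy to confuse when p is small.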

Of 100 tests, 49 will show "male," 51 will show "female."
Of the 51 "female" results, 1 is actually male.
Answer: 1/51.

"It seems the .98 is erroneous here."

Yup. The 98% is really a composite: (probability of a male result on the test) * (probability the sample is from the baby given a male result) + (probability of a female result on the test) * (probability the sample is from the baby given a female result) = 0.49*1.0 + 0.51*49/51 = 0.49 + 0.49 = 0.98

In other words, all the error in the test occurs with a female result. Consequently, the probability that the sample is from the baby given a female result is not 98%.
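Reading the 98% as the chance that the sample really comes from the baby, the decomposition can be sketched by listing the joint probabilities of (sample source, test result). The names below are illustrative, and the 50/50 sex split is the thread's assumption.

```python
from fractions import Fraction

half = Fraction(1, 2)                 # assumed chance the baby is male
p_baby_sample = Fraction(98, 100)     # test reads the baby's DNA
p_mother_sample = Fraction(2, 100)    # test reads the mother's DNA instead

# Joint probabilities of (sample source, test result).
p_baby_and_male_result = p_baby_sample * half
p_baby_and_female_result = p_baby_sample * half
p_mother_and_female_result = p_mother_sample   # the mother always reads female

p_male_result = p_baby_and_male_result
p_female_result = p_baby_and_female_result + p_mother_and_female_result

p_baby_given_male_result = p_baby_and_male_result / p_male_result
p_baby_given_female_result = p_baby_and_female_result / p_female_result

# Recombining the two conditional rates recovers the headline 98%.
total = (p_male_result * p_baby_given_male_result
         + p_female_result * p_baby_given_female_result)
print(p_baby_given_female_result, total)  # 49/51 49/50
```

Given a female result, the sample is from the baby about 96.1% of the time (49/51), not 98%.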

P(Baby is male | Test says it is female) = P(male and test says female) / P(test says female) = (.02*.5)/(.51) = .01/.51 = 1/51

Wow, I am an idiot, it said odds, so it is actually 1/50

Heck, why not give my own method of finding the answer.

2 times out of 100 we extract wrong tissue. Of these 2 times, on average one is female and the other is male.

The other 98 times out of 100 we extract correct tissue. On average 49 of those times it is female.

So we show results of 51 females. However, of those 51 results, only 1 is actually male on average.

Hence on average 1/51 of the reported female results is actually a male, which is the droid, er result, you are looking for.

1/99

odds = p / (1-p) where p is the probability the fetus is male

Given the test shows female then the only way for it to be a male is if (1) the test took DNA from the mother and (2) the child is male. So p = 2/100 * 1/2 = 1/100.

odds = 1/100 / (1-1/100)
= 1/100 / (99/100)
= 1/99

Ok, so where did I go wrong?
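One way to see the gap: 2/100 * 1/2 is the probability of "male AND test shows female" across all tests, not the probability of male among the tests that showed female. A rough simulation sketch makes the difference visible. The trial count and seed are arbitrary choices, and sex and sampling error are assumed independent.

```python
import random

def simulate(trials=200_000, seed=0):
    """Estimate P(male and female result) and P(male | female result)."""
    rng = random.Random(seed)
    female_results = 0
    male_and_female_result = 0
    for _ in range(trials):
        baby_is_male = rng.random() < 0.5      # assumed 50/50 split
        mother_sampled = rng.random() < 0.02   # 2% of tests read the mother
        shows_female = mother_sampled or not baby_is_male
        if shows_female:
            female_results += 1
            if baby_is_male:
                male_and_female_result += 1
    joint = male_and_female_result / trials
    conditional = male_and_female_result / female_results
    return joint, conditional

joint, conditional = simulate()
print(f"P(male and female result) ~ {joint:.4f}")
print(f"P(male | female result)   ~ {conditional:.4f}")
```

The first estimate should land near 1/100 and the second near 1/51. Dividing the joint probability by P(female result) = 0.51 is the missing step, and the odds then come out near 1/50 rather than 1/99.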

I get .01/.51 = 1.96%

Let F be the event of female, T the event the test gives female.

P(F given T) = P(F and T) / P(T)

P( F and T) = P(F) = .5 because if the fetus is female the test must return female (since, presumably, the mother is female).

P(T) is (.02 + .5*.98) = .51. The .02 is the probability that the mother's tissue is extracted; if the fetus's tissue is extracted, which happens with probability 98%, you get a female 50% of those times.

Thus P(F given T) = .5/.51.


Thus P(male given T) = 1 - .5/.51.

@Josh M, yes, by asking for "odds" this is as much a wording question as a math question. I think "idiot" is a little strong though; after all, yours is the first correct answer! Asking for odds to be expressed as a fraction is somewhat ambiguous: a kind of misdirection.

K said: "Great question! Here's a finance version for Stephen's exam:

One in a thousand people have the skills to beat the market every year with a probability of 100%. The rest, who have no skill, will beat the market 50% of the time. A fund manager beats the market for 10 successive years. What are the odds the manager has skill?"

You might also ask the obvious follow-up question. Given the answer to the first question what are the odds that the Fund Manager is actually running a Ponzi scheme? :)

Bob Smith: "Given the answer to the first question what are the odds that the Fund Manager is actually running a Ponzi scheme?"

Indeed: "One in one thousand has skill.  One in ten has dishonest tendencies. A manager who also runs his own brokerage firm and back office, and whose accountant is a guy with a basement office on 42nd and 9th..."  

Constant attention to prior probabilities is at the core of basic financial literacy.  Students (all of them, not just the ones in econometrics) should be drilled with these kinds of questions until they are deeply and thoroughly cynical.

This is a great example of Bayesian probability. The interesting thing is if you play with values to approximate what is often published about diagnostic tests. Suppose you have a test that is trumpeted as "99% accurate". The main consideration is not the accuracy of the test, but the prevalence of the disease:

Probability a patient has condition X: 1/500 or 0.2%
"Accuracy" or reliability of a test: P(P) = .99 (sounds good)
Probability of a false positive: P(F) = .01

Prior probability of disease P(X) = .002
Probability of not having it P(-X) = .998

Probability that a person with a positive diagnosis actually has the disease:
[P(P)*P(X)]/[P(P)*P(X) + P(F)*P(-X)], or

(.99*.002)/[(.99*.002)+(.01*.998)]
= .1655 ~ 16.6%

In other words, for a condition with 1/500 prevalence and a 99% accurate test, there is only a 16.6% chance that a positive result is accurate, and thus an 83.4% chance that a person with a positive result does not have the disease.

Hence, if you screen 10,000 people for the disease, you will get ~120 positive results (1.196%), but ~100 will be false positives. You have to raise the prior accuracy of the test to an astronomically high level to really reduce the level of false positives. The prevalence (rarity) of the disease is more important to know than the accuracy of the test. The example Mike gives looks pretty good because the tested condition occurs in 50% of the population.
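To play with the values, here is a small sketch that recomputes the figures above and then raises the test's accuracy. As in the example, a single "accuracy" number is used both for detecting true cases and for avoiding false positives; that is a simplifying assumption, and the function name is invented for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, false_positive_rate):
    """P(disease | positive test) by Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = false_positive_rate * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

prevalence = 0.002     # 1 in 500, as in the example above
screened = 10_000

for accuracy in (0.99, 0.999, 0.9999):
    fpr = 1 - accuracy   # assumed: one number drives both error rates
    ppv = positive_predictive_value(prevalence, accuracy, fpr)
    positives = screened * (accuracy * prevalence + fpr * (1 - prevalence))
    print(f"accuracy={accuracy}: PPV={ppv:.3f}, "
          f"positives per {screened}={positives:.1f}")
```

With these inputs the chance that a positive is real goes from about 17% at 99% accuracy to about 67% at 99.9% and about 95% at 99.99%.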

A variation of Shangwen's example is the Prosecutor's fallacy: http://en.wikipedia.org/wiki/Prosecutor's_fallacy
In a nutshell, if you have a rare event but look hard enough, you will find evidence of the event sooner than expected because the evidence found is incorrect. (Feel free to blast that explanation with a more correct description of the stats :)

@Phil Koop, Thank you for the kind words, good sir. Although it is not standard notation, (I have usually seen it with a colon, 1:50) it is not an illegal move. Heck, it is how wikipedia defines it, http://en.wikipedia.org/wiki/Odds, and as we all know, wikipedia is a reliable source (actually, ignoring the sarcasm, it is usually not bad when it comes to math and things of that nature.)

Love the blog, been reading for a long time, please keep up the good work.

I have a stats question.
Can ECON students' answers to a statistics question be modeled as a random variable?

ECSE McGill University.

Peter, thanks for the reference. Very interesting indeed.

@Josh M, odds are a ratio of probabilities, and ratios are often expressed as fractions, so naturally there is nothing illegal about expressing odds as a fraction. For that matter, it is just as valid to express odds as a number; one could as well write 0.02 as 1/50.

But being legal does not make it something to be proud of. If you see a number like 1/50 or 0.02 in a discussion of probabilities, would it not be natural to assume that the numbers themselves are probabilities? That is particularly true when the numbers involved are small, as in this case, so that the probability and the odds are similar. The use of the colon disambiguates these numbers.

My view is that if the purpose of the question was to test reasoning about Bayesian inference, then asking for odds vitiates that purpose.

Isn't it a faulty assumption that there's a 50% chance the child is a boy? If you're going to try to be as precise as, say, 1.96%, then your accuracy will be skewed by the fact that more boys than girls appear to be conceived.

The comments to this entry are closed.
