When I give true-false-uncertain questions, I say "marks are given for the explanation".

"True, False, uncertain, explain."

I really like that "probability of giving good grades to memorisers" vs "probability of giving poor grades to students who misinterpret question" trade-off. That's the hardest thing to get right. Good questions, that are on that PPF, are very rare. And if you re-use them too often, students just get old exams and study for the test.

Stephen: "When I give true-false-uncertain questions, I say 'marks are given for the explanation'."

I think that's a sensible way of reducing the "poor grades to good students" type error. I try to do so too, but every so often I do slip up.

Ah, The Curve. Blessed be its Holy Name, and Marvellous are its Effects. *Bows at the invocation of the Holy Name*

Isn't there useful research available on effective economics assessment? Language teachers have a journal dedicated to language testing http://ltj.sagepub.com/ and there are many good books about testing and assessment.

Brett - there is the Journal of Economic Education http://www.tandfonline.com/toc/vece20/current . It tends to publish articles along the lines of "this is a cool way to teach XYZ" rather than articles on assessment. It's really a subject economists talk remarkably little about.

Devising great multiple choice questions is really tough. For example, there was a pretty lousy one on today's exam:

In the long run:
a) capital inputs are fixed
b) capital inputs are variable
c) all inputs are variable
d) all outputs are variable.

Now (c) is definitely a better answer than (b). But is (b) wrong? For that matter, is (d) wrong? I think (d) is not a good answer - the definition of long run is in terms of inputs, not outputs, and it is conceivable that some output might be fixed. But I think (b) is, in fact, correct.

"For example, there was a pretty lousy one on today's exam:"

I agree with that.

"Devising great multiple choices questions is really tough."

OK, but can good ones be used to test the knowledge of the underlying concepts?

OK, so one of the findings from language testing has been that multiple choice items should be constructed with a key and two distractors rather than three. Often the third distractor is a throwaway that fails to improve item functioning and may even make it worse. For every four or so distractors you remove, include an extra item. Such tests typically take no longer for the instructor to construct or for students to complete, and their reliability tends to be higher.

Too Much Fed:

Multiple choice questions are a good way of testing knowledge of definitions, or of simple if-then type relations. E.g. here are a couple from yesterday's exam which weren't too bad.

10. A consumer’s preferences are represented by:
a. Indifference curves
b. Isoquants
c. Isocost lines
d. Budget constraints

And this multiple choice question has featured on intermediate micro exams for decades:

2. If average costs are below marginal costs, average costs are
a. Rising
b. Falling
c. Not enough information is given to tell

The interesting thing about that second question is that students would probably have done much better on it if I'd said "If marginal costs are above average costs, average costs are...."

Brett, I have honestly never heard the terms "key" and "distractor" before. I have no test banks and no idea how the 'item functioning' on my multiple choice exams works. To the extent that this happens in econ, it has been entirely commercialized, i.e. publishing companies create test banks and let instructors use them as a kick-back for making students buy the textbook.

B.t.w., the answers are (a) in both cases - when marginal costs are above average costs, the last unit cost more to make than average, and so the average is rising.
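
A small numeric sketch of that reasoning, for anyone who wants to check it. The cost figures are made up for illustration (they are not from the exam), and the function names are hypothetical. It compares the cost of each extra unit with the average cost before that unit was added:

```python
# Hypothetical total costs of producing 1, 2, 3, 4, 5 units (made-up numbers).
total_costs = [10, 18, 25, 36, 50]

def average_costs(totals):
    """Average cost at each output level q = 1, 2, ..."""
    return [t / q for q, t in enumerate(totals, start=1)]

def marginal_costs(totals):
    """Cost of each additional unit, starting with the second unit."""
    return [b - a for a, b in zip(totals, totals[1:])]

ac = average_costs(total_costs)
mc = marginal_costs(total_costs)

for i, extra in enumerate(mc):
    # extra is the cost of unit i + 2; ac[i] is the average before adding it.
    relation = "above" if extra > ac[i] else "below"
    direction = "rises" if ac[i + 1] > ac[i] else "falls"
    print(f"unit {i + 2}: MC {extra} is {relation} AC {ac[i]:.2f}, so AC {direction} to {ac[i + 1]:.2f}")
```

With these numbers the average falls while the extra unit costs less than the average, and rises once the extra unit costs more.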

Sorry, Frances! The 'key' is the correct choice in a multiple-choice item and the 'distractors' are the incorrect choices. Test items are designed to discriminate between students who have the target competency and those who don't. A well-functioning item, then, would do just that, while poorly functioning items fail to discriminate (because everyone gets it right or wrong, because everyone is guessing, because it's testing a different competency (e.g., politics instead of economics), etc). A quick and dirty way to check item discrimination is to import your item results into a spreadsheet, rank students by total score, and then look at the difference in average score on each item (between 0 and 1) of the top 1/3 (M1) of the class and that of the bottom 1/3 (M2).

In a general proficiency test such as the TOEFL, you probably want M1-M2 to be 1. In a classroom test, you hope that a large chunk of your students (say 80%) have actually learned the material. In that case M1-M2 should be more like 0.55. This kind of item analysis can be very helpful in identifying poorly constructed items and concepts that have not been grasped.
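
For anyone who would rather script this than use a spreadsheet, here is a minimal sketch of the M1-M2 check described above. It is only an illustration: the function name `item_discrimination` and the six-student data set are made up, and items are assumed to be scored 0 or 1.

```python
def item_discrimination(scores):
    """Return M1 - M2 for each item.

    scores: one list per student, holding that student's 0/1 result on each item.
    M1 and M2 are the mean item scores of the top and bottom thirds of the
    class, ranked by total score.
    """
    ranked = sorted(scores, key=sum, reverse=True)
    third = len(ranked) // 3
    if third == 0:
        raise ValueError("need at least three students")
    top, bottom = ranked[:third], ranked[-third:]

    def mean(group, item):
        return sum(student[item] for student in group) / len(group)

    n_items = len(ranked[0])
    return [mean(top, i) - mean(bottom, i) for i in range(n_items)]

# Hypothetical results: six students, four items.
results = [
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
print(item_discrimination(results))  # -> [0.0, 1.0, 0.5, 1.0]
```

In this toy data the first item does not discriminate at all, because everyone gets it right.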

Brett, thanks, I think the Education Development Centre will do that kind of analysis for me on Scantron tests if I ask for it, or at least give me a spreadsheet with the question by question, student by student results, but I haven't taken advantage of it. They also offer courses on designing multiple choice tests, which I've also not taken advantage of.

That 3 choices rather than four choices result is really interesting, I didn't know that. I wonder if I'd have learnt that at the EDC course...

Brett - do you have a reference for the three not four options result?

This isn't the one I was thinking of, but it's a recent meta-analysis:
http://www.performancetest.org/documents/RodriqguesEdMeasurement3option.pdf

I find it strange that economics teachers are overly fond of the "True, False, Uncertain" type of questions, relative to other discipline teachers. Why is that? It doesn't seem to me that economics is special in that it needs different types of questions.

PS: Multiple choice should be banned. They don't allow people to be creative and show they really know, and they too readily allow people to memorize material (unless you make them trick questions, in which case you punish everyone).

Felipe: MC questions have their purpose (as well as being easy to grade). They are a good way of covering a lot of material quickly, if that material does have right and wrong answers. Many MC questions do just test memorization. But some can require analysis, and/or calculations, to get the right answer. When you see students drawing little diagrams and writing equations on their question sheets, to work out the answer, you know it's not just memory that is being tested. Sometimes I can't even answer my own first year MC questions without drawing a little diagram!

But yes, we can't do all testing with MC. And MC questions are really really bad if a student wants to argue a different answer from the one you expected. (I heard of one economist who always gives his students the option to write a one paragraph defence of their "wrong" answer, after the exam, and he marks it right if they take him up on the offer and write any good coherent defence.)

Brett: thanks!

Felipe - economists are fond of true/false/uncertain because it's a deductive discipline. T/F/U is basically a way of saying "logically work through the predictions of the model. Is this a prediction of the model?"

Nick is right, MC does have a role. See, e.g., the question above:
10. A consumer’s preferences are represented by: Indifference curves/Isoquants/Isocost lines/Budget constraints
This is really testing at a very basic level whether or not the student understands what the word preference means in economics, and what indifference curves are there for. It's so easy for students to imitate the pictures in the text and the examples done in class, without really understanding what's going on. (It looks like that question generated a nice distribution, b.t.w.).

Felipe: well-constructed multiple-choice questions are good for testing Bloom levels 1 and 2 as well as some low-complexity level 3 questions. These are not taxonomic levels where you want creativity. Even some level 4 and 5 questions can be asked if you know how. A good many publishers' test banks are now made by docimologists who know their taxonomy.
Unfortunately, higher-education teachers rarely, if ever, have any training in pedagogy. Even Cegep teachers, famous for our pedagogical bent, are not required to have studied pedagogy. (I did, and some of it can be useful...)

Bloom's taxonomy is a ranking of cognition levels:
Level 1 definition or remembering
Level 2 comprehension or understanding
Level 3 application
Level 4 analysis
Level 5 Synthesis (some authors also put evaluating here instead of level 6)
Level 6 Criticism or evaluation (both internal and external) and creation

The nastiest MC questions are of the form:

Wiggins and Bigglesworth in 1989 proved:
a- completely true statement demonstrated by Gagnon in 1991
b- completely true statement demonstrated by Rodrigues in 1981
c- completely true statement demonstrated by Wiggins in 1991 in 1981
d- completely true statement demonstrated by Wiggins and Butterworth in 1989

Question for the profs: on MC exams do you give negative marks for wrong answers (zero for no answer) so that the expected grade for guessing is 0?
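
To make the arithmetic behind that question concrete, here is a rough sketch. It assumes one mark per right answer, zero for a blank, and a blind guess among equally likely options; the function names are hypothetical.

```python
from fractions import Fraction

def zero_expectation_penalty(n_options):
    """Marks to deduct per wrong answer so that a blind guess is worth 0 on average."""
    return Fraction(1, n_options - 1)

def expected_guess_score(n_options, penalty):
    """Expected marks from one blind guess: +1 if right, -penalty if wrong."""
    p_right = Fraction(1, n_options)
    return p_right - (1 - p_right) * penalty

for n in (2, 3, 4):
    penalty = zero_expectation_penalty(n)
    print(n, "options: deduct", penalty, "-> expected guess score", expected_guess_score(n, penalty))
```

So under these assumptions a true/false question needs a full mark deducted for a wrong answer, while a four-option question needs only a third of a mark.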

Stephen: if you give marks for explanation, your question is of a too-high level for a true-false question. These are good for level 1. If at all. Too open for random choices. You then need a lot of questions to eliminate randomness. True or false should be avoided outside of grade school.
Your question should be something like: State the effect(s) of (change in situation) and explain how they happen. The answer must include (requirements).

Patrick: you must do that if you ask T-F. Once you go with three or more distractors, the random element quickly fades. With 3 distractors, the random probability of getting it right is 1/4. So after 10 questions, the probability of getting everything right is (1/4)^10. Negligible. But you must have good distractors. Each one must be the result of a plausible mistake in reasoning or computation. Each distractor must be obvious to someone making a mistake. Creating your distractors is extremely difficult, though a good exercise for testing your own comprehension.
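
A quick sketch of those guessing odds, assuming every answer is an independent blind guess among equally likely options (the helper name `p_at_least` is made up). It gives the chance of getting everything right, and also the chance of guessing at least half right, which is a different question:

```python
from math import comb

def p_at_least(k_correct, n_questions, n_options):
    """Probability of at least k_correct right answers from pure guessing."""
    p = 1 / n_options
    return sum(
        comb(n_questions, k) * p**k * (1 - p) ** (n_questions - k)
        for k in range(k_correct, n_questions + 1)
    )

# All 10 right with 4 options per question: (1/4)^10, about one in a million.
print(p_at_least(10, 10, 4))
# At least 5 of 10 right by guessing alone.
print(p_at_least(5, 10, 4))
```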

Patrick, no, my multiple choice questions typically produce averages in the 60 to 70 percent range, I don't want to take them down any further! I suspect it would add to students' stress and widen the gap between high and low students.

Chris J - agreed, that is nasty. I don't like the long and complicated MC questions in general.

Jacques Rene, good point.

I think they do something similar for financial institution "stress tests", except the "department head" is "Hank Paulson" or "Nicolas Sarkozy" ....

Jacques, Frances: Thanks. Though I wouldn't agree that the tiny probability of a student randomly getting *everything* right is a good argument against, but it's not important.

Frances,

"c) all inputs are variable"

Neoclassical propaganda!

Land, for all intents and purposes, is in fixed supply. Given the change in the quantity of land vs the change in price over the last few hundred years, the elasticity of the supply of land must be incalculably small (and if anything, the supply will shrink over the next 100 years). And that includes mineral resources which, despite protests to the contrary, do in fact constitute a fixed fraction of the mass of our planet. Smith, Ricardo and George will be vindicated in this regard over the coming century. Once we have general purpose, self-replicating (perfectly elastic supply) robots, labour and non-land capital will be irrelevant in production, and all income will be land rent.

Nick once wrote a great essay on the topic, something about robots and Malthus. His conclusion was "I hold land." Exactly right.

The short/medium run answer is "e) some capital inputs are variable." In the long run the answer is "a) capital inputs are fixed." (Interesting consequences for optimal taxation policy are an excellent bonus question.) The only answer I could not possibly justify is "c) all inputs are variable," the one you say is most correct.

I'd say the reason the question is bad is that it hinges on popular economists' assumptions about the right model of our current economy, but it's phrased as a general theoretical question. Personally, I think I understand the issues involved well enough to deserve full marks on that question. Under those circumstances there's nothing worse than having to sit there and have to try to guess at the prof's personal biases.

[Not accusing you of especially strong personal biases, btw. We all have biases. Just not fun having to guess them on a test.]

K "We all have biases. Just not fun having to guess them on a test."

That "long run" question does not involve guessing anything about the prof's biases. There is a standard definition of the long run and short run in micro econ: in the short run some input is fixed, in the long run all inputs are variable. Does this distinction have some inherent problems? Yes. Is it useful and clear and easy to understand? Yes. That question is one that the students had no problems with at all.

Being critical is important, but having enough understanding of basic econ principles to offer informed criticism is even more important.

Frances Woolley, do you teach:

CA deficit = gov't deficit plus private deficit?

Nick, Frances, Jacques - I contend that MC has no place in tests, because I don't really see the value of asking definitions. OK, maybe that is too strong a statement. But important tests (midterm or exams) really shouldn't be (to any meaningful degree) about definitions.

Frances - But why isn't it used in other deductive disciplines? Certainly economics is not the only one. I am frankly curious. It's not common to see physics questions like "TFU: applying heat on one end of an object will heat the other end". Should other disciplines look into asking TFU questions?

Frances,

OK, get it. Sorry. So my objection is more appropriately stated as "e) there's no such thing as the Long Run. However, in the long run, all capital is in fixed supply. "

I do apologize for the insinuation. Entirely my ignorance.

Definitions are the most basic concepts. They must be learned and tested. People on this blog usually work at the higher levels. We analyze, synthesize, criticize, evaluate and create. Because we have mastered (one hopes) the more basic skills. Our students, in the beginning of their course at least, are at the basic level. MC are appropriate. Then we move on.
In my evaluations, I have tests with MC testing levels 1, 2 and very small level 3 problems, homework testing 3 and a little bit of 4, and a term paper testing everything up to 5. No creation as they are not at a sufficient level of proficiency.

K - another thing to remember - in aggregate, yes, there is a fixed supply of land. However an individual firm can increase or decrease the amount of land they use in their production process by buying or selling land. So even though land may be fixed globally, it is not fixed at the level of the individual firm.

Jacques Rene - yes, absolutely.

"True, false or uncertain" sounds like a trap. What is ever not uncertain?

KV - a statement is not uncertain when it's tautological (correct by definition) or inherently self-contradictory. As long as one can say "it depends" and what it depends on, uncertain is a good choice. This is what I mean by saying T/F/U questions rely on an understanding of the nature of logic.

Frances,

I assumed the point is to lay the groundwork for establishing the market equilibrium. If so, what matters is total supply and demand. I don't see the relevance of the fact that the consumption of a small firm is small relative to supply.

The common sense meaning of "long run" is "asymptotically," or longer than any characteristic time scale involved in production processes. That would be a neutral definition. But instead we *define* all capital inputs for individual firms to be "variable" in the Long Run, and then a short while later we transition to saying that the supply of all capital is variable i.e. elastic in the long run (lower case!) *equilibrium*, a total non-sequitur. That means that we aren't even allowed to contemplate equilibria with finite resources without having to invent new words for common sense concepts like long run. It also means that economists are indoctrinated from the very beginning with a meaning of equilibrium which precludes the existence of rents, and in which tax on capital (*any* capital) causes gross long run economic inefficiencies.  It's a clever sleight of hand and a very devious end run around the land/rent economics of the classical economists (Smith, Ricardo, Mill and George in particular). While logically independent, somehow the idea of elastic capital got inextricably hitched to marginalist economic thought.

K: " If so, what matters is total supply and demand. I don't see the relevance of the fact that the consumption of a small firm is small relative to supply."

In micro, long run/short run are firm level concepts used to refer to the firm's ability/inability to alter the type of production methods it's using. In the short run, the firm has only limited ability to adjust the way it produces goods and services in response to, say, a drastic increase in the price of helium, caused by the global helium shortage - surgeries are cancelled because there is no helium available. In the long run, the firm works out how to get by using less helium. Maybe it goes out of business if its business is making balloons filled with helium.

No, it's not macro, and doesn't solve all of the problems of the world, but the basic insight that people *do* adjust, *do* respond to incentives, things change is pretty important.

Frances,

"In the long run, the firm works out how to get by using less Helium"

Yes! I.e. some inputs are *fixed* in the long run. A model in which firms get by with less through capital substitution is *not equivalent* to a model in which all capital is in unlimited supply. If there are close capital substitutes then supply constraints won't be particularly binding at equilibrium, but lots of resources don't have close substitutes. Land, oil, atmospheres, etc are good examples of capital that is constraining in the long run equilibrium (though apparently not in the Long Run equilibrium! Just get more planets?). This has nothing to do with macro/micro. If a good (party balloons!) requires a certain kind of capital to produce and there are no good substitutes for that kind of capital (exploding party balloons bad!), then if the capital is in limited supply, so will be the good. In the long run there will be no party balloons, even if in the Long Run we will have unlimited quantities of balloons.

TMF: "Frances Woolley, do you teach: CA deficit = gov't deficit plus private deficit?"

That is a macro accounting identity. Frances teaches micro; she does not teach macro. So of course Frances doesn't teach that.

I teach macro. I teach that accounting identity G-T + I-S + NX = 0 in first year macro, second year macro, graduate macro. It is in all the macro textbooks, in various forms. It is NOT something invented by MMT. It is NOT something only MMTers know about. It is NOT deep secret knowledge that lays bare the meaning of life. It's a boring accounting identity.
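
For the curious, a toy check of that identity. The numbers are invented, and the sketch assumes the usual textbook definitions: Y = C + I + G + NX, with private saving S = Y - T - C. Under those definitions the three balances sum to zero whatever numbers are plugged in.

```python
# Hypothetical national accounts figures (arbitrary units, made up).
C, I, G, NX, T = 600, 200, 250, -50, 220

Y = C + I + G + NX   # expenditure definition of income
S = Y - T - C        # private saving: income left after taxes and consumption

government_deficit = G - T
private_deficit = I - S

print(government_deficit, private_deficit, NX)
print(government_deficit + private_deficit + NX)  # -> 0
```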

Sorry, I didn't know what Frances taught.

I don't believe it is boring.

Could you tell me the assumptions behind it as it relates to medium of exchange?

I don't mean how it is derived. Bill Mitchell does that here:

P.S. Should this discussion be moved to a different post?

Nick: These macro accounting identities hold only when you are at full employment. At which point they reveal what identities are: rather uninteresting tautologies.
Once you introduce money and the reserve account on top of unemployed resources, the fun begins.
As I remarked on Dean Baker's blog today
http://www.cepr.net/index.php/blogs/beat-the-press/china-and-protectionism-it-aint-quite-as-simple-as-they-tell-us
if you have unemployed resources (as the massive internal migrations in China show) and a delay between the disbursement (in money) of foreign investment and the payment for foreign goods (either investment goods or consumer goods that let the receiving party produce investment goods), then you can momentarily have a surplus in both accounts. Given that is not easy to achieve, a rapidly growing country can have a current account surplus.

It does not disturb me.

