
Comments


I'm trying to figure out how the sheer number of possible explanatory variables, relative to the sample size in the data, fits into your exogeneity-plausibility trade-off.

I don't really follow the empirical cross-country growth literature much. But someone did a paper a couple of years back making the point that the number of possible vaguely plausible explanatory variables was bigger than the sample of countries in the data set. So there were negative degrees of freedom. I think of the genetic diversity and penis length papers in that light. (Though yes, the penis length one was definitely at least tongue-in-cheek, and probably intended to make this very same point.) And if you allow the relationship to be upward-sloping, downward-sloping, U-shaped, or inverted-U-shaped (like in genetic diversity and penis length), you lose even more degrees of freedom.

Plus, we can't help but peek at the data before testing the hypothesis. For example, I would guess that I could find some sort of statistically significant relationship between growth and latitude and/or longitude, especially if I can do U-shaped models. (I bet some growth theorist has already done this!)

(OT. "But nowadays it takes a few clicks to download data from the university library, and anyone who can type "regress earnings education experience" into a computer can run a regression." Some of us are just miserably incompetent at this sort of thing. I tried and failed to download Shazam. Never got past step 1.)

Nick, "I would guess that I could find some sort of statistically significant relationship between growth and latitude and/or longitude, especially if I can do U-shaped models. (I bet some growth theorist has already done this!)"

Someone has argued that "evolutionary novelty" as measured by latitude/longitude drives intelligence, see here, and it's a short step from there to economic growth.

There's also the colonial institutions argument. I.e. the British set up good institutions in places like Canada and Australia, because these are suitable places for British-type people to live (I suspect the people who came up with this argument have never spent a winter in Ottawa. Still, there is some truth behind the old saying "warm winter, full graveyard"). The British set up bad institutions in places like India, because the climate was too harsh for British-type people, and the British were more interested in just stripping the place of resources. And these institutional legacies explain the latitude-growth relationship.

With negative degrees of freedom, perhaps there's a novelty/plausibility trade-off? That is, the quest for novelty and originality causes people to turn to ever more obscure explanations?

"anyone who can" - does that take us back again to the many meanings of can?

I think regression discontinuity fixes this problem. If you find a good discontinuity, you don't have the endogeneity problem and you don't need a "wacky explanatory variable".
Natural experiments are another good solution, using difference-in-differences (DD). But there are a bunch of people who don't like it...
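To illustrate, here's a toy sharp-RD estimate in Python on simulated data (the cutoff, effect size, and noise level are all invented for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated sharp RD: treatment switches on when the running
# variable x crosses a (made-up) cutoff of 50.
n, cutoff = 1000, 50.0
x = rng.uniform(0, 100, n)
treated = (x >= cutoff).astype(float)
y = 1.0 + 0.02 * x + 2.0 * treated + rng.normal(0, 1, n)  # true jump = 2

# Fit a linear trend on each side of the cutoff:
# y = a + b*(x - cutoff) + tau*treated + g*(x - cutoff)*treated
xc = x - cutoff
X = np.column_stack([np.ones(n), xc, treated, xc * treated])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(f"estimated jump at the cutoff: {beta[2]:.2f}")  # roughly 2
```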

I personally do see the point of being more careful with the empirical strategy. Just to make sure you get the sign of the coefficient right, at least...

IV papers are getting less popular (except the ones based on natural experiments). It is very tough to convince people that your IV variable is not correlated with the error term...
The paper you mention on institutions got criticized for this reason, and for some possible measurement errors...

Some people would suggest that the solution is to put more structure on the problem and write structural papers... I don't think this is the solution to all problems... A lot of people doing this kind of research seem to forget the goal of economics... which is to answer good economic questions. They sometimes get caught up in their model...

Being a labor economist is tough these days. Some colleagues argue you don't work on difficult stuff... I usually reply that I answer good questions... At least top journals still publish reduced-form empirical papers, so they should not disappear...


John: "If you find a good discontinuity..."

That's a big if! What do you do if you can't find a good discontinuity - use a bad one, or give up and go home?

You're right about being careful with empirical strategies. E.g., right now there's a big push towards increasing financial literacy, because there's a strong correlation between higher levels of financial literacy and higher levels of savings. But correlation is not causation, and I'm skeptical.

"The inverse u-shaped genetic diversity/economic growth results are driven by two regions: Africa with high degree of genetic diversity and low economic growth, and pre-Columbian America with low genetic diversity and low growth. In each case, I suspect there are more plausible explanations of economic outcomes than genetic diversity."

I am a firm believer that these types of hypotheses are only remotely worthwhile if you had an a priori hypothesis and then were able to test it multiple times and in multiple ways. But seeing a U-shaped curve and then trying to explain it is dangerous (at best) and likely foolhardy.

Instrumental variables have been showing up in epidemiology and I have been quite concerned by them. Unlike traditional regressions, where the goal is to assess issues like selection bias, quality of measures, and confounding, IV papers simply tell a story as to why a variable is an instrument (typically without proof). Sometimes the results of an IV analysis are compatible with randomized experiments, but not always. And when there is no experiment around, how do you know for sure???

@Joseph: I would argue exactly the opposite holds: if I just run a traditional regression, my results may be plagued with issues such as selection bias, measurement error, and confounding, but I have no statistical way of getting at any of these issues. If I use IV, I can characterize and control for selectivity, remove confounding, and possibly correct for poorly measured variables. I cannot do so for free, but "there's no free causality," or however that phrase goes. Under the assumptions I need to make to do IV, the IV estimates tell a much richer story than those from a standard regression. IV is also commonly used even when we have a proper randomized experiment, because proper randomized experiments usually have problems, too. If we had an idealized RCT, regression analysis of any sort would probably be of limited value.

Andrew Gelman's criticism doesn't hold water. In econometric jargon, he said, "I don't learn anything at all from the structural estimates, just show me the reduced forms." That is simply mistaken. Consider a concrete example: patients are assigned a treatment Z, actually take a treatment D, and have outcome Y. The IV estimate here tells me the causal effect of treatment on compliers, which is likely something in which I am keenly interested. I can also estimate how precise that estimate is, which is likely pretty darn useful too. Now, I may also want to report the reduced form Cov(D, Z), which tells me about how effective assignment is in altering treatment choices, and Cov(Y, Z), which tells me how assignment affects outcomes (and is called "intent to treat" analysis in the medical literature), but contrary to Andrew's claim I really do also want to know the ratio of those terms and conduct inference on that ratio, because what I'm ultimately interested in is the effect of D on Y, not the effect of Z on anything. If I have multiple endogenous RHS variables, Andrew's preferred strategy becomes unwieldy and largely uninformative.
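To make that concrete, here's a toy version of that ratio on simulated data (the compliance rate and effect sizes are all invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Z: random assignment; D: actual take-up (imperfect compliance);
# Y: outcome, with a heterogeneous effect of D on Y.
z = rng.integers(0, 2, n)
complier = rng.random(n) < 0.6                    # 60% comply with assignment
d = np.where(complier, z, rng.integers(0, 2, n))  # non-compliers choose at random
tau = np.where(complier, 3.0, 1.0)                # effect: 3 for compliers, 1 otherwise
y = tau * d + rng.normal(0, 1, n)

# The two reduced forms and their ratio (the Wald / IV estimate):
itt = np.cov(y, z)[0, 1]        # Cov(Y, Z): "intent to treat"
take_up = np.cov(d, z)[0, 1]    # Cov(D, Z): effect of assignment on take-up
print(f"IV estimate: {itt / take_up:.2f}")  # ~3: the complier effect, not the 2.2 average
```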

Getting back to Frances' post: I view the literature not as looking for more obscure explanations, but rather as looking carefully for the glimpses nature gives us into how she works. Both IV and RD estimate local effects which may not be what we really want, but "Better LATE than nothing" (http://www.nber.org/papers/w14896). See also the previous papers in that debate by Deaton and by Heckman for insightful critiques of IV.

Saw a paper presented while in grad school. Author used ground conductivity as an instrument for AM radio availability, which he then used to explain distribution of money during the Great Depression. Radio availability increased political efficacy, but the endogeneity problem would have been that richer places would have had stations first. He presented the usual econometric tests for instrument exogeneity. I asked why he didn't just check whether conductivity affected spending prior to radio. Paper hit the QJE with the cool tech tests but without the obvious plausibility test. Maybe the spending data didn't go back far enough to do it.

Depending on the study, weather may not really be exogenous!

Since the weather is somewhat predictable in advance, prices and behaviour may already incorporate information about the weather, and thus it is not truly exogenous.

Kevin - yes, true enough.

Eric - that's a perfect example of exactly what I had in mind.

Chris, yes IV techniques can solve a host of problems. But sometimes it's an 'assume we have a can-opener' type solution - without a good instrument, you're no further ahead.

It sounds to me like economists are trying to do something with regression analysis that it simply can't do. Regression, as a tool, can allow us to see to what extent a phenomenon generalizes, but it tells us very little about the causal mechanisms. Far too often scholars think of regression as "proving" a relationship, rather than "not rejecting" one. To me instruments suffer from the problem that they imply we can know that certain things are endogenous or exogenous before running a model. It also raises the question of how many plausible instruments were run, but discarded because they did not produce the appropriate p-value. Given confirmation bias, the tendency of the dataverse to begin in 1970 or 1980, and the whole litany of problems facing statistical analysis, it is probably best to think of a 95% confidence interval as a hoop good theories must jump through, not as a causal relationship.

I'm an outsider here, but it seems that case studies - particularly of the process tracing variety - are the road less traveled within economics (incidentally, I wouldn't say that about my own field). If, say, Barro finds that a British colonial legacy matters for economic growth, then the next order of business is to figure out what those linkages are, and see whether they hold empirically. Is this soft science? I'm not so sure - in fact in many ways it is precisely this that separates economics from the "real" natural sciences. Regression analysis of the Black Plague might tell us that either rats or cats (whose population tends to follow that of rats) were guilty of spreading the plague. It is deep observation that could settle the argument (e.g. following the path of the germs with a microscope).

Of course, I suspect this course of action is not likely to be pursued until the balance of prestige swings away from formal modeling and regression and toward case studies and perhaps experiments (for micro, neuro-economics seems promising in that regard).

Patterns of economic growth? I thought Jared Diamond explained all this already.

hosertohoosier: "I'm an outsider here, but it seems that case studies - particularly of the process tracing variety - are the road less traveled within economics (incidentally, I wouldn't say that about my own field)."

What's interesting about the gender-and-the-plough research that is getting people excited (and don't get me wrong, it's good work) is that the idea that gender roles in agricultural production had broader society-wide gender role implications has been around for years.

So that's an example where case study approaches identified the variable first. Indeed, going and reading the latest research in sociology or anthropology or political science or psychology and then stealing the ideas is definitely a sound research strategy.

Not sure about experiments, will probably write a post about that in a few days.

I thought we did multiple regression precisely because we didn't do randomised experiments.

We want to measure the effect of X on Y holding other things like Z constant. So we run a regression Y=a+bX+cZ+e. We hope that e is uncorrelated with X. We bung as much stuff as we can find into Z, so it doesn't show up in e. But we always fear we have left something out, and have omitted variable bias in our estimate of b.

In a randomised experiment, we toss a coin to determine X, then do a simple regression Y=a+bX. Since X was determined by toss of the coin, the toss of the coin ought to be uncorrelated with Z and e, so X ought to be uncorrelated with Z and e.
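Here's a toy simulation of exactly that logic (all the coefficients and the confounder Z are made up):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Z is a confounder we failed to measure; in observational data X is correlated with Z.
z = rng.normal(size=n)
x_obs = 0.8 * z + rng.normal(size=n)                      # X chosen, not randomized
y_obs = 1.0 + 2.0 * x_obs + 3.0 * z + rng.normal(size=n)  # true b = 2

# Omit Z from the regression: the estimate of b is biased upward.
b_biased = np.polyfit(x_obs, y_obs, 1)[0]

# Randomize X by "toss of the coin": X is uncorrelated with Z by construction.
x_rand = rng.normal(size=n)
y_rand = 1.0 + 2.0 * x_rand + 3.0 * z + rng.normal(size=n)
b_rand = np.polyfit(x_rand, y_rand, 1)[0]

print(f"observational, Z omitted: {b_biased:.2f}")  # well above 2
print(f"randomized X, Z omitted:  {b_rand:.2f}")    # close to 2
```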

I'm out of my depth, and maybe missing something.

Nick: this seems correct to me. People in the empirical profession are very difficult to convince that we have variables for everything in Z that is potentially correlated with e. I don't think it is possible to publish an OLS paper anymore, except maybe in political economy, where they just want to show the intuition of their model is correct...

Pure randomized experiments are hard to find, and including Z is often necessary. For example, a certain type of person selects (not at random) into a program whose treatment is randomized. That is why Angrist and others still suggest using IV for those papers...

I can't reconcile hosertohoosier's remarks with a modern understanding of the word "causal" (see Pearl, Heckman, Rubin, Cartwright, etc).

We do not need to understand everything on a causal path between A and B to claim that we have good evidence that A causes B, and if we were to try to maintain that claim a huge swathe of all of the social and natural sciences would have to be discarded. For example, the mechanisms through which smoking causes cancer are still not fully understood, but it would be ridiculous to claim that we cannot say we have evidence that smoking causes cancer.

As a straight econ example: we have very good evidence that changes in tobacco tax rates cause changes in smoking patterns, and I don't think it sensible to claim that we need to fully understand the neurophysiological mechanisms involved before we can conclude that taxes affect smoking. That said, the claim that economists are uninterested in studying mechanisms through which causal effects take place is also wrong.

Economists do study controlled experiments when possible, and note that the analysis of controlled experiments does often use regressions. Particularly in the behavioral and medical sciences even controlled experiments are often subject to serious problems (e.g., non-compliance) and the analysis is often very difficult. Even perfect RCTs are commonly evaluated using regression analysis, as conditioning on covariates can increase precision (for that matter, even plain differences in sample means can be interpreted as a regression estimate). The claim that regressions categorically cannot provide evidence on causation is simply wrong.

Statistical analysis has enormous advantages over case studies in most economic contexts. The insult that economists use statistics due to "prestige" rather than scientific merit is poorly aimed.

Generally, hosertohoosier, I think it would be appreciated if you turned the sneer down a few large turns of the dial.

@Chris: In my field, Epidemiology, we have a concern with whether an instrument really is an instrument. For example, researchers have claimed that physician drug preferences can be operationalized by the last prescription given (to a previous patient). Now, I have published on using this proposed instrument and it is a clever idea. But occasionally it seems to give . . . unexpected answers when compared with the RCT.

I merely worry that I don't have a good sense of how to demonstrate that a proposed instrument really meets the full set of required assumptions. It is quite plausible that economics has a better handle on this concern, as most of the really good IV papers I read came from econ journals.

@Joseph: Sure, econometricians are also keenly aware of the problems that arise when the IV assumptions are violated.

Particularly in models in which there are more excluded instruments than included RHS endogenous variables, there are a variety of tests available against the hypothesis that instruments are valid. However, these tests usually have low power, and are very problematic or just plain incoherent in the presence of heterogeneous causal effects that are correlated with unobserved determinants of the endogenous regressor. Sometimes "anti-tests" of instrument validity are more compelling, as per Eric Crampton's comment above.
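As a rough sketch of what such a test looks like, here's a hand-rolled Sargan-style overidentification statistic on simulated data (two instruments for one endogenous regressor; everything is invented, and the instruments are valid by construction, so the test should not reject):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n = 5_000

# One endogenous regressor d, two instruments (both valid here).
u = rng.normal(size=n)                                       # structural error
z = rng.normal(size=(n, 2))
d = z @ np.array([1.0, 0.5]) + 0.7 * u + rng.normal(size=n)  # d correlated with u
y = 2.0 * d + u                                              # true effect = 2

# 2SLS: project d onto the instruments, then regress y on the projection.
Z = np.column_stack([np.ones(n), z])
d_hat = Z @ np.linalg.lstsq(Z, d, rcond=None)[0]
beta = np.linalg.lstsq(np.column_stack([np.ones(n), d_hat]), y, rcond=None)[0]

# Sargan statistic: n * R^2 from regressing the 2SLS residuals (formed
# with the original d) on the instrument set; df = #instruments - #endogenous.
resid = y - np.column_stack([np.ones(n), d]) @ beta
fitted = Z @ np.linalg.lstsq(Z, resid, rcond=None)[0]
r2 = 1 - np.sum((resid - fitted) ** 2) / np.sum((resid - resid.mean()) ** 2)
print(f"2SLS estimate: {beta[1]:.2f}, Sargan p-value: {stats.chi2.sf(n * r2, df=1):.2f}")
```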

Some methods allow pretty good inference even when the instruments are problematic; given the title of this thread, this paper by Conley et al. is a good example: http://www.princeton.edu/~erp/erp%20seminar%20pdfs/papersspring09/Rossi%20paper%202.pdf

I don't think I understand your example. In what context would anyone propose that a physician's last prescription is a good instrument for anything? Do you mean the lagged prescription is being used to instrument the current prescription? In which case I would certainly agree with your concerns.

Note that one aspect of IV estimation which makes it problematic to compare directly to RCTs is that RCTs identify the average causal effect whereas IV identifies local causal effects. Thus, if the two methods come to differing conclusions, we cannot necessarily conclude the IV estimate is misleading. Since policy and scientific questions often hinge on local rather than average effects, this is actually often an advantage of IV over RCTs.

Sorry about the link, let me try again:

Conley, Hansen and Rossi (2008), "Plausibly Exogenous."

Chris, thanks for your comments. I'm not sure that I'd agree that policy and scientific questions often hinge on local rather than average effects however.

To insert a hyperlink, type <a href="http://whatever">text that you want to show</a>.

The angle brackets are typed with SHIFT-comma and SHIFT-period; they don't appear onscreen when using HTML. They're the ones that look like this symbol: ^ rotated.

Frances, thanks for the typepad tips.

Why do you think local effects are not often of interest? As economists we are commonly, perhaps usually, interested in evaluating effects on some margin, and marginal and average causal effects might differ substantially.

You can find formalized versions of such arguments in, e.g., Heckman and Vytlacil 2001.

Chris, what you asserted in your original comment was that policy and scientific questions often hinge on local rather than average effects.

Sometimes policy questions hinge on local effects. Sometimes policy questions hinge on average effects. Which is more common? Which are the more important policy questions? I don't know.

It doesn't have to be too complex. Don't underestimate the power of OLS.

From my experience, the fancy methods don't work well in many real world cases because the underlying data is noisy/unreliable. OLS works well in these cases because it averages everything out.

If I had to put up money based on my results, in the markets for example, I would always go with OLS.

@Chris, I'm not saying statistics should be discarded (I use them in my work and I'm generally all for defending against the qualitative barbarians at the gates). As you note, there are research questions for which the statistical evidence is so robust (and the policy implications sufficiently large) that there probably is something going on.

But that is not true of the majority of research questions, and particularly untrue of problems that require instrumental regression (or other fixes for simultaneity), which invites validity arguments. Case studies seem like they would be better suited to solving some of these issues than endless fights over the right instrument. What is more, they could give economists more robust theories. A lot of datasets in econ are limited in scope (usually inter-temporal scope), which effectively holds a number of things constant. An individual case may be more useful than a regression at understanding just what is being taken for granted.

For instance, think about the Phillips curve debate. You had a compelling, policy-relevant result that was strongly supported by the data. The problem is that it drew from a limited number of time points, during which inflation expectations of firms, individuals and unions were close to zero. A detailed study of German hyperinflation in the early 20s could probably have told the literature much more than the Phillips curve could with decades' worth of data - i.e. expectations matter during periods of persistent inflation. These kinds of blind spots have serious real world consequences, and are worth addressing with a greater attention to detail, as well as tools that test for general validity (insofar as they can given limitations of data, etc.).

@Sina, OLS, like all statistical methods, requires that certain things be true. When those assumptions are violated in a big way, it is not an appropriate tool. You're right that it is a bad idea to use a pneumatic hammer when a simple hammer will do, but it is also a bad idea to use that same hammer when the job requires a wrench. Test for violations of OLS, and if they exist, use the appropriate methods.
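For instance, a quick check for one common violation, heteroskedasticity, might look like this (a minimal sketch using statsmodels on simulated data):

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(4)
n = 1_000

x = rng.uniform(1, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, x)   # error variance grows with x

X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

# Breusch-Pagan test: a small p-value signals heteroskedasticity, in which
# case plain OLS standard errors are unreliable (robust SEs are one remedy).
lm_stat, lm_pvalue, _, _ = het_breuschpagan(fit.resid, X)
print(f"Breusch-Pagan p-value: {lm_pvalue:.4f}")
robust = fit.get_robustcov_results(cov_type="HC1")
print(robust.bse)   # heteroskedasticity-robust standard errors
```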

@ Chris: Yes, it is the lagged prescription that has been proposed as an instrument.

