I've been meaning to write this post for some time. Scott Sumner's post spurs me to write it now.
First let me ask my dumb econometrics question. It's a very simple question, and I really ought to know the answer. But I don't.
Q1. If you estimate a linear regression, using Least Squares or whatever, will the estimated residuals always sum to zero? If the answer is "yes", then skip the next question. [Update: Ben, Norman, and Matthew tell me in comments the answer is "yes", provided there is a constant term in the regression, and I want there to be a constant term.]
Q2. If the answer is "no", would the gods of econometrics be very upset if an econometrician re-estimated that linear regression subject to the constraint that the estimated residuals sum to zero? If the answer to this question is "yes, very upset!", then you should probably stop reading this post, and try to explain to me why they would be very upset.
Q3. Has anyone ever done the following? I should probably know the answer to this question too, but I don't do micro public finance (with one exception where Frances was with me).
Get a random sample of Canadian (or whatever) individuals (or households). For each individual, get data on market income Y, and on net taxes (taxes minus transfer payments) T. Estimate a linear regression of T on Y, subject to the constraint, if necessary (see Q1 above), that the estimated residuals sum to zero.
The intercept of that regression (which would presumably be negative) tells us the Guaranteed Annual Income/Negative Income Tax we can afford, given a linear income tax rate equal to the slope of that regression, under the assumption that behaviour does not change. We know we can afford that GAI/NIT, because the residuals sum to zero by construction. For every dollar of positive residual where the deficit would increase with a GAI and linear tax, there must be a dollar of negative residual where the deficit would fall. The estimated tax/transfer system would be revenue-neutral by construction, provided behaviour did not change.
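To make the revenue-neutrality claim concrete, here is a minimal sketch in Python on made-up data. The numbers, the noise, and the names `ols`, `gai` and `flat_rate` are all assumptions for illustration, not real Canadian data. It fits T on Y with a constant and compares what the fitted linear schedule collects with what the existing system collects, holding behaviour fixed.

```python
import random


def ols(x, y):
    """Simple OLS of y on x with a constant; returns (intercept, slope)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Made-up sample: market income Y and net taxes T under a messy existing system.
random.seed(0)
Y = [random.uniform(0, 150_000) for _ in range(1_000)]
T = [-8_000 + 0.35 * y + random.gauss(0, 4_000) for y in Y]

intercept, slope = ols(Y, T)
gai = -intercept       # guaranteed annual income paid to everyone
flat_rate = slope      # flat marginal tax rate on market income

current_revenue = sum(T)
flat_revenue = sum(intercept + slope * y for y in Y)

# The difference is the sum of the residuals: zero up to floating-point rounding.
residual_sum = current_revenue - flat_revenue
```

Whatever sample goes in, `residual_sum` should vanish up to rounding, which is the sense in which the fitted schedule raises exactly what the messy one does.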
Now, will behaviour change? Almost certainly yes. Those who face a higher Marginal (net) Tax Rate under the new system will presumably choose to earn less than before, and those who face a lower MTR will presumably choose to earn more. But unless the first group has a systematically higher elasticity of response than the second group, the existence of the Laffer Curve tells us the net effect should be revenue-positive. (Jensen's Inequality, right?)
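The Jensen's Inequality step can be spelled out under one strong simplifying assumption, made here purely for illustration: every individual's revenue contribution is the same function R of the marginal rate t_i they face, and R is concave over the relevant range (a concave Laffer curve). The symbols R, t_i and \bar{t} are assumed notation.

```latex
\bar{t} = \frac{1}{n}\sum_{i=1}^{n} t_i,
\qquad
R''(t) \le 0
\;\Longrightarrow\;
n\,R(\bar{t}) \;\ge\; \sum_{i=1}^{n} R(t_i).
```

With no behavioural response, R is linear in t and the two sides are equal, which is the revenue-neutral baseline. Concavity is what tips the balance toward revenue-positive. If elasticities differ systematically between the two groups, R is not common to everyone and the inequality need not hold.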
If I am right in the above, that simple linear regression should give us a conservative (i.e. lowball) estimate of the GAI we could afford to pay with a flat tax rate equal to the average MTR we currently have.
The key to understanding GAI/Negative Income Tax is to understand that we already have a GAI. We call it "welfare". It's just a rather messy GAI, with lots of special cases and lots of very peculiar MTRs that are sometimes very high and sometimes very low.
OLS minimizes sum of squared errors. Sum of residuals not going to be zero except for special cases. You can force them to be zero, sure, but I don't know what that fit would be called. It is not OLS anyway.
As for GAI, you're right that GAI is like welfare--except you'd be cutting welfare cheques to way more people. The reason why GAI is an absolute nonstarter is that you would be paying out big cheques way up the income distribution. With a 25% phaseout and a 10k transfer, people earning up to 40k would get a cheque.
I don't understand why I see people discussing some slight disincentive effects of the tax back rate when the whole scheme is a moon made of cheese. The sooner the US policy bloggers actually try to run the numbers on this, the sooner we can move to actual feasible and helpful policy options.
Saying GAI is nifty because you can have low tax back rate and high initial payments is no different than saying we should hand out unicorns. It doesn't exist because it can't exist.
Posted by: Kevin Milligan | September 27, 2014 at 09:23 PM
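As an aside on the phaseout arithmetic in the comment above: the income at which the cheque disappears is just the transfer divided by the phaseout rate. A minimal Python sketch, with hypothetical function names `break_even_income` and `cheque`:

```python
def break_even_income(transfer, phaseout_rate):
    """Income at which a transfer phased out at a constant rate reaches zero."""
    return transfer / phaseout_rate


def cheque(income, transfer, phaseout_rate):
    """Cheque received at a given income; it never goes below zero."""
    return max(0.0, transfer - phaseout_rate * income)


# Kevin's example: a 10k transfer phased out at 25%.
cutoff = break_even_income(10_000, 0.25)
cheque_at_39k = cheque(39_000, 10_000, 0.25)
```

Halving the phaseout rate doubles the cutoff, which is why low phaseout rates push cheques so far up the income distribution.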
Kevin: "Sum of residuals not going to be zero except for special cases."
Thanks! That answers my Q1. No.
"You can force them to be zero, sure, but I don't know what that fit would be called. It is not OLS anyway."
We could call it "constrained SR=0 OLS"?
"Saying GAI is nifty because you can have low tax back rate and high initial payments is no different than saying we should hand out unicorns."
Agreed. But I'm not talking about *low* tax back rate. I'm talking about a *constant* (net) tax (or tax back) rate across the income spectrum. Make everybody pay the same (marginal, net) tax rate, rather than have that tax rate fluctuate up and down with no rhyme or reason. Take exactly what we have now, and fit a straight line to it.
Posted by: Nick Rowe | September 27, 2014 at 09:55 PM
Kevin - you're talking about the sum of squared residuals. This is what OLS minimises, sure. Trivially this won't be zero unless you have zero errors. But I think Nick is talking about the sum of the residuals, not the sum of squared residuals.
In that case Nick, one of the implications of the OLS estimator is that the residuals sum to zero *if* you include a constant in your model. This comes directly from the normal equations.
I had begun to write out the proof of this, but it is a bit clumsy to put into a blog comment. I can add it if you like. But the intuition is clear - OLS is just fitting a line through some data. If you add a constant to your model this gives your line an arbitrary starting point, so the line will go directly through the middle of your data. With the constant, minimising the sum of squared errors is the same condition that sets the sum of the errors to zero.
That constant would be the constant you are interested in (I think).
Posted by: Ben J | September 27, 2014 at 10:23 PM
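The normal-equations argument in the comment above can be written out in a few lines, using T and Y from the post and assumed symbols a and b for the constant and the slope:

```latex
\begin{aligned}
S(a, b) &= \sum_{i=1}^{n} \left(T_i - a - b\,Y_i\right)^2 \\
\left.\frac{\partial S}{\partial a}\right|_{\hat{a},\hat{b}}
  &= -2 \sum_{i=1}^{n} \left(T_i - \hat{a} - \hat{b}\,Y_i\right) = 0
  \;\Longrightarrow\; \sum_{i=1}^{n} \hat{e}_i = 0 \\
\bar{T} &= \hat{a} + \hat{b}\,\bar{Y}
\end{aligned}
```

The first-order condition for the constant is exactly the statement that the residuals sum to zero. Dividing it by n gives the last line, so the fitted line passes through the sample means. Without a constant there is no such condition, and the sum is generally non-zero.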
If you include an intercept term, then *yes*, the sum of the errors of an OLS regression will always be zero.
If you *could* include an intercept term (no linear combination of regressors always adds up to one), but choose not to, then in general the sum of errors in a general OLS regression will *not* sum to zero.
This is straight out of William H. Greene's Econometric Analysis. I believe the proofs are somewhere in the first three or four chapters.
Posted by: Norman | September 27, 2014 at 10:30 PM
The alternative way of thinking about this is that linear regression (with a constant included) is always exactly true on the averages. Plug in average Y and average T and the estimated regression equation is true, with zero error. This is an important property--linear regression gives valid local average treatment effect estimators even if the linear model is false.
Posted by: Matthew | September 27, 2014 at 10:38 PM
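A small numerical check of the points made by Norman and Matthew, as a Python sketch on five made-up data points. The data and the function names `fit_with_constant` and `fit_through_origin` are assumptions for illustration.

```python
def fit_with_constant(x, y):
    """OLS of y on x with an intercept; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_through_origin(x, y):
    """OLS of y on x with no intercept; returns the slope."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


x = [10.0, 20.0, 30.0, 40.0, 50.0]
y = [-5.0, -1.0, 4.0, 6.0, 11.0]
mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)

# With a constant: residuals sum to zero and the line passes through the means.
a, b = fit_with_constant(x, y)
resid_const = [yi - (a + b * xi) for xi, yi in zip(x, y)]
fitted_at_means = a + b * mean_x

# Without a constant: nothing forces the residuals to sum to zero.
b0 = fit_through_origin(x, y)
resid_origin = [yi - b0 * xi for xi, yi in zip(x, y)]
```

Comparing `sum(resid_const)` with `sum(resid_origin)` shows the difference the constant makes, and `fitted_at_means` can be compared with `mean_y` to check Matthew's point about averages.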
Ben, Norman, Matthew: aha! Thanks! My gut was telling me the constant term would be set to make the sum of residuals zero, but I didn't trust my gut.
Posted by: Nick Rowe | September 27, 2014 at 10:44 PM
This looks exactly right to me, Nick, though I think you mean "flat" rather than "linear" tax rate in the second to last paragraph. As I remember it, you made this case (in more general terms) in some comments on one of Stephen's posts a few years back, but without explicit reference to the regression intercept and the point about Jensen's inequality. This is an excellent and insightful post, but I'm looking forward to seeing if anyone is going to get your point any more than last time.
Posted by: K | September 28, 2014 at 12:21 AM
K: thanks. Yep, I'm slowly developing my thoughts on this. Edited so "linear" now replaced with "flat".
Posted by: Nick Rowe | September 28, 2014 at 12:48 AM
Nick, I was actually looking at a different question. I was considering whether a GAI that is acceptable to American liberals (providing a decent living standard) is affordable. Under our current system some people get almost nothing, and others get far more than average. Thus if everyone got the same amount, then many currently poor people who get welfare would see their benefits slashed sharply. That doesn't mean it's a bad idea, but it's a non-starter politically. So I tried to consider a system where the guaranteed income would be large enough to be acceptable to the Democrats. (Recall that the proposal envisions eliminating all other welfare programs, which is a hard sell.)
AFAIK, Welfare recipients in the US can only receive benefits for 2 consecutive years, and a total of 5 years lifetime. Switching to a system where they get the same monthly benefit for 60 consecutive years, and where it would also go to currently non-eligible groups like able-bodied single men, is far more costly. So the monthly benefits of current recipients would drop sharply.
I would favor a revenue neutral switch from the current system to a GAI, but I'd prefer a wage subsidy program over either.
Posted by: Scott Sumner | September 28, 2014 at 11:42 AM
Scott: yep. The US system is different. Clinton changed it, right? I always wondered how that would work out, after the 5 years had passed. Seemed to me there was a time-consistency problem. "I will support you now, but if you keep on doing that for 5 years, I will stop supporting you, I promise"
Posted by: Nick Rowe | September 28, 2014 at 11:56 AM
Whoops on the econometrics--shouldn't attempt late at night!
On the policy: Scott, I understand why low MTRs are attractive, but the direct consequence of having low clawback rates is that cheques go out to families in the thick part of the income distribution, which balloons the cost astronomically.
You simply cannot do a balanced-budget replacement of the existing system with a GAI. The numbers don't come close to working. Choose any two of {balanced budget, low phaseout rate, nontrivial transfer amount}.
I am going to try to draw a picture of Nick's policy proposal--one problem I'm having is trying to understand how we are funding the rest of government if we are raising taxes just for this transfer.
Posted by: Kevin Milligan | September 28, 2014 at 03:57 PM
OK. I drew a picture. I think what Nick is saying is that the existing govt spending would still be funded but the GAI would be balance budget funded off the residuals. Is that right?
But here's what I don't get. My picture shows a cluster of high earners above the regression line. They would be getting a big tax cut as their MTR would drop a lot.
For people with zero earnings, they currently get big transfers. They would be way below the regression line. To move them up to the regression line you would have to drastically cut the transfer they receive.
So, unless I'm seeing this wrong, Nick's system would cut tax rates a whole bunch on the rich and fund that by cutting transfers to low income people. All so that we can have a flat MTR on income, which is an exogenous and hard to defend imposition to begin with. I don't get it at all.
High earners would face a big tax cut because they currently pay more than the fitted line. This would be funded by cutting drasticall
Posted by: Kevin Milligan | September 28, 2014 at 04:18 PM
(Whoops ignore last paragraph fragment)
Posted by: Kevin Milligan | September 28, 2014 at 04:19 PM
Kevin, Nick's point is a lot more general than that. If you don't like flat taxes, you can still fit a different tax structure, with some difficulty. But the general point stands - unneeded variability in MTRs can be smoothed out, leading to an overall improvement in policy. Real papers often do this kind of exercise by numerically computing the welfare-maximizing tax schedule for given assumptions about income distribution, marginal utility of post-tax income and labor supply elasticities (incentive effects). For reasonable assumptions, this usually leads to a basic income grant plus a U-shape for marginal tax rates, with rather high clawback rates tapering off to lower rates for the bulk of low-to-middle incomes, and a modestly progressive schedule for higher incomes.
Posted by: anon | September 28, 2014 at 07:01 PM
Kevin: "OK. I drew a picture. I think what Nick is saying is that the existing govt spending would still be funded but the GAI would be balance budget funded off the residuals. Is that right?"
Yes, I think that's right.
"My picture shows a cluster of high earners above the regresuon line. They would be getting a big tax cut as their MTR would drop a lot.
For people with zero earnings, they currently get big transfers. They would be way below the regression line."
If the high earners were above the regression line, and the low earners were below the regression line, that can't be. Unless the medium/high were below the regression line, and the medium/low were above the regression line. Otherwise the regression line would not be OLS, because a steeper line would fit better.
Posted by: Nick Rowe | September 28, 2014 at 07:20 PM
Hi anon. I recognize the tax system you speak of--I'm teaching optimal income taxation this morning! But that's pretty far from the GAI as normally presented, or Nick's version.
As part of the standard optimal tax scheme (with high tax back rates at low incomes--gasp!!) I understand the idea quite well.
Given that I understand it--specifically why MTRs for low earners are optimally high-ish--I don't understand the fascination with getting low MTRs for low-earners that seems to be a primary driver for GAI schemes. Sub 100% MTRs as a goal I understand. Sub 50% I don't.
I also now understand Scott Sumner's position better. He solves the GAI impossible trinity in a transparent way--he chooses low phaseout rate, balanced budget, and very low initial transfer. That fits his own preferences and that is great. I do wonder whether the median voter would be cool with chopping low-income transfers by, say, 2/3rds though.
Again, the primary fiscal impact of a GAI scheme is to pay out cash to a bunch of families in the low-mid part of the income distribution who are currently getting nothing. They get something under GAI because of low phaseout rates. With a transfer of $10K and a phaseout rate of 25%, a tax unit with $39K of income gets a cheque.
The reason why GAI-ers think it is so cool to increase other taxes by massive amounts to pay for their GAI scheme that hands out checks to middle income families is that they believe the low phaseout rate will have magical properties on low earners. But most evidence in this area (see Mirrlees report chapter by Saez and co-authors for example) emphasizes that low earners are most sensitive to AVERAGE tax rates; the participation margin not the intensive margin. That's why optimal tax schemes (like anon's above) feature high MTRs at low income levels. So, I see GAIers sending out big $$ to non-poverty families in order to achieve a benefit (low MTRs for low earners) that I think is next to non-existent.
Posted by: Kevin Milligan | September 29, 2014 at 11:52 AM
I should also add that I shouldn't be picking any kind of fight with Scott because his initial point is correct. That point being the vast unaffordability of the GAI scheme that US liberals typically push which assumes current transfers are maintained but just phased out more slowly. I think that point needs to be yelled from the policy rooftops.
GAIers have to start putting out numbers--and how they'd pay for their schemes.
Nick's scheme is self-financing I now understand, but again the Nick-GAI is a long way from the kind advocated by most GAI-ers.
Posted by: Kevin Milligan | September 29, 2014 at 11:57 AM
Just ran Nick's regression in the 2010 person SLID.
This is going to format like crap, but here it is direct from STATA
. use $rawdata\slidpr2010, replace
.
. gen nicktax = marktinc - atinc
.
. reg nicktax marktinc [aw=weight]
(sum of wgt is 2.7479e+07)
      Source |       SS       df       MS              Number of obs =   49787
-------------+------------------------------           F(  1, 49785) =       .
       Model |  1.1629e+13     1  1.1629e+13           Prob > F      =  0.0000
    Residual |  2.7535e+12 49785  55308393.2           R-squared     =  0.8086
-------------+------------------------------           Adj R-squared =  0.8085
       Total |  1.4383e+13 49786   288890211           Root MSE      =    7437
------------------------------------------------------------------------------
     nicktax |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
    marktinc |   .3279735   .0007153   458.54   0.000     .3265716    .3293754
       _cons |  -9774.974    40.8593  -239.24   0.000    -9855.059    -9694.89
------------------------------------------------------------------------------
. predict resid, residuals
. summ resid [aw=weight]
    Variable |     Obs      Weight        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------------------
       resid |   49787  27479219.3    .0000164   7436.887  -148718.7   261124.6
. summ resid [aw=weight] if marktinc>250000
    Variable |     Obs      Weight        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------------------
       resid |     205  149745.061    9952.089   40788.24  -148718.7   261124.6
. summ resid [aw=weight] if marktinc<10000
    Variable |     Obs      Weight        Mean  Std. Dev.        Min        Max
-------------+-----------------------------------------------------------------
       resid |   16311  9518628.76    1687.071    7729.03  -49631.39   111446.8
What all that says is that the Beta is 0.328 and the constant term is -9774.97.
For people with market income above 250K, there is an average tax cut of $9952.
For people with market income below 10K, there is an average tax cut (i.e. transfer increase) of $1687.
Posted by: Kevin Milligan | September 29, 2014 at 02:55 PM
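For readers without Stata: as far as I know, the point estimates from `reg ... [aw=weight]` are those of weighted least squares, which is why it is the weighted mean of the residuals that comes out at essentially zero in the output above. A minimal Python sketch on made-up data follows; `weighted_ols` and the toy numbers are assumptions, and nothing here touches the SLID file.

```python
def weighted_ols(x, y, w):
    """Weighted least squares of y on x with a constant; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Toy sample: market income, net tax, and survey weights (all invented).
income = [0.0, 20_000.0, 40_000.0, 80_000.0, 250_000.0]
net_tax = [-9_000.0, -2_000.0, 4_000.0, 15_000.0, 80_000.0]
weight = [300.0, 250.0, 200.0, 100.0, 5.0]

a, b = weighted_ols(income, net_tax, weight)
resid = [t - (a + b * y) for y, t in zip(income, net_tax)]

# The weighted mean is what the constant pins to zero; the raw sum is not.
weighted_mean_resid = sum(w * e for w, e in zip(weight, resid)) / sum(weight)
unweighted_sum_resid = sum(resid)
```

Revenue neutrality for the population therefore rests on the weighted residuals, since each sampled person stands in for `weight` people.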
...and here is the pattern of tax increases. (Negative means tax cut or transfer increase.) Each bin here is a $10,000 bin. 0 and under are grouped together. 200K+ are grouped together.
Tax cut at the top; transfer increase at the bottom. Paid for by tax increase in the middle.
Posted by: Kevin Milligan | September 29, 2014 at 03:06 PM
. gen bins = int(marktinc/10000)*10000
. replace bins = 0 if bins<0
(40 real changes made)
. replace bins = 200000 if bins>200000
(325 real changes made)
. gen taxincrease = resid*(-1)
. table bins [aw=weight], c(mean taxincrease)
--------------------------
     bins | mean(taxinc~e)
----------+---------------
        0 |      -1687.071
    10000 |       74.09914
    20000 |       1023.011
    30000 |       942.9042
    40000 |       1160.213
    50000 |       1243.446
    60000 |       1530.154
    70000 |       1416.816
    80000 |       1840.204
    90000 |       2156.299
   100000 |       1605.932
   110000 |        2274.23
   120000 |       1221.592
   130000 |        1688.97
   140000 |       260.9917
   150000 |       2795.322
   160000 |      -986.2052
   170000 |      -713.9641
   180000 |       2565.521
   190000 |      -4292.895
   200000 |      -5490.246
--------------------------
Posted by: Kevin Milligan | September 29, 2014 at 03:07 PM
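The binning in the Stata log above can be mirrored in a few lines of Python. This is a sketch on invented numbers, with hypothetical names `income_bin` and `weighted_mean_by_bin`; it reproduces the logic of the `gen bins` and `table` steps, not the SLID results.

```python
from collections import defaultdict


def income_bin(market_income, width=10_000, top=200_000):
    """Floor of the income bin: zero and under grouped at 0, `top` and over grouped at `top`."""
    # int() truncates toward zero, which matches Stata's int().
    b = int(market_income / width) * width
    return min(max(b, 0), top)


def weighted_mean_by_bin(incomes, values, weights):
    """Weighted mean of `values` within each income bin, keyed by bin floor."""
    num = defaultdict(float)
    den = defaultdict(float)
    for y, v, w in zip(incomes, values, weights):
        b = income_bin(y)
        num[b] += w * v
        den[b] += w
    return {b: num[b] / den[b] for b in sorted(num)}


# Invented example: tax increase is minus the regression residual.
incomes = [-5_000, 4_000, 12_000, 19_999, 250_000]
tax_increase = [-1_000.0, -2_000.0, 500.0, 1_500.0, -6_000.0]
weights = [1.0, 3.0, 2.0, 2.0, 1.0]

by_bin = weighted_mean_by_bin(incomes, tax_increase, weights)
```

Because the truncation is toward zero, small negative incomes land in the 0 bin even before the clamp, so the 0 bin covers everyone with market income under $10,000.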
Kevin: Great! Thanks! And the results are more sensible than I expected! (I was afraid you would get a positive intercept, or something weird.)
So that means, **assuming no change in behaviour**, a GAI of around $10k per year per person, and a 33% flat income tax rate, is revenue neutral. I am surprised the tax rate is that low. I thought it would be nearer 40%-50%. And I thought the GAI would be a little higher. But those are two offsetting mistakes on my part.
Is your sample individuals, rather than households, and all adults, or does it include kids? What does SLID mean?
I assume the taxes do not include GST/PST/HST, and that transfers do not include transfers in kind.
Thanks for running that!
Posted by: Nick Rowe | September 29, 2014 at 04:48 PM
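Kevin's two coefficients define a complete schedule, so it is easy to see what any given market income would pay or receive. A minimal Python sketch follows; the constants are copied from the regression output above, while the names `net_tax` and `break_even` are assumptions for illustration.

```python
# Point estimates from the regression output in the thread.
INTERCEPT = -9774.974   # dollars; minus this is the GAI
SLOPE = 0.3279735       # flat marginal net tax rate


def net_tax(market_income):
    """Net tax (taxes minus transfers) under the fitted linear schedule."""
    return INTERCEPT + SLOPE * market_income


# Market income at which taxes paid exactly offset the GAI received.
break_even = -INTERCEPT / SLOPE

at_zero = net_tax(0)
at_100k = net_tax(100_000)
```

On these estimates the break-even market income comes out a little under $30,000. That figure holds only under the same no-change-in-behaviour assumption as the regression itself.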
Actually, my biggest surprise, on seeing Kevin's results: I hadn't realised I was such a raving lefty. I could live with a $10k/33% GAI/tax rate. That sounds not unreasonable. But I think I would be prepared to pay a little bit more, to be a little bit more generous to the indigent, and raise both numbers a little bit. Noblesse oblige.
Posted by: Nick Rowe | September 29, 2014 at 05:10 PM
Hi Nick. This was individual data from the Survey of Labour and Income Dynamics, 2010. Age coverage is from 16 to 80.
Market income includes income from earnings, capital income, and RPPs. After tax income includes all refundable tax credits, less income taxes (but not HST/GST/sales taxes).
We all have our own preferences, but I'd be surprised if the median voter would like a scheme that raises her taxes by a couple of thousand in order to fund a $5490 tax cut to those earning 200K+ and an increase in the transfer to people with no market income of +$1687.
I was surprised at how linear the scatter plot was--I expected much more of a downward hook at very low earnings levels. But that's why we look at the data, right?
Posted by: Kevin Milligan | September 29, 2014 at 05:21 PM
Kevin: "But that's why we look at the data, right?"
Yep! I was expecting a more S-shaped curve. Initially high MTR, then low, then high again.
And yes, the median voter probably wouldn't go for it. But again, the data is telling me something (and something I ought to have predicted but didn't): the median voter model has got some explanatory power.
Posted by: Nick Rowe | September 29, 2014 at 05:36 PM
Kevin: "Age coverage is from 16 to 80."
Kids being excluded is probably the biggest problem with this estimate. Because child benefits are presumably being counted as transfers to the parent. One individual could scrape by on $10,000 per year, but one adult + 5 kids couldn't.
But still, it gives us some sort of ballpark for the numbers.
Posted by: Nick Rowe | September 29, 2014 at 05:44 PM
Hi Nick,
It doesn't change substantially with the people under age 21 removed. I ran it in the Census Family file and got a similar tax rate, with an intercept term of around $15K.
Posted by: Kevin Milligan | September 29, 2014 at 07:18 PM
Kevin: so that's $15k per family? OK, sounds sort of reasonable, since a lot of those families will be single people, so that allows a number a lot bigger than $15k for families with kids.
Posted by: Nick Rowe | September 29, 2014 at 07:53 PM
> I hadn't realised I was such a raving lefty. I could live with a $10k/33% GAI/tax rate.
I hadn't realized I was such a right-wing loonie. I thought that a $10k GAI would be an idealistic, barely-achievable goal, itself requiring a greatly increased tax rate.
$10k/person is something of a magic number, as well. The low-income cutoff for a single person is around $15k, and at that point a household is supposed to be spending 2/3 of its income on food, shelter, and clothing. $10k in theory provides enough for the basic necessities of life.
Posted by: Majromax | September 30, 2014 at 10:09 AM
Majro: Yep. Amazing what we learn about ourselves, from the data!
Hmm. StatsCan tells me that average *household* size in Canada is 2.5. Divide Kevin's $15k per *family* by 2.5, and we get $6k per person (adults and kids treated the same) GAI, which is a bit too low.
But I wonder if "family" and "household" are the same?
Posted by: Nick Rowe | September 30, 2014 at 10:28 AM
> But I wonder if "family" and "household" are the same?
It works out if the average family/household has 1.5 adults and 1 kid.
Kevin's re-run with those under 21 removed doesn't change the fact that the data can't account for children who are *not* in the labour force survey; it can only cover those aged 16+ who were surveyed. (That is, a 5-year-old is invisible to the regression, but the child tax credit will still show up.)
Posted by: Majromax | September 30, 2014 at 03:44 PM