How to Avoid Lying With Statistics (with Jeremy Weber)

Mar 4 2024

There's often a gap between the textbook treatment of statistics and the cookbook treatment--how to cook up the numbers when you're in the kitchen of the real world. Jeremy Weber of the University of Pittsburgh and the author of Statistics for Public Policy hopes his book can close that gap. He talks to EconTalk host Russ Roberts about how to use numbers thoughtfully and honestly.

Time	Podcast Episode Highlights
0:37	Intro. [Recording date: February 1st, 2024.] Russ Roberts: Today is February 1st, 2024, and my guest is economist and author Jeremy Weber. He is the author of Statistics for Public Policy: A Practical Guide to Being Mostly Right or At Least Respectably Wrong, which is the topic of our conversation. Jeremy, welcome to EconTalk. Jeremy Weber: Thanks so much for having me. It's a privilege.
1:00	Russ Roberts: How did you come to write this book? Jeremy Weber: The book was in development in my head for probably more than a decade. It began after I spent four years working in the Federal Government, in a Federal statistical agency, the Economic Research Service. And, that was a great place to be as a recent econ Ph.D. grad. And, it was a mix of more academic research, very policy-oriented research, and generating real official federal government statistics, interacting with policy people. Then I went into academia to teach statistics to policy students. And, the book I was using, the course that I inherited, very quickly I had the feeling I was more or less wasting students' time, or at the very least there were huge gaps such that when they left my class, they weren't going to be prepared to use any of this to help anyone in a practical setting. And, from that point on, I started to accumulate notes on things that, if I were to write a book, I would want to include and things that I was now using to complement the statistics textbook to give my students more. And then, in 2019, I spent a year and a half at the Council of Economic Advisers [CEA] and that was like an accelerator for this whole idea. Because being engrossed in that environment gave me many examples, many ideas. And then when I came back to the University of Pittsburgh and had a sabbatical, I said I've got to write this. Russ Roberts: What is its purpose and who is the audience? Jeremy Weber: Yeah. I'll start with the audience. The audience is broad, because frankly, whether it's your first statistics class or your fifth, many of the issues are the same and neither the intro nor the advanced tends to do some things well.
In particular, the communication of statistics to a non-academic audience, the integration of context and purpose of the moment or of the organization or of the audience into what you're presenting--its significance for the situation at hand--we tend to not do that well, I think, at the undergraduate level. Or for Ph.D.s who are in their fifth year of econometrics. So it's--the audience is broad.
3:50	Russ Roberts: So, it's a very short book. There are a couple of equations, but they're there as--kind of like illustrations. And, what is spectacular about the book I would say--and I would recommend it to non-technical readers--what is very powerful and well done about the book is giving the reader who is not an econometrics grad student, a very clear basic understanding of terms that you've heard all the time out in the world from journalists and occasionally a website you might visit that highlights academic research. So, you'll learn what a standard error is, you'll learn what a confidence interval is. But, it's not a statistics textbook in that sense. However, those--that jargon--and other concepts that are used widely in statistics are very intimidating, I think, for non-academics. And your book does an excellent job of making them accessible. And then, of course it goes well beyond that. You're trying to give people the flavor of how to use these concepts, use data that's produced in all kinds of ranges of applications, calculation of means and correlations up through regression results that are more sophisticated statistical analysis. You're going to give people insights in how to use them thoughtfully. And, as you point out, no one teaches you how to do that in graduate school or in undergraduate if you take statistics. They're taught more as, I would say, a cooking class. You learn to add certain ingredients together. If you want to make a cake, you need flour and you need eggs and you need this and a certain amount of heat. Whether it's going to be a good cake or not is a different question. Whether that cake belongs to a certain kind of meal or a different meal, those are the things that practitioners learn if they're lucky. But, you're not taught those things. And certainly people who don't go to graduate school or don't take a number of statistics classes in college will never, ever have any idea about it. So, I just want to recommend the book.
If those kind of ideas appeal to you, you'll enjoy this book and it will be useful to you. Is that a fair assessment? Jeremy Weber: That's a very fair assessment. You use the cooking example. I allude to kind of a vocational example in the book, where our statistics education, I would say teaches--it shows you: Here is the saw. And: Here are the parts of the saw. And, maybe we even, like, start it. And then, we put it down and we move on to another tool. Or, maybe we work with 10 different types of souped-up chainsaws, really sophisticated chainsaws. But, we're just like, these are again, the features and parts of the chainsaw. Actually going out and cutting down trees, like, do that--we don't do that. That is--we don't do that. We know people do that, but we're not doing that. And, that's a bit of the gap I'm trying to fill.
7:07	Russ Roberts: And, the more standard metaphor you also use is the hammer. And, we may come back to this, but of course the standard, the cliché'd condemnation of mindless statistical education is: Once you have a hammer, everything looks like a nail. And, it's really fun to run regressions and do statistical analyses once you understand how basic statistical packages work, without wondering whether it's a good idea, what's the implication of the analysis, how reliable is it, and does it answer questions as opposed to just provide ammunition for various armies in the policy battle? And I think for me, that's one of my concerns. We'll come back and talk to it later I hope in terms of how we should think about the education in the practice of statistics. But, it's such a fun tool. It's a lot more fun than a hammer. It is more like a chainsaw. It's noisy and attracts attention and people like to cut down trees. So, there is a certain danger to it that your book highlights--in a very polite way--but, I think there's a danger to it. You can respond to that. Jeremy Weber: Yeah. It is fun until it's not. And, when it's not is when you are using this regression tool and you've maybe used it with the academic crowd; and that was fun. But then, you go to another crowd--the City Council crowd or some sort of more non-academic crowd--and you present it; and suddenly it's not fun because nobody knows what you're talking about and the conversation quickly moves on and you feel, like, out of place. Fish out of water. You've miscommunicated. People are confused. And now they're ignoring you. Russ Roberts: But of course, the flip side also occurs, right? The scientist in the white coat. And, in this case it's the economist or policy analyst armed with Greek letters in their appendix. At least in their paper if not their physical one. And, there's an awe of these kinds of people: 'And, obviously they're smarter than I am and obviously they're experts. 
Maybe I'm overly pessimistic here.' A lot of times I feel like in those settings outside of academic life, there's a lot of trust in the reliability of numbers produced with what I would call standard practice. And, once you follow the rules of standard practice--which means statistical significance, confidence intervals and so on, and you frame your work with those footnotes, then you're credible. And just simply because you're in the arena and you've been trained accordingly, you're a bit of a shaman. And, I think that's a little bit dangerous. As is the opposite: 'Well, they're obviously wrong. They are a bunch of academic eggheads and they don't know what they're talking about.' So, I think there's an interesting challenge there, I think, when we go out into the world. Jeremy Weber: Yeah. You're right. In certain environments there's that deference, that credibility conferred because of the mathiness, because of the training, the aura. I agree: That is a case that does happen in certain environments.
10:60	Russ Roberts: Now, I argued in a recent episode that statistical analysis is used more for weaponry than truth-seeking in the political process. And, I think it was misunderstood by some listeners. I think it's very useful to politicians and policy players to have data and numbers. But I don't think they're so interested in the truth, and I wonder how your book would be perceived by them. Jeremy Weber: Yeah. I agree with your assessment. Primarily weaponry, especially in the D.C. [Washington, D.C.] area. But, if the weapons being picked up are actually real, understood measurements that accurately reflect an issue--they don't reflect the full scope. They're being used selectively. But if there's good measurements out there and there are competing parties fighting, it means the party is going to pick up the most effective weapon that most appeals to the audience out there. And so, if there are, in a way, better weapons out there that can be picked up, I think you have a greater tendency to some major problems being avoided or opportunities pursued. And I'll give you a concrete example. When I was in the White House, the Commerce Department was petitioned by some uranium mining producers for protection. They didn't want imported uranium into the United States. The Commerce Department conducted an investigation, did a Report on the issue. They did their own--they did a survey. They presented some statistics in this Report that went to the President recommending restrictions on imports. Okay? You know. So, Commerce Department, they've got their weapon. All right? And, CEA got involved-- Russ Roberts: The Council of Economic Advisers-- Jeremy Weber: That's right. Council of Economic Advisers got involved. I grabbed some other data. I did some analysis. I generated, you could say, another weapon that I thought was actually a better depiction or reflection of the economic reality and what was likely to happen under the Commerce [Department of Commerce] proposal.
All right. So, we got together Commerce, other agencies in the room, and in a way we had our battle. We picked up our weapons. I think we--at the end of the day--we ended up at a better place because I was able to pick up a weapon and there was this back-and-forth with the data. So, but, had CEA not been there, nobody or those reports that I relied on from the Energy Information Administration had that not been there, everybody would have bowed down to Commerce and they would have rolled right through, and the President would have said, we've got to import or restrict uranium imports so we can prop up these several producers out in Utah--at the expense of the nuclear power industry and electricity consumers. Russ Roberts: Yeah. That's a great example. In theory, the Council of Economic Advisers--and I think it to some extent plays its role as best as I can understand it--is more of a technocratic fact-checker in some dimension of advocacy by other agencies and in theory is somewhat unbiased. In this case I assume the argument was that this was going to create a lot of jobs in Utah. Was it Utah? Jeremy Weber: Yeah. There's several places where uranium mining occurs. Utah is one of them. That's where the companies--at least one of the companies that was filing the petition was located. You had the argument--there was a national security argument. There was a whole resiliency of the uranium supply chain argument. There was jobs argument, too. That's right. Russ Roberts: If I can ask, what was the key finding that you felt was at least somewhat decisive in derailing a strong impulse toward restrictions on imports? Jeremy Weber: The key finding was: Commerce Department had said at $55 a pound, we think the domestic uranium sector is going to produce the amount that we're going to require domestic nuclear power producers to purchase from them. So, this is not going to increase domestic prices for uranium much. 
$55 is just a little bit above what the market price was at the time. My analysis said: Unlikely. The price on the domestic market will be a lot higher. For the uranium sector in the United States to produce six million pounds of uranium--which is what the requirement was going to be, the buy-American requirement, so to speak--you are going to need a dramatically higher price. And, that price is either going to get passed on to consumers--especially in places with a lot of nuclear-generated electricity--or, the nuclear producers were going to eat it and it was going to push some of them over the edge, particularly in places like Michigan and Pennsylvania where there's--and Ohio. Important states. And, so my basic argument was: this restriction is going to cause prices to increase a lot. And the key statistic--I did some more sophisticated stuff in the background, but I didn't bring that in as the Game Plan A in the meeting. I brought in two figures--two graphs--that simply showed, look, in recent history, the price of uranium has been way above your price for several years and the domestic sector didn't come anywhere close to producing what you're saying they would produce now at a much lower price. So, this simple descriptive statistic I think convinced everyone in the room except Commerce that there's no way the domestic price is going to be $55 and six million pounds are going to come out of the ground. So, in the Decision Memo to the President, our estimate, CEA estimate of what the price is going to do, what it's going to do to electricity prices and electricity consumers was in there. And, I don't know what proportion of influence that had, but people who were familiar with the matter said it was really important that that point was made. Russ Roberts: For economics majors out there, this was a debate about the elasticity of supply. A phrase that I don't know if it's been uttered more than a couple of times in the history of this program.
Meaning how responsive is production to changes in price? And, if the answer is not very much, then you're going to need a much larger price to make the market work effectively; and the demand for uranium once foreign supplies are unavailable is going to push the domestic price up much higher than $55. And, that's very nice. Now of course, as you point out--you had more sophisticated stuff in the appendix. But, the fundamental--often the facts can be persuasive or at least provocative enough to make people reconsider a position.
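[Illustration, not part of the conversation.] The elasticity-of-supply point can be made concrete with a small sketch. It assumes a constant-elasticity supply curve, which is a textbook simplification and not necessarily the model used in the CEA analysis. The function names and every number below are hypothetical and chosen only to show the arithmetic.

```python
# Illustrative only: every number below is hypothetical, not from the CEA analysis.

def arc_elasticity(p1, q1, p2, q2):
    """Midpoint (arc) elasticity of supply between two price/quantity points."""
    pct_change_q = (q2 - q1) / ((q1 + q2) / 2.0)
    pct_change_p = (p2 - p1) / ((p1 + p2) / 2.0)
    return pct_change_q / pct_change_p


def price_needed(base_price, base_qty, target_qty, elasticity):
    """Price at which a constant-elasticity supply curve,
    Q = base_qty * (P / base_price) ** elasticity, reaches target_qty."""
    if elasticity <= 0:
        raise ValueError("elasticity must be positive")
    return base_price * (target_qty / base_qty) ** (1.0 / elasticity)


if __name__ == "__main__":
    base_price = 50.0   # hypothetical dollars per pound
    base_qty = 2.0      # hypothetical current output, millions of pounds
    target_qty = 6.0    # hypothetical required output, millions of pounds
    for e in (0.5, 1.0, 2.0):
        p = price_needed(base_price, base_qty, target_qty, e)
        print(f"elasticity {e}: price needed is about {p:.0f} per pound")
```

With these made-up inputs, tripling output requires a price of about 450 when the elasticity is 0.5, about 150 when it is 1.0, and about 87 when it is 2.0. The less responsive supply is, the further the price has to rise to call forth the same quantity, which is the shape of the argument described above.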
18:46	Russ Roberts: What are your thoughts on our profession generally and our ability to establish something like a truth on the basis of statistical analysis? So, for example, what's the effect of the minimum wage on employment among, say, low-skilled labor? The profession used to believe the answer was minimum wage is very bad for low-skilled labor. It would cause a lot of jobs to go away. In recent years, there've been a lot of thoughtful people who've made the opposite argument: its effects are either small or zero. There's been pushback against that by other people saying actually that's wrong: In the short run it might be true; in the long run, it's big; or, you didn't fully measure it correctly. And, if I said to an economist: 'What's the effect of a, say, 25% increase or 15% increase in the minimum wage?' it would depend on who you asked. And, that's weird. If you ask a physicist what the effect of gravity is, they don't argue about it. There's a consensus. We don't really have those kind of consences--I don't know if that's the right word--consensuses in economics, it seems to me. Do you agree? Jeremy Weber: I agree. And, I think the key difference is that, as long as we're in the earth realm, gravity is pretty contextless. It's not context-dependent. Social settings are so varied, and so the situations in which we estimate these relationships are oftentimes conditioned by the moment in history, the place. And, I'll give you a concrete example. My subject area of expertise is energy and environment. I've done work on fossil fuel extraction, effects on communities. A big question that the literature was considering several years ago was if you have fracking--if you have natural gas drilling in an area--what happens to property values? Somebody looked at that question in Pennsylvania and found, well, for many homes it will be a negative effect. People are not going to want to live near this, particularly homes dependent on groundwater. Okay?
That's Pennsylvania. I looked at Texas. I found housing prices go up quite a bit in the vicinity where the fracking took off. Well, the reason for the difference was--or primary reason--in Texas you tax natural gas wells as property. So, when you drill a well, the full value of that well enters the tax base. That's like we just built a bunch of million dollar McMansions and now those people are paying taxes on those houses. That's going to the local government. That's going to the schools. It turns out in Texas then they lowered the property tax rate and so people's tax bills were going down, the school was getting more revenue, and property values generally went up in the area. That's a very different finding than in Pennsylvania where they don't tax. There's no revenue generation for the school, no reduction in property taxes. Same basic phenomena of fracking, fundamentally different effects on this outcome because the context--the policy environment--was so different. And, I think that's just one illustration of how--are we raising the minimum wage in an area where the market wage is already pretty high and we're just going to basically move it close to the market wage? Or, not? So, I think part of the reason why it's hard to come up with a consensus is because context matters; and that matters certainly for policy. That consideration of context is something that I emphasize in the book so much.
22:40	Russ Roberts: Of course, you would like to think that the fundamental market forces are the same. They may be, in, say, the case of minimum wage. And, people might disagree about what those are. That was, again, not so true I think in the past--say, 50 years ago--but is much more true, say, in the last 15 or 20 years. But, I do think there is a feeling among younger economists and I would--Jeremy, I put you in that group relative to me. Just looking at you, I would say-- Jeremy Weber: I appreciate it-- Russ Roberts: No problem. I think there's a consensus--not a consensus--there's a flavor of recently-trained economist who says: 'I don't look at theory, like what theory says about what the minimum wage impact should be or is likely to be. I just look at the data. I just read the output from my statistical package, and I look for the truth, and whatever it says, that's our best understanding of how the minimum wage affects low-skilled labor at this point in the areas I looked at it.' And, I find that an untenable view, but I think I'm in the minority in the modern world. Is that true? Jeremy Weber: No. I agree with your assessment that there is a tendency, culture shifting or it has shifted, where we just want to go right to the data and not do the heavy thinking beforehand about setting things up, in a way: What are we trying to answer? What is the general theory that we're trying to test? And, we're just going into the data too quickly. And, one of my recommendations in the regression chapter is: never run a regression without a clear purpose for doing so. It is so easy to be led in strange places just by kind of meandering through the data. And, you know, we all know we're not supposed to look for certain results, but that is so easy to do. You start getting a hunch: 'Oh, this would be a great story if it works out this certain way.'
And, lo and behold, then you start looking for that story and you're like, 'Oh, it doesn't work quite right here, but what if I subset the data this way?' and suddenly the story emerges and then at the end of the day you're, like, 'Well, I can sell this story.' Like, 'This is coherent enough.' But is it a manufactured story? And, I think that does happen more often than it probably should. Russ Roberts: I remember hearing from George Stigler, who was a professor of mine at the University of Chicago, that in his day there were no statistical packages. I don't even think they had punch cards and computer analysis. They had fancy calculators. The kind of calculators that were used, say, in the Manhattan Project. And, they would have a handful of variables--because they didn't have the amount of data we have now. This is like, say, the 1940s and 1950s. And, he said you would decide one or two things you'd run a regression on and it would take a long time and lots of calculations and then you'd find out what that answer was. And, because you were only going to do a couple, you thought very long and hard about what belonged in the analysis and what didn't. And when you were done, that's what you found. And, if you didn't like it, you had to then decide what you were going to do with that. And, the answer wasn't: Well, I'll run another 30 or 40 until I find something more amenable to my preexisting views. But, I think the real difference in the modern world, we not only have--you can run a regression in a fraction of a second--we have immense amounts of data. And, because we have so many different variables and different ways of manipulating them, you do have to have some kind of theory as to how you're going to do that wandering through the data you're talking about. And, in particular, otherwise--I'm going to say it differently. 
Even though we have lots of variables and lots of data points, we're not close to having all the data on everything that's relevant for the decision. We don't have data on people's moods. We don't have data on their childhoods, and how they were raised, their genetics. It's so many variables obviously that could be important. And so, we pretend we have all the data. And then it's just a question of throwing out some of it that we don't think is relevant. But, we always have in the back of our mind this haunting ghost that--that almost by definition in a social science perspective, you can't have all the data. But that's unpleasant. That's no fun. I want to be in the sandbox. So I pretend I've got enough. And, I think that's the danger of theory-free exploration, because you don't have enough and you're prone to your own biases--confirmation bias and other things. Jeremy Weber: There is that danger of maybe we constructed a narrative that it's just somewhat disconnected from reality because it's only found through torturing the data. I think there's another danger, and it's similar to--there's this book called The Shallows: how the internet is making us think more shallow,-- Russ Roberts: Nicholas Carr-- Jeremy Weber: superficial. Yeah. I think there's a parallel in the data or the statistics world. I was recently talking with somebody involved in a data science program and they brought in employers of their graduates and said, 'What's your assessment of the skills and how we're equipping our students?' And, they said, 'Look, we love their ability to manipulate data, calculate stuff. They can't [?] tell us the meaning of these things they're calculating for our purpose, for our organization. So it's like, it's like: Data, data everywhere, but understanding is nowhere. And, that's very easy to do when at a click, you've got oodles of data, oodles of regressions you can run. You're not slowing down and doing what Stigler had to do. 
And, that is, like, a think-twice, or measure-twice, cut-once sort of thing. You just run right into it. And, that's what I stress for my students. I said, 'Before you touch the data, stop. Think.' This is something I'm trying to get across in the book. It's just: Slow down and think about what you're trying to accomplish. What is the problem? And then, with that clarity of understanding, you go and you learn something from the data. Then step back. Get away from the data for a minute and reflect for a few days on what you have calculated. All right? And go forth this way so that you have a deeper meaning. There's greater understanding being created by what's being calculated rather than 'I calculated a bunch of stuff and I'm reporting measurements.' And, I think people will find it interesting.
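[Illustration, not part of the conversation.] The worry about running "another 30 or 40" regressions, or re-cutting the data until a story emerges, can be sketched with a small simulation. It assumes each attempt is an independent comparison on pure noise, with no real effect anywhere. That is a simplification, since real subsets of one dataset overlap. The function names, sample sizes, and the |t| > 2 cutoff are illustrative choices, not anything taken from the book.

```python
import random
import statistics


def chance_of_false_positive(n_tries, alpha=0.05):
    """Probability that at least one of n_tries independent tests on pure
    noise clears the significance threshold alpha."""
    return 1.0 - (1.0 - alpha) ** n_tries


def welch_t(a, b):
    """Welch t-statistic for the difference in means of two samples."""
    var_a, var_b = statistics.variance(a), statistics.variance(b)
    std_err = (var_a / len(a) + var_b / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / std_err


def count_spurious_subsets(n_subsets=40, n_per_group=50, seed=0):
    """Count how many noise-only comparisons look 'significant' (|t| > 2).

    Each subset is fresh random noise, so every hit is a false positive.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_subsets):
        treated = [rng.gauss(0, 1) for _ in range(n_per_group)]
        control = [rng.gauss(0, 1) for _ in range(n_per_group)]
        if abs(welch_t(treated, control)) > 2.0:
            hits += 1
    return hits


if __name__ == "__main__":
    print(f"Chance of at least one false positive in 40 tries: "
          f"{chance_of_false_positive(40):.2f}")
    print(f"Spurious 'findings' in one simulated search: "
          f"{count_spurious_subsets()}")
```

Under these assumptions, 40 independent tries at the 5% level give roughly an 87% chance of at least one "significant" result even when nothing is there. That is one way to see why having a clear purpose before touching the data matters.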
30:23	Russ Roberts: Let's go back to the chainsaw analogy. I think it's useful. Also makes me think of the current President of Argentina who liked to campaign--I don't know why it became his campaign image. He may be the most sophisticated economic thinker in office at a high level in the world right now. He's a pretty good economist, at least in terms of explication in the video clips I've seen of him. But, it's fun to use a chainsaw--in theory. I've never used one, by the way; but I can see the appeal of it. So, imagine if I were going to have a course in how to use a chainsaw and I said: So, here's where the gasoline goes. Make sure you close the cap well and we'll practice that. And, here's how you turn it on. In the old days you pulled a cord and it was this exciting causal connection, and a loud noise results. And, here's how to make sure you don't cut off your fingers, because it's a dangerous weapon. Or yourself. But, over there are some trees. Have at it. And, that would be a weird course, because the students would say, 'Well, I don't really know how to cut down the trees thoughtfully or carefully. And I'm worried that if I cut them down the wrong way they will fall on innocent bystanders in houses.' And, it's a strange thing that, that's the response of the employers of students who study data science. It's like: 'Wow, they're really good at.' Meaning what? They're really good at turning on the saw and they're really good at refilling the gasoline and refueling it. But, they're not really good at making sure that houses don't get crushed. Something is wrong with this picture. Jeremy Weber: Exactly. In fact, I'm experiencing this just now. I'm running a capstone class here at the Graduate School of Public and International Affairs, part of our Master's program. We serve a client--City of Pittsburgh. It's about our deer overpopulation issue.
And, I have a student who--great student--we sent her to look at the data on deer incidents, deer-related police reports on a neighboring municipality that had implemented a management program. She's getting the data, she's looking at it; and she stopped and she says, 'I think maybe somebody else should take over for me because I'm not sure I'm the best person to learn from this data, calculate this stuff.' And I said, 'Well, what classes have you taken?' Well, she's taken two graduate level statistics courses. She's looking at the chainsaw, she's seeing the tree, and she's, like, 'I'm going to step away.' Russ Roberts: Good for her. Jeremy Weber: I'm like, 'You are exactly the right person to be doing what you're doing. Let's do this together. Let's start here. Let's take one step at a time.' So, it's going to be great for her, it's going to be great for us. But, I'm somewhat appalled that she's taken two--a year's worth--of statistical education, postgraduate, and she doesn't have the confidence to pick up the chainsaw. And I somewhat don't blame her. I understand it. It is intimidating to walk into the room. It's intimidating to take up that chainsaw and start laying it into that massive oak. And, the equivalent in reality is then stepping into the room with a City Council person and reporting these numbers that then the staff are going to look into, maybe a journalist looks into; maybe somebody gets the data and shows that you did something stupid and you end up looking like a fool. And so, the students are averse to doing that.
34:19	Russ Roberts: I think there's two things going on here, though, at the same time. Theoretical programs in law, or business--sometimes students and outsiders, employers will complain: Well, they taught you legal theory but they don't teach you how to be a lawyer. Because to be a lawyer you have to learn how to read the client. You have to understand when you need to push back against the client's demands. You need to understand how to read a jury. You need to know how to negotiate in a settlement question. And, law schools don't do that. Same for business schools. Business schools teach theory of finance, theory of marketing, and so on. But, when push comes to shove, only life gives you the education you need in the trenches and in the real world. And, I think what's going on here--and your students are a special case. But, I think most of the time the things you're taught in graduate school in statistics or data science are really everything you do need to know about how to use the chainsaw. It's that you don't know how to use it thoughtfully. And, that's an entirely different thing than you don't know how to use it in the real world. You can use it in the real world really well. You can put up a big set of tables and charts and appendices, but there's no thoughtfulness to it. So, I think--when I taught in a Master's Program at George Mason, I taught a class called How to Think About Numbers. And, that was for me the things that students weren't getting in a cookbook econometrics or a statistics class. But in general, those classes aren't taught. And, the kind of things you teach in your book are not taught. And, my question is: Why do you think that is? Why do you think the world wants somebody--because I really think that the employer often wants somebody who is really good at using the chainsaw and they don't care where the trees fall. They just want a really sophisticated user of the saw. And it's kind of strange that is the way it seems to me that the world works.
Jeremy Weber: I think that you're absolutely right. I think the reason they're not taught--it's not taught--that sort of more careful thinking about the numbers and those practical issues--is: who is teaching--who teaches statistics courses in university, undergrad or grad? They are academics. Okay. How do those academics use their statistical skills? They use it in research articles to academic audiences. They are then attuned to what is the editor and the reviewers--what are they going to go harp on? What will be the bars that I have to pass? What are they going to scrutinize? All right. That is what they're doing, nine to five. Then when they go to teach, they are teaching students to do what they do, by and large. Okay? And, the unfortunate thing is: It is as if they are teaching students, 'This is how you speak to and relate to this tribe.' Okay? This Swahili tribe. Well, then the students go out and the majority of them are not going to be writing academic articles dealing with reviewers. They're going to be using these for employers, businesses, or nonprofits, or city council. And that's a different tribe. And then we're surprised, 'Oh, they're speaking Swahili to the Germans here and there's miscommunication or not understanding that's being conveyed.' And so, the emphasis is off because the people teaching are accustomed to speaking to a different tribe. So, they're spending all of this time on three different ways to refine your standard error calculation because the reviewers are going to ask about that. They're spending all this time on refinements to identification because they know they're going to get nailed on that. And so, they convey that to their students. And they're not spending the time on, just: Are the data good? Like: What do your variables mean? A one-unit increase in X--like, what is that? Is that big? Is that large? Is that small? And, don't tell me statistical significance or not.
Like, I want to know: is a 0.5 increase in that thing, should I care? We're not teaching that for the most part. Russ Roberts: I'm the president of a college. We're small. We're hoping to add an economic/public policy major in the coming years. And, if we do that while I'm here, it will emphasize the challenges and limitations of the chainsaw--of statistical analyses alongside the hammer/chainsaw part, which is you got to have that if you're going to enter the battle. If you're going to be in the arena, you better understand how your opponent's weapons work. If you're going to claim they're dangerous or they don't work well or they're inaccurate, you need to understand how to use those weapons and then explain the limitations. And, part of me says--the romantic part of me--says: Well, this will be good for the country, for the state of Israel because there will be a set of bright, articulate people with a grounding in philosophy, in addition, but also who understand the limitations of statistical analysis. The question is, is anybody going to hire those people? This is a different way to look at this question of this mismatch. Does anybody want someone who is going to always remind them that this finding which they want to wave around and print on a big banner might not be true? Jeremy Weber: I'm going to say a guarded, optimistic yes. Russ Roberts: Good. Jeremy Weber: When I teach policy analysis--so I teach these capstone classes that are more client-oriented--I use the analogy of a lawyer and a client. And I say, 'Look: we are lawyers. We are like in a lawyer-client relationship. Clients are not well served by lawyers who just cherry-pick things in their communication with the client.' Now, there are two different levels of interaction here. There's going before the judge and the jury, where the lawyer is not going to present damaging things to that audience. 
But, I would think that the client wants and is best served by a lawyer who in the private confidence of the lawyer-client relationship is shooting straight. Fully understands weaknesses of legal theories, the strength and limitations of the evidence, and implications for the client's case. And so then the client, knowing that--okay, it's not the client that needs to be convinced, it's whoever the client is turning around and serving or speaking to, that needs to be equipped well. So, I do think clients want lawyers who are not fly-by-night known for inventing things, cherry-picking things. They want people who will shoot it straight with them so they can then turn around and make better decisions that are going to be more persuasive, more bulletproof to the audience they're working with. The Chairman of the Council of Economic Advisers would not be well-served by me being a data lackey and just saying, 'Oh, Kevin, I think you wanted this number. It sounds good. Go run with it.' Because Kevin [Kevin Hassett, CEA Chairman, 2017-2019--Econlib Ed.] is going to turn around and go to a meeting with thoughtful, sometimes aggressive people who are going to find holes in it and potentially make Kevin look very bad. So, Kevin actually wants somebody who is a straight shooter and not a data lackey. Russ Roberts: I love that.
42:50	Russ Roberts: Let's take an example from the book which I very much enjoyed and hadn't thought of this way. I thought it was really great. You talk about the well-known idea--hard to remember, surprisingly hard to remember--that correlation is not causation. It's well known but remarkable how many times people either forget it or want to forget it. So, it's a great point. But you make a deeper point, and I think it is quite profound and very, very rarely thought of, which is: You really should be thinking of correlation and the magnitude of causation. So, there could be a correlation. Sometimes it is causation. But, that's not the only thing we care about. In fact, almost always we care about the magnitude of the impact, not just that they're correlated. And, the reason that matters--you say that, 'Oh yeah, sure, sure, sure, that makes sense.' But the point you make, which is fantastic, is that in the real world, the world we live in as opposed to the textbook, there's more than one thing changing at the same time. So, past data that we look at to examine relationships is of course affected by more than the thing I'm looking at that I'm calling the causal factor. So, you give the example of your storm drain. Why don't you share that and generalize it to other issues? Jeremy Weber: Yeah. And, I use this example in my classes as well. A few years ago, my drain in the back of my house was overwhelmed in a storm and it flooded my basement. This happened twice. And, my neighbor learned about it and she quickly said, 'Well, this is climate change. Clearly. We're having more intense storms,' which suggests that the problem is the quantity of water hitting the drain. And, I didn't think too deeply about it. I kind of wanted to get rid of some asphalt anyway. 
And so, accepting this premise that the storm had been more severe and that's why it flooded recently and not in prior years, I went then and rented a concrete saw and dug up a bunch of asphalt and replaced it with grass so the water could percolate down and the drain would drain a smaller area. Then the drain flooded again. And clearly, water falling and my drain flooding, these are causally related. Intuitively there's a connection. But, what was the main reason why my drain wasn't able to handle this water? It turns out it wasn't because the storms were more severe than they had been in the past. And, it took a conversation with a plumber visiting my neighbors. He said, 'Look, you don't understand how your drain works. It's draining in this other direction. There's a kink here. If you go into your garage, you're going to find an access point.' I pulled that out and there's some mud clogging it, clogging the drain. I pulled that out. It's never had any issues since then. We've had a tremendous storm since then. The main cause was the clog in the drain. And, tying it in with climate change and the storm intensity, that was a distraction. Yes. It might be true. I didn't go and look at the data. It might be true that those storms were in fact a bit more severe. And, it might be true that that was driven by rising greenhouse gas emissions. Might be true. All right? But, we could have solved climate change completely, and my drain would still have been overwhelmed at the next storm. It was a secondary/tertiary issue. The primary issue--the primary causal factor--was the clog in the drain. And without understanding that, I was just going to be throwing money and effort at a tertiary issue that wasn't going to solve the problem. Russ Roberts: And, that's just so common in policy arguments. Of course [?]. 
Correlation--not causation--is the English version of a more pretentious Latin phrase, which I always loved, post hoc ergo propter hoc--'after this, therefore, because of this.' And, your point is that: Yeah. After this sometimes is because of this. What happened after is because of this thing that happened before. But, eight other things happened along the way. And, the fundamental question isn't whether this one affects that one, but by how much relative to the others. And so many policy debates are about--again, going back to maybe the first EconTalk episode with Don Cox. I think we talked about this, what he calls the 'dreaded third thing.' You have two variables, one affects the other. There is that third thing that--it's actually on a later episode with Don. Or some essay he wrote. It's not the first one--that was on parenting. But, the dreaded third thing is that the world is complicated. There's actually more than three. There's the dreaded third, fourth, and fifth thing. And, the fundamental question is if you want to affect the variable that you're caring about, is it the one you're focused on or is it the third, fourth, and fifth one that have the bigger bang for the buck? Statistics can help you answer that, but you do have to keep it in mind and to look for it. Jeremy Weber: And, unfortunately, our statistics culture with the emphasis on statistical significance is usually focused on that question: is there any causal relationship at all? Is the coefficient zero or not? And, as my chapter--I think one of the most important chapters in the book is: Know large from small and explain the difference. Is that we're so used to using statistical significance as a crutch for saying: Is this important? Does it matter? And, the reality is, like you said: Ten things are probably causally related to this outcome we care about, but for policy purposes, we obviously want to prioritize. 
We're not going to make much progress on the problem if we're focused on this fourth-order issue that yes, is causally related, but the magnitude is so small. There's an issue here with political speech that's really tricky that I just want to point out--I faced it a lot in reviewing speeches by White House officials--where you have two or three things presented together as equivalent contributors or causal factors. One or two of which might've been Administration-driven, and maybe they don't even mention other factors. And, the reality is: yes, all these things are causally related, but--I'll give you the concrete example. The rising oil production in 2018 and 2019 was primarily driven by higher energy prices. Did deregulatory efforts help? It certainly didn't hurt. And, it probably--intuitively it would be causally related. But, if you speak of those two factors in the same sentence, you're going to communicate to the audience that they're equally responsible for this rise in U.S. energy independence and so on, when they're not. 95% of it was just producers who are responding to price. Russ Roberts: Yeah.
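A small illustration of the distinction drawn above between "is the coefficient zero or not?" and "is the effect large or small?" This is only a sketch and is not taken from the book or the conversation: the data are simulated, and the sample size, the true slope of 0.02, and the helper name ols_slope are all assumptions chosen for illustration. With enough observations, a very small effect can clear the usual significance threshold easily while still moving the outcome by only a few hundredths of a standard deviation.

```python
import math
import random
import statistics


def ols_slope(x, y):
    """Return (slope, standard_error) for a least-squares fit of y on x with an intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual sum of squares, used for the conventional (homoskedastic) standard error.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    standard_error = math.sqrt(rss / (n - 2) / sxx)
    return slope, standard_error


# Simulated, hypothetical data: a real but tiny causal effect plus a lot of noise.
rng = random.Random(0)
n = 200_000
true_slope = 0.02  # assumed for illustration only
x = [rng.gauss(0, 1) for _ in range(n)]
y = [true_slope * xi + rng.gauss(0, 1) for xi in x]

slope, standard_error = ols_slope(x, y)
t_stat = slope / standard_error

# Magnitude: how many standard deviations of y does a one-SD change in x buy?
standardized_effect = slope * statistics.pstdev(x) / statistics.pstdev(y)

print(f"slope = {slope:.4f}, standard error = {standard_error:.4f}, t = {t_stat:.1f}")
print(f"a one-SD change in x moves y by about {standardized_effect:.3f} SD")
```

The t-statistic answers only whether the slope is distinguishable from zero. The standardized effect is closer to the policy question of whether a change in X is worth caring about.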
50:21	Russ Roberts: The other example that I think about a lot is--and of course as you say, political speech, a lot of times things get emphasized not because of their magnitude, but because of their salience in the minds of voters and others. One I think about a lot, a friend of mine talks about--he may be listening to this episode. But, he will argue that trade with China is the source of many of our cultural malaise--much of our cultural malaise, many of our cultural problems. I'm not convinced of that. In fact, I'm pretty sure it's not true. Whether it's true or not, it has a very strong political impact when people hear that. The reality, I think, is that there are many, many things going on, many at the same time. It's hard to know whether those things are independent of each other. Some could be caused by the economic challenges that certain parts of the country face in response to trade with China. But, certainly using China as a source of fear--Chinese trade--is very powerful and very effective. Whether it's true or not is much, much harder to establish. And, in particular, it could be true but the magnitude is quite small relative to the other factors. But, as a politician, often that will be irrelevant. It'll be invoked simply because it's effective. Jeremy Weber: This is very true, and this happens. I think this is a good moment to make a point that I make upfront in the book. And, it speaks to some of your pessimism around data and our ability to untangle things and so on. There are two camps of people, I find. Generally those people who are data enthusiasts: We're going to be able to solve the issues, identify the priorities, resolve disputes if we just let the data speak. We look at the evidence, we do evidence-based policymaking, this is it. And then there's another group that says: There are 'lies, damned lies, and statistics.' Like: It's a tool for manipulation--as you say, it's just weaponry to shoot at people. 
It's worse than not helpful. It's distracting, misleading, and so on. And, I speak to both of them. And, an important point I make to the 'you can say anything with data'-crowd is: Statistical claims are with us always. We cannot help but make statements--claims--about what is common, what is general, what is causing X versus Y. We will make those statements. It's better if we tether them to actual observation. Because, we're going to make them: The politician is going to make them. The nonprofit leader is going to make these statements about what's generally the case. And statistics at least helps constrain us somewhat. But, those claims are going to be made anyway. And if there's a culture there, a habit of good use of statistics, it's at least a constraining power on specious claims about what is generally the case, what is rare, what is common, and so on. Russ Roberts: Yeah. I really like that part of the book. And I just want to say that if you had to pick a religion for me where one religion is 'Data analysis reveals the truth' and the other religion is 'Lies, damned lies, and statistics,' I would be in the latter. I would be in the 'lies, damned lies, and statistics'-group. And, a lot of people then conclude: Oh, obviously I'm not a scientist. I'm irrational. I believe in going with your gut. The only reason I push the 'lies, damned lies, statistics'-scripture is because most of the religious enthusiasm is at the other end. I'm actually in a third camp. I'm in the camp that worships the idea that it's complicated. And that reality is beloved by neither the 'statistics reveals the truth'- nor the 'lies, damned lies, and statistics'-group. So, in reality, I actually am in this third group. But I think I am often misunderstood as being in the second. But I do think, just for the record, there's nothing worse than anecdotes. They're dangerous--as are statistics misused. So, it's complicated. Jeremy Weber: Yeah. 
And, I think I probably--I mean, I think I'm where you are, Russ, generally. And by speaking to the, 'Let's just get in the data-driven car; it's going to tell us where to go'-people, and the, you know, 'statistics are just manipulation,'--by speaking to both of them and the excesses or limitations of the extremes, I think the result is you would end up somewhere in the middle. Okay. We can't stop trying to tether our claims to summaries of observations. Not just what your brother's cousin said--and that's your one data point and then you extrapolate. But rather, what many people have said or what's been measured in many places or moments.
56:01	Russ Roberts: There's a lengthy discussion in the book relative to what I would have expected on fact-checking. And in particular based on your experience in the Trump White House. Describe how thorough that was, and why that was, and how you felt about it. Jeremy Weber: So, one of the first things I learned in arriving at CEA was they have a fact-checking process. And, the Chairman at the time, Kevin Hassett, was insistent on before anything reached him, it should be fact-checked. Before anything--certainly anything that left CEA--needed to be fact-checked. And by that, what was meant was the original author of the Memo, the facts in it would have to pass it off to somebody who was not involved in it, and they go through all of the factual claims and verify them. And, sometimes that was very simple. That was just: 'Okay, here's a number, here's the source. Did you actually copy the right number and the meaning of it? Was it described correctly?' Or, it can be more complicated. It can go into spreadsheets and calculations. Our junior economists, who did a lot of the fact-checking, they would have to go step-by-step all the way back to the beginning of the calculation and verify, you know, when you said you multiplied X by Y, you were actually multiplying them, and so on. And then, once that fact-checking--usually then there would be queries or questions raised in the fact-checking. 'Oh, it didn't seem--this thing didn't make sense, what you did,' or, 'The report spoke about it a little different way. Are you sure this label is right?' And, those queries had to be resolved before the 'Not Fact-checked' label could be removed from the Memo or the PowerPoint. And then, once those queries were fully resolved, then it could go on to the Chairman. And, this took a lot of effort. It slowed things down. It required a lot of hours of staff time. But, I think it made a lot of sense, because CEA's currency in the building was its credibility. 
And, it's very easy in the White House to become irrelevant. Like, people are vying for influence and access. And, just because you're doing good work doesn't mean anybody's going to pay any attention to it. So, what CEA needed to do is maintain and bolster that credibility as a straight shooter, as somebody who gets the numbers right. And, if we were sloppy with that and word got out and we couldn't really trust them, nobody would read all the stuff that we produced anyway. And so, there was a focus on: We've got to get it right. We've got to preserve that CEA brand, so to speak. And, that was a great learning experience for me. In fact, it is easy to not be nearly as thorough in academic work because--I mean, the reality is the cost of being wrong is likely very low. If I give the Chairman a bad number and he goes in and shares with the press--it might be tweeted the next moment. Or, Kevin says on national news, and then the fact-checkers are going after it. The consequences of making a silly mistake are high. Much higher generally than in academia. And so, it really raised my appreciation for going slow, making sure that what you've done is defensible. Is, as I say in the book, right. Meaning: it's defensible to reasonable people. It doesn't mean it's perfectly predicting the future, it's perfectly getting the numbers right. Future data might reveal that our estimates were a bit off. But, given the information at the time, what was done is defensible to a reasonable person--a statistically savvy journalist.
1:00:16	Russ Roberts: I wonder if that's Kevin Hassett's pet peeve or whether that was standard operating procedure at the Council and part of the culture. Certainly you and I have friends who are careless with facts. I don't have any like that. They wouldn't be my friends. But, people will quote sports statistics, the actor in a particular movie, the year something happened. And, most people say those things with authority and confidence. And, years ago on this program, I said something about how if you then say, 'Are you sure?' to those people, they immediately back down. Immediately. They're incredibly confident, but if you say, 'Are you sure?' they immediately have to concede that they're not a hundred percent. But, some do. Some say, 'Oh, of course I am sure.' It still means they're sometimes wrong. So, it's a fascinating thing how, as you say, in academic life, big bottom line conclusions with dramatic implications will get checked. People will challenge and look into things. But, it's amazing how many things just get passed by. And, if you ever publish something in a magazine or a newspaper that has a serious fact-checking arm, they're asking you a question; and it's, like, 'What do you mean, how do I know that's true? Of course it's true. I would never say something wrong.' And, of course, sometimes you're wrong because your memory fools you. And sometimes you're careless. And sometimes you're dishonest. So, it's a very interesting thing to be in that intense an environment, I suspect. And, it is definitely different in most other areas. Jeremy Weber: It is. But, it's something I've taken with me and really appreciated. And, being wrong, being pushed on something--in academic settings we're often pushed in certain ways and in policy settings pushed in other ways. I found myself in CEA being pushed not so much in the complicated statistical techniques, but on a more basic understanding of the numbers: You said X, you said this certain number. What does that mean? 
I'll give you an example. Not from the White House, but from this deer capstone. A big part of the deer issue is deer/cars and cars hitting deer and people having accidents. The Pennsylvania Department of Transportation has a dataset and a subset that are deer-related. And, I had a student compile that, subtotal it by year. What is the level--just the extent of this issue--in the city of Pittsburgh? And, they report out a number. And then, I asked, 'What is that? And, how does it get into the PennDOT [Pennsylvania Department of Transportation] data? Does somebody have to be killed? Does the car have to be totaled? Is it picking up far less severe cases?' Well, it turns out in this case, you only get in the data if it were serious enough that the police showed up and filed a report. And, in 96% of the cases, the car had to be towed away. So, imagine somebody just coming in, running with the data, totaling it, and going and reporting there's 25 deer/vehicle incidents in Pittsburgh in a given year. And then, being asked, 'Well, what's a deer-related incident?' and not having a good answer. Or worse, being confronted with this: 'Wait. Animal Damage Control reports that they picked up 600 deer carcasses last year.' And then, you've got two problems. You've got, One: You've created confusion with your statistics. Your statistics were supposed to add clarity. Now people are confused. Is it 600? Is it 25? And then, the second problem is: If you don't have an answer to reconcile the two, you've got a credibility problem. Here you are coming in as the data guy, the data person, and now we can't believe anything out of your mouth. And, the error, again, isn't--it's not that you calculated the total wrong. It's not that you manipulated the data. You just weren't thinking much about the data and the number you actually calculated. And, that would happen in CEA. In fact, very early on, I met with Casey Mulligan, one of your prior guests. And, Casey is such a careful guy. 
And so, he was a bit of a quality control check on people. And, I gave him a memo with some numbers in it. And, I remember him asking me about one of the numbers: 'Is this one time or is this a flow?' It was a dollar value. And, it was embarrassing: I didn't have a good answer. It's $25 billion. It's $25 billion. One year? Every year? Basic question. I wasn't well-prepared to answer it. And, that was just a reflection of Casey being a careful, thoughtful guy and also aware of the audience. The other thing that he did that was helpful: There were several numbers there, and there was a GDP [Gross Domestic Product] output number, and there was another number that was more of a welfare number. And, he's like, 'Let's just stick with the GDP number. The people we're talking to, they get output, they get production. They don't understand opportunity costs. Let's not distract them with this welfare number that they're not going to know how to reconcile the two numbers.' And, that was a thoughtful incorporation of: Who are we speaking to? What are they going to understand? So, it was both an interpretation of the number, knowledge of the audience that Casey was teaching me to be sensitive to, and I've taken that forward and now helping students interact with the city about deer. Russ Roberts: That's awesome.
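The reconciliation problem in the deer example can be sketched in a few lines. The records below are invented purely for illustration (they are not PennDOT or Animal Damage Control data), and the class and field names are hypothetical. The point is only that one set of underlying events yields very different totals depending on the rule that decides which events enter each dataset, so a reported count is incomplete until that rule is stated alongside it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeerCollision:
    police_report_filed: bool  # the event would appear in a crash database
    carcass_collected: bool    # the event would appear in a carcass-pickup log


# Hypothetical events: most collisions never generate a police report.
events = (
    [DeerCollision(police_report_filed=True, carcass_collected=True)] * 2
    + [DeerCollision(police_report_filed=True, carcass_collected=False)] * 1
    + [DeerCollision(police_report_filed=False, carcass_collected=True)] * 9
    + [DeerCollision(police_report_filed=False, carcass_collected=False)] * 4
)


def count_where(records, rule):
    """Count the records that satisfy an inclusion rule."""
    return sum(1 for record in records if rule(record))


reported = count_where(events, lambda e: e.police_report_filed)
carcasses = count_where(events, lambda e: e.carcass_collected)
total = len(events)

print(f"police-reported collisions: {reported}")
print(f"carcasses collected:        {carcasses}")
print(f"all collisions:             {total}")
```

Neither total is wrong in this toy setup. They measure different things, and saying which inclusion rule produced a number is what lets an analyst answer "What's a deer-related incident?" when asked.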
1:06:28	Russ Roberts: I want to close with a question about civic education, and understanding, and the political process. Alexander Pope said: "A little learning is a dangerous thing." And, it has become popular in recent years to require high school students to learn statistics. You know: Statistics are everywhere. It should be part of basic education. My guess is most of those classes are dangerous. A little learning is a dangerous thing. They get sort of a cookbook course, maybe. They don't get a very thoughtful class. And, they're taught by people who don't understand the complexity of randomness, the complexity of data, causation, multivariate issues. And your book is an attempt to improve that. Your book is an attempt to help thoughtful people get a better understanding of complexity, of measuring things, and how those measurements should be interpreted. It's interesting to me how little of that there is in the world. Part of that is because it's really hard to write a book like this. Most people who understand the concepts are unable to explain the concepts and therefore they can't do it well. And, part of it is because I'm not sure how much demand there is. Most people are more comfortable just being told what the truth is. They don't really want to look and see whether the support and evidence for it is reliable or not. But, as thinking human beings, it seems to me that if we want to be civilized and educated, the book that you've written and books like it that are yet to be written should play an important role in being a fully developed person. Because, the world is a complicated place; and statistics are one way to access that complexity. React to that. Jeremy Weber: Well, I appreciate the comment about the book. That is the hope. 
I talked about students and advanced students as being the audience, but really the audience is broader, and I would take it really anyone who has some basic statistical education and wants to think more carefully about statistics. So, it does have that broader audience in mind, and it does have that aim of thoughtfulness and the broader good that that can bring about. As far as: is it a bad idea to teach statistics to high school students or at some--teach it poorly, I mean, anything taught poorly can be problematic because it's kind of worse than not having been taught it, because you go out thinking you know it, and so you are in a sense inoculated to the real thing. You've heard the prosperity gospel, and then you confuse it with the true gospel. And so, every time you use the word 'gospel,' you are thinking a certain thing and it's the exact opposite of the true thing. So, it's worse than being a blank slate. That said, my experience in teaching introductory students--I mean, these are Master's students, but they're nonprofit-focused. They're policy-focused. They're not coming in because they're generally stat-focused people. They're taking this class because they have to. They're intimidated. I have found that students, I think, generally do want to get it right. There is a side that's, like: 'Just give me the numbers to--I already know how the world works and what the policy should be, and now I just want the numbers to back it up.' But, I've also seen students get really excited about learning from statistics. That is, being surprised by them and having the confidence, having thought a little more deeply about the statistics that: Hey, this can't be dismissed. We have to deal with it. We understand the number, where it comes from, and we have to deal with it. And, maybe they change their view or they come to appreciate: 'Oh, going back to the data, tethering our broader claims to observation about the world is a good, confidence-building exercise.' 
And so, I do have a side of me that's hopeful that good statistical education can bring about better insight. More prioritization of main causes from tertiary causes. A greater understanding of what's a real problem and what's a problem that just is because people are talking about it, not because it's a real problem. But, I'm fully aware of the challenge. And, things can go wrong very easily. So, I appreciate your skepticism. I share much of it, but I think there's no other way. You got to get in the statistics game and you got to do it better because somebody's going to do it, and they might as well do it well and thoughtfully. Russ Roberts: My guest today has been Jeremy Weber of the University of Pittsburgh. He is the author of Statistics for Public Policy. Jeremy, thanks for being part of EconTalk. Jeremy Weber: Thanks, Russ. It's been a pleasure.

Time

Podcast Episode Highlights

0:37

Intro. [Recording date: February 1st, 2024.]

Russ Roberts: Today is February 1st, 2024, and my guest is economist and author Jeremy Weber. He is the author of Statistics for Public Policy: A Practical Guide to Being Mostly Right or At Least Respectably Wrong, which is the topic of our conversation. Jeremy, welcome to EconTalk.

Jeremy Weber: Thanks so much for having me. It's a privilege.

1:00

Russ Roberts: How did you come to write this book?

Jeremy Weber: The book was in development in my head for probably more than a decade. It began after I spent four years working in the Federal Government, in a Federal statistical agency, the Economic Research Service. And, that was a great place to be as a recent econ Ph.D. grad. And, it was a mix of more academic research, very policy-oriented research, and generating real official federal government statistics, interacting with policy people.

Then I went into academia to teach statistics to policy students. And, the book I was using, the course that I inherited, very quickly I had the feeling I was more or less wasting students' time, or at the very least there were huge gaps such that when they left my class, they weren't going to be prepared to use any of this to help anyone in a practical setting.

And, from that point on, I started to accumulate notes on things that, if I were to write a book, I would want to include and things that I was now using to complement the statistics textbook to give my students more.

And then, in 2019, I spent a year and a half at the Council of Economic Advisers [CEA] and that was like an accelerator for this whole idea. Because being engrossed in that environment gave me many examples, many ideas. And then when I came back to the University of Pittsburgh and had a sabbatical, I said I've got to write this.

Russ Roberts: What is its purpose and who is the audience?

Jeremy Weber: Yeah. I'll start with the audience.

The audience is broad, because frankly, whether it's your first statistics class or your fifth, many of the issues are the same and neither the intro nor the advanced tends to do some things well. In particular, the communication of statistics to a non-academic audience, the integration of context and purpose of the moment or of the organization or of the audience into what you're presenting--its significance for the situation at hand--we tend to not do that well, I think, at the undergraduate level. Or for Ph.D.s who are in their fifth year of econometrics. So it's--the audience is broad.

3:50

Russ Roberts: So, it's a very short book. There are a couple of equations, but they're there as--kind of like illustrations. And, what is spectacular about the book I would say--and I would recommend it to non-technical readers--what is very powerfully and well done about the book is giving the reader who is not an econometrics grad student a very clear basic understanding of terms that you've heard all the time out in the world from journalists and occasionally a website you might visit that highlights academic research.

So, you'll learn what a standard error is, you'll learn what a confidence interval is. But, it's not a statistics textbook in that sense.

However, those--that jargon--and other concepts that are used widely in statistics are very intimidating, I think, for non-academics.

And your book does an excellent job of making them accessible.

And then, of course it goes well beyond that. You're trying to give people the flavor of how to use these concepts, use data that's produced in all kinds of ranges of applications, from calculation of means and correlations up through regression results--that is, more sophisticated statistical analysis. You're going to give people insights in how to use them thoughtfully.

And, as you point out, no one teaches you how to do that in graduate school or in undergraduate if you take statistics. They're taught more as, I would say, a cooking class. You learn to add certain ingredients together. If you want to make a cake, you need flour and you need eggs and you need this and a certain amount of heat. Whether it's going to be a good cake or not is a different question. Whether that cake belongs to a certain kind of meal or a different meal, those are the things that practitioners learn if they're lucky. But, you're not taught those things.

And certainly people who don't go to graduate school or don't take a number of statistics classes in college will never, ever have any idea about it. So, I just want to recommend the book. If those kind of ideas appeal to you, you'll enjoy this book and it will be useful to you. Is that a fair assessment?

Jeremy Weber: That's a very fair assessment. You use the cooking example. I allude to kind of a vocational example in the book, where our statistics education, I would say teaches--it shows you: Here is the saw. And: Here are the parts of the saw. And, maybe we even, like, start it. And then, we put it down and we move on to another tool. Or, maybe we work with 10 different types of souped-up chainsaws, really sophisticated chainsaws. But, we're just like, these are again, the features and parts of the chainsaw.

Actually going out and cutting down trees, like, do that--we don't do that. That is--we don't do that. We know people do that, but we're not doing that.

And, that's a bit of the gap I'm trying to fill.

7:07

Russ Roberts: And, the more standard metaphor you also use is the hammer. And, we may come back to this, but of course the standard, the cliché'd condemnation of mindless statistical education is: Once you have a hammer, everything looks like a nail. And, it's really fun to run regressions and do statistical analyses once you understand how basic statistical packages work, without wondering whether it's a good idea, what's the implication of the analysis, how reliable is it, and does it answer questions as opposed to just provide ammunition for various armies in the policy battle?

And I think for me, that's one of my concerns. We'll come back and talk about it later, I hope, in terms of how we should think about the education in the practice of statistics. But, it's such a fun tool. It's a lot more fun than a hammer. It is more like a chainsaw. It's noisy and attracts attention and people like to cut down trees. So, there is a certain danger to it that your book highlights--in a very polite way--but, I think there's a danger to it. You can respond to that.

Jeremy Weber: Yeah. It is fun until it's not.

And, when it's not is when you are using this regression tool and you've maybe used it with the academic crowd; and that was fun. But then, you go to another crowd--the City Council crowd or some sort of more non-academic crowd--and you present it; and suddenly it's not fun because nobody knows what you're talking about and the conversation quickly moves on and you feel, like, out of place. Fish out of water. You've miscommunicated. People are confused. And now they're ignoring you.

Russ Roberts: But of course, the flip side also occurs, right? The scientist in the white coat. And, in this case it's the economist or policy analyst armed with Greek letters in their appendix. At least in their paper if not their physical one.

And, there's an awe of these kinds of people: 'And, obviously they're smarter than I am and obviously they're experts.' Maybe I'm overly pessimistic here.

A lot of times I feel like in those settings outside of academic life, there's a lot of trust in the reliability of numbers produced with what I would call standard practice. And, once you follow the rules of standard practice--which means statistical significance, confidence intervals and so on, and you frame your work with those footnotes, then you're credible.

And just simply because you're in the arena and you've been trained accordingly, you're a bit of a shaman. And, I think that's a little bit dangerous.

As is the opposite: 'Well, they're obviously wrong. They are a bunch of academic eggheads and they don't know what they're talking about.' So, I think there's an interesting challenge there, I think, when we go out into the world.

Jeremy Weber: Yeah. You're right. In certain environments there's that deference, that credibility conferred because of the mathiness, because of the training, the aura. I agree: That is a case that does happen in certain environments.

10:60

Russ Roberts: Now, I argued in a recent episode that statistical analysis is used more for weaponry than truth-seeking in the political process. And, I think it was misunderstood by some listeners. I think it's very useful to politicians and policy players to have data, numbers. But I don't think they're so interested in the truth, and I wonder how your book would be perceived by them.

Jeremy Weber: Yeah. I agree with your assessment. Primarily weaponry, especially in the D.C. [Washington, D.C.] area.

But, if the weapons being picked up are actually real, understood measurements that accurately reflect an issue--they don't reflect the full scope. They're being used selectively. But if there's good measurements out there and there are competing parties fighting, it means the party is going to pick up the most effective weapon that most appeals to the audience out there.

And so, if there are, in a way, better weapons out there that can be picked up, I think you have a greater tendency to some major problems being avoided or opportunities pursued.

And I'll give you a concrete example. When I was in the White House, the Commerce Department was petitioned by some uranium mining producers for protection. They didn't want imported uranium into the United States. The Commerce Department conducted an investigation, did a Report on the issue. They did their own--they did a survey. They presented some statistics in this Report that went to the President recommending restrictions on imports. Okay? You know.

So, Commerce Department, they've got their weapon. All right? And, CEA got involved--

Russ Roberts: The Council of Economic Advisers--

Jeremy Weber: That's right. Council of Economic Advisers got involved.

I grabbed some other data. I did some analysis. I generated, you could say, another weapon that I thought was actually a better depiction or reflection of the economic reality and what was likely to happen under the Commerce [Department of Commerce] proposal.

All right. So, we got together Commerce, other agencies in the room, and in a way we had our battle. We picked up our weapons. I think we--at the end of the day--we ended up at a better place because I was able to pick up a weapon and there was this back-and-forth with the data. So, but, had CEA not been there, or had those reports that I relied on from the Energy Information Administration not been there, everybody would have bowed down to Commerce and they would have rolled right through, and the President would have said, 'We've got to restrict uranium imports so we can prop up these several producers out in Utah'--at the expense of the nuclear power industry and electricity consumers.

Russ Roberts: Yeah. That's a great example. In theory, the Council of Economic Advisers--and I think it to some extent plays its role as best as I can understand it--is more of a technocratic fact-checker in some dimension of advocacy by other agencies and in theory is somewhat unbiased.

In this case I assume the argument was that this was going to create a lot of jobs in Utah. Was it Utah?

Jeremy Weber: Yeah. There's several places where uranium mining occurs. Utah is one of them. That's where the companies--at least one of the companies that was filing the petition was located. You had the argument--there was a national security argument. There was a whole resiliency of the uranium supply chain argument. There was a jobs argument, too. That's right.

Russ Roberts: If I can ask, what was the key finding that you felt was at least somewhat decisive in derailing a strong impulse toward restrictions on imports?

Jeremy Weber: The key finding was: Commerce Department had said at $55 a pound, we think the domestic uranium sector is going to produce the amount that we're going to require domestic nuclear power producers to purchase from them. So, this is not going to increase domestic prices for uranium much. $55 is just a little bit above what the market price was at the time.

My analysis said: Unlikely. The price on the domestic market will be a lot higher. For the uranium sector in the United States to produce six million pounds of uranium--which is what the requirement was going to be, the buy-American requirement, so to speak--you are going to need a dramatically higher price. And, that price is either going to get passed on to consumers--especially in places with a lot of nuclear-generated electricity--or, the nuclear producers were going to eat it and it was going to push some of them over the edge, particularly in places like Michigan and Pennsylvania where there's--and Ohio. Important states.

And, so my basic argument was: this restriction is going to cause prices to increase a lot. And the key statistic--I did some more sophisticated stuff in the background, but I didn't bring that in as the Game Plan A in the meeting. I brought in two figures--two graphs--that simply showed, look, in recent history, the price of uranium has been way above your price for several years and the domestic sector didn't come anywhere close to producing what you're saying they would produce now at a much lower price.

So, this simple descriptive statistic I think convinced everyone in the room except Commerce that there's no way the domestic price is going to be $55 and six million pounds are going to come out of the ground. So, in the Decision Memo to the President, our estimate, CEA estimate of what the price is going to do, what it's going to do to electricity prices and electricity consumers was in there. And, I don't know what proportion of influence that had, but people who were familiar with the matter said it was really important that that point was made.

Russ Roberts: For economics majors out there, this was a debate about the elasticity of supply. A phrase that I don't know if it's been uttered more than a couple of times in the history of this program. Meaning how responsive is production to changes in price? And, if the answer is not very much, then you're going to need a much larger price to make the market work effectively; and the demand for uranium once foreign supplies are unavailable is going to push the domestic price up much higher than $55.

And, that's very nice. Now, of course, as you point out--you had more sophisticated stuff in the appendix. But, the fundamental--often the facts can be persuasive, or at least provocative enough to get people to reconsider a position.

18:46

Russ Roberts: What are your thoughts on our profession generally and our ability to establish something like a truth on the basis of statistical analysis?

So, for example, what's the effect of the minimum wage on employment among, say, low-skilled labor? The profession used to believe the answer was minimum wage is very bad for low-skilled labor. It would cause a lot of jobs to go away. In recent years, there've been a lot of thoughtful people who've made the opposite argument: its effects are either small or zero. There's been pushback against that by other people saying actually that's wrong: In the short run it might be true; in the long run, it's big; or, you didn't fully measure it correctly.

And, if I said to an economist: 'What's the effect of a, say, 25% increase or 15% increase in the minimum wage?' it would depend on who you asked. And, that's weird. If you ask a physicist what the effect of gravity is, they don't argue about it. There's a consensus. We don't really have those kind of consences--I don't know if that's the right word--consensuses in economics, it seems to me. Do you agree?

Jeremy Weber: I agree. And, I think the key difference is that, as long as we're in the earth realm, gravity is pretty contextless. It's not context-dependent. Social settings are so varied, and so the situations in which we estimate these relationships are oftentimes conditioned by the moment in history, the place.

And, I'll give you a concrete example. My subject area of expertise is energy and environment. I've done work on fossil fuel extraction, effects on communities. A big question that the literature was considering several years ago was if you have fracking--if you have natural gas drilling in an area--what happens to property values? Somebody looked at that question in Pennsylvania and found, well, for many homes it will be a negative effect. People are not going to want to live near this, particularly homes dependent on groundwater. Okay? That's Pennsylvania.

I looked at Texas. I found housing prices go up quite a bit in the vicinity where the fracking took off. Well, the reason for the difference was--or primary reason--in Texas you tax natural gas wells as property. So, when you drill a well, the full value of that well enters the tax base. That's like we just built a bunch of million dollar McMansions and now those people are paying taxes on those houses. That's going to the local government. That's going to the schools. It turns out in Texas then they lowered the property tax rate and so people's tax bills were going down, the school was getting more revenue, and property values generally went up in the area.

That's a very different finding than in Pennsylvania where they don't tax. There's no revenue generation for the school, no reduction in property taxes. Same basic phenomenon of fracking, fundamentally different effects on this outcome because the context--the policy environment--was so different.

And, I think that's just one illustration of how--are we raising the minimum wage in an area where the market wage is already pretty high and we're just going to basically move it close to the market wage? Or, not?

So, I think part of the reason why it's hard to come up with a consensus is because context matters; and that matters certainly for policy. That consideration of context is something that I emphasize in the book so much.

22:40

Russ Roberts: Of course, you would like to think that the fundamental market forces are the same. They may be, in, say, the case of minimum wage. And, people might disagree about what those are. That was, again, not so true I think in the past--say, 50 years ago--but is much more true, say, in the last 15 or 20 years.

But, I do think there is a feeling among younger economists and I would--Jeremy, I put you in that group relative to me. Just looking at you, I would say--

Jeremy Weber: I appreciate it--

Russ Roberts: No problem. I think there's a consensus--not a consensus--there's a flavor of recently-trained economist who says: 'I don't look at theory, like what theory says about what the minimum wage impact should be or is likely to be. I just look at the data. I just read the output from my statistical package, and I look for the truth, and whatever it says, that's our best understanding of how the minimum wage affects low-skilled labor at this point in the areas I looked at it.' And, I find that an untenable view, but I think I'm in the minority in the modern world. Is that true?

Jeremy Weber: No. I agree with your assessment that there is a tendency, culture shifting or it has shifted, where we just want to go right to the data and not do the heavy thinking beforehand about setting things up, in a way: What are we trying to answer? What is the general theory that we're trying to test? And, we're just going into the data too quickly.

And, one of my recommendations in the regression chapter is: never run a regression without a clear purpose for doing so. It is so easy to be led in strange places just by kind of meandering through the data. And, you know, we all know we're not supposed to look for certain results, but that is so easy to do. You start getting a hunch: 'Oh, this would be a great story if it works out this certain way.' And, lo and behold, then you start looking for that story and you're like, 'Oh, it doesn't work quite right here, but what if I subset the data this way?' and suddenly the story emerges and then at the end of the day you're, like, 'Well, I can sell this story.' Like, 'This is coherent enough.' But is it a manufactured story? And, I think that does happen more often than it probably should.
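[An illustrative aside, not from the conversation or the book: the 'what if I subset the data this way?' search described above can be sketched as a small simulation. Everything here is hypothetical, including the function names, the sample size, the 20 subgroups, and the 1.96 cutoff. The outcome is pure noise and the "treatment" does nothing, so every "significant" subgroup is a manufactured story.]

```python
import random
import statistics


def z_stat(a, b):
    """Large-sample z statistic for the difference in means of two samples."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def one_study(rng, n_people=2000, n_subgroups=20, threshold=1.96):
    """Simulate one study in which the 'treatment' truly has no effect.

    Returns one boolean per subgroup: does the treated-vs-untreated gap
    in that subgroup look 'statistically significant'?
    """
    groups = [([], []) for _ in range(n_subgroups)]
    for _ in range(n_people):
        subgroup = rng.randrange(n_subgroups)
        treated = rng.random() < 0.5
        outcome = rng.gauss(0, 1)  # pure noise, unrelated to treatment
        groups[subgroup][0 if treated else 1].append(outcome)
    flags = []
    for treated_y, control_y in groups:
        if len(treated_y) < 2 or len(control_y) < 2:
            flags.append(False)  # too few observations to compare
        else:
            flags.append(abs(z_stat(treated_y, control_y)) > threshold)
    return flags


def simulate(n_studies=300, seed=0):
    """Share of no-effect studies that yield a 'finding' under two habits."""
    rng = random.Random(seed)
    planned = 0   # analyst commits in advance to a single comparison
    searched = 0  # analyst keeps subsetting until something shows up
    for _ in range(n_studies):
        flags = one_study(rng)
        planned += flags[0]
        searched += any(flags)
    return planned / n_studies, searched / n_studies


if __name__ == "__main__":
    planned_rate, searched_rate = simulate()
    print(f"one planned comparison finds an 'effect' in {planned_rate:.0%} of studies")
    print(f"searching 20 subsets finds an 'effect' in {searched_rate:.0%} of studies")
```

[With 20 independent looks at a 5% false-positive rate each, the chance of at least one spurious hit is roughly 1 - 0.95^20, or about 64%, which is why a single regression run with a clear purpose is so much harder to fool yourself with.]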

Russ Roberts: I remember hearing from George Stigler, who was a professor of mine at the University of Chicago, that in his day there were no statistical packages. I don't even think they had punch cards and computer analysis. They had fancy calculators. The kind of calculators that were used, say, in the Manhattan Project. And, they would have a handful of variables--because they didn't have the amount of data we have now. This is like, say, the 1940s and 1950s. And, he said you would decide one or two things you'd run a regression on and it would take a long time and lots of calculations and then you'd find out what that answer was. And, because you were only going to do a couple, you thought very long and hard about what belonged in the analysis and what didn't. And when you were done, that's what you found. And, if you didn't like it, you had to then decide what you were going to do with that. And, the answer wasn't: Well, I'll run another 30 or 40 until I find something more amenable to my preexisting views.

But, I think the real difference in the modern world, we not only have--you can run a regression in a fraction of a second--we have immense amounts of data. And, because we have so many different variables and different ways of manipulating them, you do have to have some kind of theory as to how you're going to do that wandering through the data you're talking about.

And, in particular, otherwise--I'm going to say it differently. Even though we have lots of variables and lots of data points, we're not close to having all the data on everything that's relevant for the decision. We don't have data on people's moods. We don't have data on their childhoods, and how they were raised, their genetics. It's so many variables obviously that could be important. And so, we pretend we have all the data. And then it's just a question of throwing out some of it that we don't think is relevant. But, we always have in the back of our mind this haunting ghost that--that almost by definition in a social science perspective, you can't have all the data.

But that's unpleasant. That's no fun. I want to be in the sandbox. So I pretend I've got enough. And, I think that's the danger of theory-free exploration, because you don't have enough and you're prone to your own biases--confirmation bias and other things.

Jeremy Weber: There is that danger of maybe we constructed a narrative that it's just somewhat disconnected from reality because it's only found through torturing the data.

I think there's another danger, and it's similar to--there's this book called The Shallows: how the internet is making us think more shallow,--

Russ Roberts: Nicholas Carr--

Jeremy Weber: superficial. Yeah.

I think there's a parallel in the data or the statistics world. I was recently talking with somebody involved in a data science program and they brought in employers of their graduates and said, 'What's your assessment of the skills and how we're equipping our students?' And, they said, 'Look, we love their ability to manipulate data, calculate stuff. They can't tell us the meaning of these things they're calculating for our purpose, for our organization.' So it's like: Data, data everywhere, but understanding is nowhere. And, that's very easy to do when at a click, you've got oodles of data, oodles of regressions you can run. You're not slowing down and doing what Stigler had to do. And, that is, like, think twice, or measure twice, cut once sort of thing. You just run right into it. And, that's what I stress for my students.

I said, 'Before you touch the data, stop. Think.' This is something I'm trying to get across in the book. It's just: Slow down and think about what you're trying to accomplish. What is the problem? And then, with that clarity of understanding, you go and you learn something from the data. Then step back. Get away from the data for a minute and reflect for a few days on what you have calculated. All right? And go forth this way so that you have a deeper meaning. There's greater understanding being created by what's being calculated rather than 'I calculated a bunch of stuff and I'm reporting measurements.' And, I think people will find it interesting.

30:23

Russ Roberts: Let's go back to the chainsaw analogy. I think it's useful. Also makes me think of the current President of Argentina who liked to campaign--I don't know why it became his campaign image. He may be the most sophisticated economic thinker in office at a high level in the world right now. He's a pretty good economist, at least in terms of explication in the video clips I've seen of him. But, it's fun to use a chainsaw--in theory. I've never used one, by the way; but I can see the appeal of it.

So, imagine if I were going to have a course in how to use a chainsaw and I said: So, here's where the gasoline goes. Make sure you close the cap well and we'll practice that. And, here's how you turn it on. In the old days you pulled a cord and it was this exciting causal connection, and a loud noise results. And, here's how to make sure you don't cut off your fingers, because it's a dangerous weapon. Or yourself. But, over there are some trees. Have at it.

And, that would be a weird course, because the students would say, 'Well, I don't really know how to cut down the trees thoughtfully or carefully. And I'm worried that if I cut them down the wrong way they will fall on innocent bystanders in houses.' And, it's a strange thing that, that's the response of the employers of students who study data science. It's like: 'Wow, they're really good at.' Meaning what? They're really good at turning on the saw and they're really good at refilling the gasoline and refueling it. But, they're not really good at making sure that houses don't get crushed. Something is wrong with this picture.

Jeremy Weber: Exactly. In fact, I'm experiencing this just now. I'm running a capstone class here at the Graduate School of Public and International Affairs, part of our Master's program. We serve a client--City of Pittsburgh. It's about our deer overpopulation issue. And, I have a student who--great student--we sent her to look at the data on deer incidents, deer-related police reports on a neighboring municipality that had implemented a management program. She's getting the data, she's looking at it; and she stopped and she says, 'I think maybe somebody else should take over for me because I'm not sure I'm the best person to learn from this data, calculate this stuff.' And I said, 'Well, what classes have you taken?' Well, she's taken two graduate level statistics courses. She's looking at the chainsaw, she's seeing the tree, and she's, like, 'I'm going to step away.'

Russ Roberts: Good for her.

Jeremy Weber: I'm like, 'You are exactly the right person to be doing what you're doing. Let's do this together. Let's start here. Let's take one step at a time.' So, it's going to be great for her, it's going to be great for us.

But, I'm somewhat appalled that she's taken two--a year's worth--of statistical education, postgraduate, and she doesn't have the confidence to pick up the chainsaw.

And I somewhat don't blame her. I understand it. It is intimidating to walk into the room. It's intimidating to take up that chainsaw and start laying it into that massive oak. And, the equivalent in reality is then stepping into the room with a City Council person and reporting these numbers that then the staff are going to look into, maybe a journalist looks into; maybe somebody gets the data and shows that you did something stupid and you end up looking like a fool. And so, the students are averse to doing that.

34:19

Russ Roberts: I think there's two things going on here, though, at the same time. Theoretical programs in law, or business--sometimes students and outsiders, employers will complain: Well, they taught you legal theory but they don't teach you how to be a lawyer. Because to be a lawyer you have to learn how to read the client. You have to understand when you need to push back against the client's demands. You need to understand how to read a jury. You need to know how to negotiate in a settlement question. And, law schools don't do that.

Same for business schools. Business schools teach theory of finance, theory of marketing, and so on. But, when push comes to shove, only life gives you the education you need in the trenches and in the real world.

And, I think what's going on here--and your students are a special case. But, I think most of the time the things you're taught in graduate school in statistics or data science are really everything you do need to know about how to use the chainsaw. It's that you don't know how to use it thoughtfully. And, that's an entirely different thing than you don't know how to use it in the real world. You can use it in the real world really well. You can put up a big set of tables and charts and appendices, but there's no thoughtfulness to it.

So, I think--when I taught in a Master's Program at George Mason, I taught a class called How to Think About Numbers. And, that was for me the things that students weren't getting in a cookbook econometrics or a statistics class. But in general, those classes aren't taught. And, the kind of things you teach in your book are not taught.

And, my question is: Why do you think that is? Why do you think the world wants somebody--because I really think that employer often wants somebody who is really good using the chainsaw and they don't care where the trees fall. They just want a really sophisticated user of the saw. And it's kind of strange that is the way it seems to me that the world works.

Jeremy Weber: I think that you're absolutely right. I think the reason they're not taught--it's not taught--that sort of more careful thinking about the numbers and those practical issues--is: who is teaching--who teaches statistics courses in university, undergrad or grad? They are academics. Okay. How do those academics use their statistical skills? They use it in research articles to academic audiences. They are then attuned to what is the editor and the reviewers--what are they going to go harp on? What will be the bars that I have to pass? What are they going to scrutinize?

All right. That is what they're doing, nine to five.

Then when they go to teach, they are teaching students to do what they do, by and large. Okay? And, the unfortunate thing is: It is as if they are teaching students, 'This is how you speak to and relate to this tribe.' Okay? This Swahili tribe.

Well, then the students go out and the majority of them are not going to be writing academic articles dealing with reviewers. They're going to be using these for employers, businesses, or nonprofits, or city council. And that's a different tribe. And then we're surprised, 'Oh, they're speaking Swahili to the Germans here and there's miscommunication or not understanding that's being conveyed.'

And so, the emphasis is off because the people teaching are accustomed to speaking to a different tribe. So, they're spending all of this time on three different ways to refine your standard error calculation because the reviewers are going to ask about that. They're spending all this time on refinements to identification because they know they're going to get nailed on that. And so, they convey that to their students. And they're not spending the time on, just: Are the data good? Like: What do your variables mean? A one-unit increase in X--like, what is that? Is that big? Is that large? Is that small? And, don't tell me statistical significance or not. Like, I want to know: is a 0.5 increase in that thing, should I care?

We're not teaching that for the most part.
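[An illustrative aside, not from the conversation or the book: the 'should I care?' question above can be made concrete with a few lines of arithmetic. The helper name and every number below are hypothetical, chosen only to show how an estimate can clear the statistical-significance bar while being tiny relative to the outcome it is supposed to move.]

```python
from dataclasses import dataclass


@dataclass
class EffectSummary:
    t_stat: float       # coefficient divided by its standard error
    effect: float       # predicted change in the outcome, in outcome units
    share_of_sd: float  # that change as a fraction of the outcome's std. dev.
    pct_of_mean: float  # that change as a percent of the outcome's mean


def summarize_effect(coef, std_err, change_in_x, outcome_mean, outcome_sd):
    """Put a regression coefficient in context for a non-academic audience.

    coef and std_err would come from a fitted regression; change_in_x is a
    change in X that is realistic in the setting at hand (all hypothetical here).
    """
    effect = coef * change_in_x
    return EffectSummary(
        t_stat=coef / std_err,
        effect=effect,
        share_of_sd=effect / outcome_sd,
        pct_of_mean=100 * effect / outcome_mean,
    )


if __name__ == "__main__":
    # Hypothetical numbers: a precisely estimated but small relationship.
    s = summarize_effect(coef=2.0, std_err=0.4, change_in_x=0.5,
                         outcome_mean=200.0, outcome_sd=40.0)
    print(f"t-statistic: {s.t_stat:.1f}")
    print(f"a 0.5 increase in X moves the outcome by {s.effect:.1f} units")
    print(f"that is {s.share_of_sd:.3f} standard deviations, "
          f"or {s.pct_of_mean:.1f}% of the typical level")
```

[In this made-up case the t-statistic is large, yet a 0.5 increase in X shifts the outcome by only a small fraction of its usual variation. Whether that matters depends on the decision at hand, which is the judgment the number alone cannot supply.]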

Russ Roberts: I'm the president of a college. We're small. We're hoping to add an economic/public policy major in the coming years. And, if we do that while I'm here, it will emphasize the challenges and limitations of the chainsaw--of statistical analyses alongside the hammer/chainsaw part, which is you got to have that if you're going to enter the battle. If you're going to be in the arena, you better understand how your opponent's weapons work. If you're going to claim they're dangerous or they don't work well or they're inaccurate, you need to understand how to use those weapons and then explain the limitations.

And, part of me says--the romantic part of me--says: Well, this will be good for the country, for the state of Israel because there will be a set of bright, articulate people with a grounding in philosophy, in addition, but also who understand the limitations of statistical analysis.

The question is, is anybody going to hire those people? This is a different way to look at this question of this mismatch. Does anybody want someone who is going to always remind them that this finding which they want to wave around and print on a big banner might not be true?

Jeremy Weber: I'm going to say a guarded, optimistic yes.

Russ Roberts: Good.

Jeremy Weber: When I teach policy analysis--so I teach these capstone classes that are more client-oriented--I use the analogy of a lawyer and a client. And I say, 'Look: we are lawyers. We are like in a lawyer-client relationship. Clients are not well served by lawyers who just cherry-pick things in their communication with the client.' Now, there are two different levels of interaction here. There's going before the judge and the jury. Which the lawyer is not going to present damaging things to that audience. But, I would think that the client wants and is best served by a lawyer who in the private confidence of the lawyer-client relationship is shooting straight. Fully understands weaknesses of legal theories, the strength and limitations of the evidence, and implications for the client's case. And so then the client, knowing that--okay, it's not the client that needs to be convinced, it's whoever the client is turning around and serving or speaking to, that needs to be equipped well.

So, I do think clients want lawyers who are not fly-by-night known for inventing things, cherry-picking things. They want people who will shoot it straight with them so they can then turn around and make better decisions that are going to be more persuasive, more bulletproof to the audience they're working with. The Chairman of the Council of Economic Advisers would not be well-served by me being a data lackey and just saying, 'Oh, Kevin, I think you wanted this number. It sounds good. Go run with it.' Because Kevin [Kevin Hassett, CEA Chairman, 2017-2019--Econlib Ed.] is going to turn around and go to a meeting with thoughtful, sometimes aggressive people who are going to find holes in it and potentially make Kevin look very bad. So, Kevin actually wants somebody who is a straight shooter and not a data lackey.

Russ Roberts: I love that.

42:50

Russ Roberts: Let's take an example from the book which I very much enjoyed and hadn't thought of this way. I thought it was really great. You talk about the well-known idea--hard to remember, surprisingly hard to remember--that correlation is not causation. It's well known but remarkable how many times people either forget it or want to forget it. So, it's a great point.

But you make a deeper point, and I think it is quite profound and very, very rarely thought of, which is: You really should be thinking of correlation and the magnitude of causation. So, there could be a correlation. Sometimes it is causation. But, that's not the only thing we care about. In fact, almost always we care about the magnitude of the impact, not just that they're correlated.

And, the reason that matters--you say that, 'Oh yeah, sure, sure, sure, that makes sense.' But the point you make, which is fantastic, is that in the real world, the world we live in as opposed to the textbook, there's more than one thing changing at the same time. So, past data that we look at to examine relationships is of course affected by more than the thing I'm looking at that I'm calling the causal factor.

So, you give the example of your storm drain. Why don't you share that and generalize it to other issues?

Jeremy Weber: Yeah. And, I use this example in my classes as well. A few years ago, my drain in the back of my house was overwhelmed in a storm and it flooded my basement. This happened twice. And, my neighbor learned about it and she quickly said, 'Well, this is climate change. Clearly. We're having more intense storms,' which suggests that the problem is the quantity of water hitting the drain. And, I didn't think too deeply about it. I kind of wanted to get rid of some asphalt anyway. And so, accepting this premise that the storm had been more severe and that's why it flooded recently and not in prior years, I went then and rented a concrete saw and dug up a bunch of asphalt and replaced it with grass so the water could percolate down and the drain would drain a smaller area. Then the drain flooded again. And clearly, water falling and my drain flooding, these are causally related. Intuitively there's a connection. But, what was the main reason why my drain wasn't able to handle this water?

It turns out it wasn't because the storms were more severe than they had been in the past. And, it took a conversation with a plumber visiting my neighbors. Said look, 'You don't understand how your drain works. It's draining in this other direction. There's a kink here. If you go into your garage, you're going to find an access point.' I pulled that out and there's some mud clogging it, clogging the drain. I pulled that out. It's never had any issues since then. We've had a tremendous storm since then. The main cause was the clog in the drain.

And, tying it in with climate change and the storm intensity, that was a distraction. Yes. It might be true. I didn't go and look at the data. It might be true that those storms were in fact a bit more severe. And, it might be true that that was driven by rising greenhouse gas emissions. Might be true. All right? But, we could have solved climate change completely, and my drain would still have been overwhelmed at the next storm. It was a secondary/tertiary issue. The primary issue--the primary causal factor--was the clog in the drain.

And without understanding that, I was just going to be throwing money and effort at a tertiary issue that wasn't going to solve the problem.

Russ Roberts: And, that's just so common in policy arguments. Of course [?]. Correlation--not causation--is the English version of a more pretentious Latin phrase, which I always loved, post hoc ergo propter hoc--'after this, therefore, because of this.' And, your point is that: Yeah. After this sometimes is because of this. What happened after is because of this thing that happened before. But, eight other things happened along the way. And, the fundamental question isn't whether this one affects that one, but by how much relative to the others.

And so many policy debates are about--again, going back to maybe the first EconTalk episode with Don Cox. I think we talked about this, what he calls the 'dreaded third thing.' You have two variables, one affects the other. There is that third thing that--it's actually on a later episode with Don. Or some essay he wrote.

It's not the first one--that was on parenting. But, the dreaded third thing is that the world is complicated. There's actually more than three. There's the dreaded third, fourth, and fifth thing. And, the fundamental question is if you want to affect the variable that you're caring about, is it the one you're focused on or is it the third, fourth, and fifth one that have the bigger bang for the buck? Statistics can help you answer that, but you do have to keep it in mind and to look for it.

Jeremy Weber: And, unfortunately, our statistics culture with the emphasis on statistical significance is usually focused on that question: is there any causal relationship at all? Is the coefficient zero or not?

And, as my chapter--I think one of the most important chapters in the book is: Know large from small and explain the difference. Is that we're so used to using statistical significance as a crutch for saying: Is this important? Does it matter? And, the reality is, like you said: Ten things are probably causally related to this outcome we care about, but for policy purposes, we obviously want to prioritize. We're not going to make much progress on the problem if we're focused on this fourth-order issue that yes, is causally related, but the magnitude is so small.
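[The gap between "statistically significant" and "large" can be made concrete with a small simulation. The sketch below is illustrative only: the two causes, their effect sizes (2.0 and 0.1), the sample size, and the helper `slope_and_t` are all assumptions invented for the example, not anything from the book. With enough observations, both causes clear the usual significance bar, yet one of them accounts for almost none of the variation in the outcome.]

```python
import math
import random
import statistics


def slope_and_t(x, y):
    """Return the simple-regression slope of y on x and its t-statistic."""
    n = len(x)
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope, slope / std_err


# Hypothetical data: two independent causes with very different true effects.
rng = random.Random(0)
n = 100_000
main_cause = [rng.gauss(0, 1) for _ in range(n)]
minor_cause = [rng.gauss(0, 1) for _ in range(n)]
outcome = [
    2.0 * a + 0.1 * b + rng.gauss(0, 1)
    for a, b in zip(main_cause, minor_cause)
]

slope_main, t_main = slope_and_t(main_cause, outcome)
slope_minor, t_minor = slope_and_t(minor_cause, outcome)

# Both t-statistics are far above 2, so "is the coefficient zero?" gets the
# same answer for each cause. The slopes answer the more useful question.
print(f"main cause:  slope={slope_main:.3f}, t={t_main:.1f}")
print(f"minor cause: slope={slope_minor:.3f}, t={t_minor:.1f}")

# Share of the explained variation attributable to the minor cause
# (both causes have unit variance here, so slope squared is the contribution).
minor_share = slope_minor**2 / (slope_main**2 + slope_minor**2)
print(f"minor cause's share of explained variation: {minor_share:.2%}")
```

[A significance test alone would report both causes as "real." Only comparing the magnitudes shows which one is worth prioritizing.]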

There's an issue here with political speech that's really tricky that I just want to point out--I faced it a lot in reviewing speeches by White House officials--where you have two or three things presented together as equivalent contributors or causal factors. One or two of which might've been Administration-driven, and maybe they don't even mention other factors. And, the reality is: yes, all these things are causally related, but--I'll give you the concrete example. The rising oil production in 2018 and 2019 was primarily driven by higher energy prices. Did deregulatory efforts help? It certainly didn't hurt. And, it probably--intuitively it would be causally related. But, if you speak of those two factors in the same sentence, you're going to communicate to the audience that they're equally responsible for this rise in U.S. energy independence and so on, when they're not. 95% of it was just producers who are responding to price.

Russ Roberts: Yeah.

50:21

Russ Roberts: The other example that I think about a lot is--and of course as you say, political speech, a lot of times things get emphasized not because of their magnitude, but because of their salience in the minds of voters and others.

One I think about a lot, a friend of mine talks about--he may be listening to this episode. But, he will argue that trade with China is the source of many of our cultural malaise--much of our cultural malaise, many of our cultural problems. I'm not convinced of that. In fact, I'm pretty sure it's not true. Whether it's true or not, it has a very strong political impact when people hear that.

The reality, I think, is that there are many, many things going on, many at the same time. It's hard to know whether those things are independent of each other. Some could be caused by the economic challenges that certain parts of the country face in response to trade with China. But, certainly using China as a source of fear--Chinese trade--is very powerful and very effective. Whether it's true or not is much, much harder to establish. And, in particular, it could be true but the magnitude is quite small relative to the other factors. But, as a politician, often that will be irrelevant. It'll be invoked simply because it's effective.

Jeremy Weber: This is very true, and this happens. I think this is a good moment to make a point that I make upfront in the book. And, it speaks to some of your pessimism around data and our ability to untangle things and so on.

There are two camps of people, I find. Generally those people who are data enthusiasts: We're going to be able to solve the issues, identify the priorities, resolve disputes if we just let the data speak. We look at the evidence, we do evidence-based policymaking, this is it.

And then there's another group that says: There are 'lies, damned lies, and statistics.' Like: It's a tool for manipulation--as you say, it's just weaponry to shoot at people. It's worse than not helpful. It's distracting, misleading, and so on.

And, I speak to both of them. And, an important point I make to the 'you can say anything with data'-crowd is: Statistical claims are with us always. We cannot help but make statements--claims--about what is common, what is general, what is causing X versus Y. We will make those statements. It's better if we tether them to actual observation. Because, we're going to make them: The politician is going to make them. The nonprofit leader is going to make these statements about what's generally the case. And statistics at least helps constrain us somewhat. But, those claims are going to be made anyway. And if there's a culture there, a habit of good use of statistics, it's at least a constraining power on specious claims about what is generally the case, what is rare, what is common, and so on.

Russ Roberts: Yeah. I really like that part of the book. And I just want to say that if you had to pick a religion for me where one religion is 'Data analysis reveals the truth' and the other religion is 'Lies, damned lies, and statistics,' I would be in the latter. I would be in the 'lies, damned lies, and statistics'-group. And, a lot of people then conclude: Oh, obviously I'm not a scientist. I'm irrational. I believe in going with your gut. The only reason I push the 'lies, damned lies, statistics'-scripture is because most of the religious enthusiasm is at the other end.

I'm actually in a third camp. I'm in the camp that worships the idea that it's complicated. And that reality is beloved by neither the 'statistics reveals the truth'- nor the 'lies, damned lies, and statistics'-group. So, in reality, I actually am in this third group. But I think I am often misunderstood as being in the second. But I do think, just for the record, there's nothing worse than anecdotes. They're dangerous--as are statistics misused. So, it's complicated.

Jeremy Weber: Yeah. And, I think I probably--I mean, I think I'm where you are, Russ, generally.

And by speaking to the, 'Let's just get in the data-driven car; it's going to tell us where to go'-people, and the, you know, 'statistics are just manipulation,'--by speaking to both of them and the excess or limitations of the extreme, I think the result is you would end up somewhere in the middle. Okay. We can't stop trying to tether our claims to summaries of observations. Not just what your brother's cousin said--and that's your one data point and then you extrapolate. But rather, what many people have said or what's been measured in many places or moments.

56:01

Russ Roberts: There's a lengthy discussion in the book relative to what I would have expected on fact-checking. And in particular based on your experience in the Trump White House. Describe how thorough that was, and why that was, and how you felt about it.

Jeremy Weber: So, one of the first things I learned in arriving at CEA was they have a fact-checking process. And, the Chairman at the time, Kevin Hassett, was insistent on before anything reached him, it should be fact-checked. Before anything--certainly anything that left CEA--needed to be fact-checked. And by that, what was meant was the original author of the Memo, the facts in it would have to pass it off to somebody who was not involved in it, and they go through all of the factual claims and verify them.

And, sometimes that was very simple. That was just: 'Okay, here's a number, here's the source. Did you actually copy the right number and the meaning of it? Was it described correctly?' Or, it can be more complicated. It can go into spreadsheets and calculations. Our junior economists, who did a lot of the fact-checking, they would have to go step-by-step all the way back to the beginning of the calculation and verify, you know, when you said you multiplied X by Y, you were actually multiplying them, and so on. And then, once that fact-checking--usually then there would be queries or questions raised in the fact-checking. 'Oh, it didn't seem--this thing didn't make sense, what you did,' or, 'The report spoke about it a little different way. Are you sure this label is right?' And, those queries had to be resolved before the 'Not Fact-checked' label could be removed from the Memo or the PowerPoint.

And then, once those queries were fully resolved, then it could go on to the Chairman.

And, this took a lot of effort. It slowed things down. It required a lot of hours of staff time. But, I think it made a lot of sense, because CEA's currency in the building was its credibility.

And, it's very easy in the White House to become irrelevant. Like, people are vying for influence and access. And, just because you're doing good work doesn't mean anybody's going to pay any attention to it. So, what CEA needed to do is maintain and bolster that credibility as a straight shooter, as somebody who gets the numbers right. And, if we were sloppy with that and word got out and we couldn't really trust them, nobody would read all the stuff that we produced anyway. And so, there was a focus on: We've got to get it right. We've got to preserve that CEA brand, so to speak.

And, that was a great learning experience for me. In fact, it is easy to not be nearly as thorough in academic work because--I mean, the reality is the cost of being wrong is likely very low. If I give the Chairman a bad number and he goes in and shares with the press--it might be tweeted the next moment. Or, Kevin says on national news, and then the fact-checkers are going after it. The consequences of making a silly mistake are high. Much higher generally than in academia.

And so, it really raised my appreciation for going slow, making sure that what you've done is defensible. Is, as I say in the book, right. Meaning: it's defensible to reasonable people. It doesn't mean it's perfectly predicting the future, it's perfectly getting the numbers right. Future data might reveal that our estimates were a bit off. But, given the information at the time, what was done is defensible to a reasonable person--a statistically savvy journalist.

1:00:16

Russ Roberts: I wonder if that's Kevin Hassett's pet peeve or whether that was standard operating procedure at the Council and part of the culture. Certainly you and I have friends who are careless with facts. I don't have any like that. They wouldn't be my friends. But, people will quote sports statistics, the actor in a particular movie, the year something happened. And, most people say those things with authority and confidence. And, years ago on this program, I said something about how if you then say, 'Are you sure?' to those people, they immediately back down. Immediately. They're incredibly confident, but if you say, 'Are you sure?' they immediately have to concede that not a hundred percent. But, some do. Some say, 'Oh, of course I am sure.' It still means they're sometimes wrong.

So, it's a fascinating thing how, as you say, in academic life, big bottom line conclusions with dramatic implications will get checked. People will challenge and look into things. But, it's amazing how many things just get passed by.

And, if you ever publish something in a magazine or a newspaper that has a serious fact-checking arm, they're asking you a question; and it's, like, 'What do you mean, how do I know that's true? Of course it's true. I would never say something wrong.' And, of course, sometimes you're wrong because your memory fools you. And sometimes you're careless. And sometimes you're dishonest. So, it's a very interesting thing to be in that intense an environment, I suspect. And, it is definitely different in most other areas.

Jeremy Weber: It is. But, it's something I've taken with me and really appreciated. And, being wrong, being pushed on something--in academic settings we're often pushed in certain ways and in policy settings pushed in other ways.

I found myself in CEA being pushed not so much in the complicated statistical techniques, but on a more basic understanding of the numbers: You said X, you said this certain number. What does that mean?

I'll give you an example. Not from the White House, but from this deer capstone. A big part of the deer issue is deer/cars and cars hitting deer and people having accidents. The Pennsylvania Department of Transportation has a dataset and a subset that are deer-related. And, I had a student compile that, subtotal it by year. What is the level--just the extent of this issue--in the city of Pittsburgh? And, they report out a number. And then, I asked, 'What is that? And, how does it get into the PennDOT [Pennsylvania Department of Transportation] data? Does somebody have to be killed? Does the car have to be totaled? Is it picking up far less severe cases?'

Well, it turns out in this case, you only get in the data if it were serious enough that the police showed up and filed a report. And, in 96% of the cases, the car had to be towed away. So, imagine somebody just coming in, running with the data, totaling it, and going and reporting there's 25 deer/vehicle incidents in Pittsburgh in a given year. And then, being asked, 'Well, what's a deer-related incident?' and not having a good answer. Or worse, being confronted with this: 'Wait. Animal Damage Control reports that they picked up 600 deer carcasses last year.' And then, you've got two problems. You've got, One: You've created confusion with your statistics. Your statistics were supposed to add clarity. Now people are confused. Is it 600? Is it 25?

And then, the second problem is: If you don't have an answer to reconcile the two, you've got a credibility problem. Here you are coming in as the data guy, the data person, and now we can't believe anything out of your mouth.

And, the error, again, isn't--it's not that you calculated the total wrong. It's not that you manipulated the data. You just weren't thinking much about the data and the number you actually calculated. And, that would happen in CEA.
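[The point that two correct totals can disagree because they count different things can be sketched in a few lines. Everything below is hypothetical: the record fields, the `DeerIncident` class, and the dozen toy records are invented for illustration and are not PennDOT or Animal Damage Control data. The sketch only shows that the same underlying events produce different totals depending on the inclusion rule behind each source.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeerIncident:
    police_report: bool      # would appear in a crash dataset built from police reports
    vehicle_towed: bool      # severity marker
    carcass_collected: bool  # would appear in a carcass-pickup tally


# Invented toy records, not real data.
incidents = (
    [DeerIncident(True, True, True)] * 2
    + [DeerIncident(True, False, True)] * 1
    + [DeerIncident(False, False, True)] * 7
    + [DeerIncident(False, False, False)] * 2
)


def count(records, rule):
    """Count the records that satisfy an inclusion rule."""
    return sum(1 for record in records if rule(record))


# Each "statistic" is the same events filtered by a different definition.
definitions = {
    "police-reported crashes": lambda r: r.police_report,
    "carcasses collected": lambda r: r.carcass_collected,
    "any deer/vehicle contact": lambda r: True,
}

totals = {name: count(incidents, rule) for name, rule in definitions.items()}
for name, total in totals.items():
    print(f"{name}: {total}")
```

[None of the three totals is miscalculated. Reporting any one of them without stating its inclusion rule is what invites the "Is it 600? Is it 25?" confusion.]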

In fact, very early on, I met with Casey Mulligan, one of your prior guests. And, Casey is such a careful guy. And so, he was a bit of a quality control check on people. And, I gave him a memo with some numbers in it. And, I remember him asking me about one of the numbers: 'Is this one time or is this a flow?' It was a dollar value. And, it was embarrassing: I didn't have a good answer. It's $25 billion. It's $25 billion. One year? Every year? Basic question. I wasn't well-prepared to answer it. And, that was just a reflection of Casey being a careful, thoughtful guy and also aware of the audience.

The other thing that he did that was helpful: There were several numbers there, and there was a GDP [Gross Domestic Product] output number, and there was another number that was more of a welfare number. And, he's like, 'Let's just stick with the GDP number. The people we're talking to, they get output, they get production. They don't understand opportunity costs. Let's not distract them with this welfare number that they're not going to know how to reconcile the two numbers.' And, that was a thoughtful incorporation of: Who are we speaking to? What are they going to understand? So, it was both an interpretation of the number and knowledge of the audience that Casey was teaching me to be sensitive to, and I've taken that forward and now helping students interact with the city about deer.
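[The "one time or a flow?" question changes the meaning of a $25 billion figure by an order of magnitude. The arithmetic below is a minimal sketch: the 10-year horizon, the 3% discount rate, and the function `cumulative_value` are assumptions chosen for illustration, not details from the memo described above.]

```python
def cumulative_value(amount_billion, years, recurring, discount_rate=0.0):
    """Present value over `years` of a one-time amount or an annual flow.

    The first payment lands in year 0 and is not discounted.
    """
    n_payments = years if recurring else 1
    return sum(
        amount_billion / (1 + discount_rate) ** t for t in range(n_payments)
    )


# The same headline number under the two readings, over an assumed 10 years.
one_time = cumulative_value(25, years=10, recurring=False)
annual_flow = cumulative_value(25, years=10, recurring=True)
annual_flow_discounted = cumulative_value(
    25, years=10, recurring=True, discount_rate=0.03
)

print(f"one-time:                  ${one_time:.0f} billion")
print(f"annual flow, undiscounted: ${annual_flow:.0f} billion")
print(f"annual flow, 3% discount:  ${annual_flow_discounted:.1f} billion")
```

[Whatever horizon or discount rate is chosen, the two readings differ enough that a number presented without its time dimension cannot be interpreted.]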

Russ Roberts: That's awesome.

1:06:28

Russ Roberts: I want to close with a question about civic education, and understanding, and the political process. Alexander Pope said: "A little learning is a dangerous thing."

And, it has become popular in recent years to require high school students to learn statistics. You know: Statistics are everywhere. It should be part of basic education.

My guess is most of those classes are dangerous. A little learning is a dangerous thing. They get sort of a cookbook course, maybe. They don't get a very thoughtful class. And, they're taught by people who don't understand the complexity of randomness, the complexity of data, causation, multivariate issues.

And your book is an attempt to improve that. Your book is an attempt to help thoughtful people get a better understanding of complexity, of measuring things, and how those measurements should be interpreted.

It's interesting to me how little of that there is in the world. Part of that is because it's really hard to write a book like this. Most people who understand the concepts are unable to explain the concepts and therefore they can't do it well. And, part of it is because I'm not sure how much demand there is. Most people are more comfortable just being told what the truth is. They don't really want to look and see whether the support and evidence for it is reliable or not.

But, as thinking human beings, it seems to me that if we want to be civilized and educated, the book that you've written and books like it that are yet to be written should play an important role in being a fully developed person. Because, the world is a complicated place; and statistics are one way to access that complexity. React to that.

Jeremy Weber: Well, I appreciate the comment about the book. That is the hope. I talked about students and advanced students as being the audience, but really the audience is broader, and I would take it really anyone who has some basic statistical education and wants to think more carefully about statistics. So, it does have that broader audience in mind, and it does have that aim of thoughtfulness and the broader good that that can bring about.

As far as: is it a bad idea to teach statistics to high school students or at some--teach it poorly, I mean, anything taught poorly can be problematic because it's kind of worse than not having been taught it, because you go out thinking you know it, and so you are in a sense inoculated to the real thing. You've heard the prosperity gospel, and then you confuse it with the true gospel. And so, every time you use the word 'gospel,' you are thinking a certain thing and it's the exact opposite of the true thing. So, it's worse than being a blank slate.

That said, my experience in teaching introductory students--I mean, these are Master's students, but they're nonprofit-focused. They're policy-focused. They're not coming in because they're generally stat-focused people. They're taking this class because they have to. They're intimidated.

I have found that students, I think, generally do want to get it right. There is a side that's, like: 'Just give me the numbers to--I already know how the world works and what the policy should be, and now I just want the numbers to back it up.' But, I've also seen students get really excited about learning from statistics. That is, being surprised by them and having the confidence, having thought a little more deeply about the statistics that: Hey, this can't be dismissed. We have to deal with it. We understand the number, where it comes from, and we have to deal with it.

And, maybe they change their view or they come to appreciate: 'Oh, going back to the data, tethering our broader claims to observation about the world is a good, confidence-building exercise.'

And so, I do have a side of me that's hopeful that good statistical education can bring about better insight. More prioritization of main causes from tertiary causes. A greater understanding of what's a real problem and what's a problem that just is because people are talking about it, not because it's a real problem.

But, I'm fully aware of the challenge. And, things can go wrong very easily. So, I appreciate your skepticism. I share much of it, but I think there's no other way. You got to get in the statistics game and you got to do it better because somebody's going to do it, and they might as well do it well and thoughtfully.

Russ Roberts: My guest today has been Jeremy Weber of the University of Pittsburgh. He is the author of Statistics for Public Policy. Jeremy, thanks for being part of EconTalk.

Jeremy Weber: Thanks, Russ. It's been a pleasure.

How to Avoid Lying With Statistics (with Jeremy Weber)

John Ioannidis on Statistical Significance, Economics, and Replication

Don Cox on the Economics of Inheritance

READER COMMENTS

Tomi Lahcanski

Mar 5 2024 at 8:00pm

Ben

Mar 6 2024 at 10:02am

Lauren Landsburg

Mar 8 2024 at 6:39am

J Mann

Mar 8 2024 at 12:00pm

Ron Spinner

Mar 14 2024 at 8:35am

David Gossett

Mar 17 2024 at 8:38pm
