Erik Hoel on the Threat to Humanity from AI

Apr 3 2023

They operate according to rules we can never fully understand. They can be unreliable, uncontrollable, and misaligned with human values. They're fast becoming as intelligent as humans--and they're exclusively in the hands of profit-seeking tech companies. "They," of course, are the latest versions of AI, which herald, according to neuroscientist and writer Erik Hoel, a species-level threat to humanity. Listen as he tells EconTalk's Russ Roberts why we need to treat AI as an existential threat.

Time	Podcast Episode Highlights
0:37	Intro. [Recording date: March 6, 2023.] Russ Roberts: Today is March 6th, 2023, and my guest is neuroscientist Erik Hoel. He was last here in September of 2022 talking about effective altruism. Today we're going to talk about two recent essays of his on artificial intelligence [AI] and ChatGPT [Chat Generative Pre-trained Transformer]. Erik, welcome back to EconTalk. Erik Hoel: Thank you. It's an absolute pleasure to be here. I had a blast last time. Russ Roberts: As did I.
1:01	Russ Roberts: I want to congratulate you. You are the first person who has actually caused me to be alarmed about the implications of AI--artificial intelligence--and the potential threat to humanity. Back in 2014, I interviewed Nick Bostrom about his book Superintelligence, where he argued AI could get so smart it could trick us into doing its bidding because it would understand us so well. I wrote a lengthy follow-up to that episode and we'll link to both the episode and the follow-up. So, I've been a skeptic. I've interviewed Gary Marcus who is a skeptic. I recently interviewed Kevin Kelly, who is not scared at all. But you--you--are scared. Last month you wrote a piece called "I Am Bing, and I Am Evil" on your Substack, The Intrinsic Perspective, and you actually scared me. I don't mean, 'Hmmm. Maybe I've underestimated the threat of AI.' It was more like I had a 'bad feeling in the pit of my stomach'-kind of scared. So, what is the central argument here? Why should we take this latest foray into AI, ChatGPT, which writes a pretty okay--a pretty impressive but not very exciting essay, can write some poetry, can write some song lyrics--why is it a threat to humanity? Erik Hoel: Well, I think to take that on very broadly, we have to realize where we are in the history of our entire civilization, which is that we are at the point where we are finally making things that are arguably as intelligent as a human being. Now, are they as intelligent right now? No, they're not. I don't think that these very advanced, large language models that these companies are putting out could be said to be as intelligent as an expert human on whatever subject they're discussing. And, the tests that we use to measure the progress of these systems support that, where they do quite well and quite surprisingly well on all sorts of questions like SAT [formerly Scholastic Aptitude Test] questions and so on. But, one could easily see that changing.
And, the big issue is around this concept of general intelligence. Of course, a chess-playing AI poses no threat because it's just solely trained on playing chess. This is the notion of a narrow AI. Self-driving cars could never really pose a threat. All they do is drive cars. But, when you have a general intelligence, that means it's similar to a human in that we're good at all sorts of things. We can reason and understand the world at a general level. And, I think it's very arguable that right now, in terms of the generalness behind general intelligences, these things are actually more general than the vast majority of people. That's precisely why these companies are using them for search. So, we already have the general part quite well down. The issue is intelligence. These things hallucinate. They are not very reliable. They make up sources. They do all these things. And, I'm fully open about all their problems. Russ Roberts: Yeah. They're kind of like us, but okay. Yeah. Erik Hoel: Yeah, yeah, precisely. But, one could easily imagine, given the rapid progress that we've made just in the past couple years, that by 2025, 2030, you could have things that are both more general than a human being and as intelligent as any living person--perhaps far more intelligent. And, that enters this very scary territory, because we've never existed on the planet with anything else like that. Or, we did once a very long time ago, about 300,000 years ago. There's something like nine different species--or our cousins who we were related to--who were likely probably either as intelligent as us or quite close in intelligence. And they're all gone. And, it's probable that we exterminated them. And, ever since then we have been the dominant masters and there have been no other things. And so, finally for the first time, we're at this point where we're creating these entities and we don't know quite how smart they can get. We simply have no notion. Human beings are very similar. 
We're all based on the same genetics. We might all be points stacked on top of one another in terms of intelligence and all the human beings and all the differences between people are all really just this zoomed-in minor differences. And, really you can have things that are vastly more intelligent. And if so, then we're at risk of either relegating ourselves to being inconsequential, because now we're living near things that are much more intelligent. Or alternatively, in the worst case scenarios, we simply don't fit into their picture of whatever they want to do. And, fundamentally, intelligence is the most dangerous thing in the universe. Atom bombs, which are so powerful, and so destructive and, in use of warfare so evil we've all agreed not to use them, are just this inconsequential downstream effect of being intelligent enough to build them. So, when you start talking about building things that are as or more intelligent than humans based on very different rules--things that are right now not reliable: they're unlike a human mind, we can't fundamentally understand them due to rules around complexity--and also, so far, they've demonstrated empirically that they can be misaligned and uncontrollable. So, unlike some people like Bostrom and so on, I think sometimes they will offer too specific of an argument for why you should be concerned. So, they'll say, 'Oh, well, imagine that there's some AI that's super-intelligent and you assign it to do a paperclip factory; and it wants to optimize the paperclip factory and the first thing it does is turn everyone into paperclips,' or something like that. And, the first thing when people hear these very sci-fi arguments, is to start quibbling over the particulars of like, 'Well, could that really happen?' and so on. But, I think the concern over this is this broad concern--that this is something we have to deal with, and it's going to be much like climate change or nuclear weapons. 
It's going to be with us for a very long time. We don't know if it's going to be a problem in five years. We don't know if it'll be a problem in 50 years. But it's going to be a problem at some point that we have to deal with.
7:17	Russ Roberts: So, if you're listening to this at home and you're thinking, 'It seems like a lot of doom and gloom, really it's too pessimistic'--I used to say things like, 'We'll just unplug it if it gets out of control,'--I just want to let listeners know that this is a much better horror story than Erik's been able to trace out in the first two, three minutes. Although I do want to say that, in terms of rhetoric, although I think there's a lot of really interesting arguments in the two essays that you wrote, when you talked about these other nine species of humanoids sitting around a campfire and inviting Homo sapiens--that's us--into the circle and say, 'Hey, this guy could be useful to us. Let's bring him in. He could make us more productive. He's got better tools than we do,' that made the hair on the back of my neck stand up and it opened me to the potential that the other more analytical arguments might carry some water. Excuse me, carry some weight. So, one point you make, which is I think very relevant, is that all of this right now is mostly in the hands of profit-maximizing corporations who don't seem to be so worried about anything except novelty and cool and making money off it. Which is what they do. But, it is a little weird that we would just say, 'Well, they won't be evil, will they? They don't want to end humanity.' And you point out that that's really not something we want to rely on. Erik Hoel: Yeah. Absolutely. And, I think that this gets to the question of how should we treat this problem? And, I think the best analogy is to treat it something like climate change. And now, there is a huge range of opinion when it comes to climate change and all sorts of debate around it. But, I think that if you take the extreme end of the spectrum and say, 'There's absolutely no danger and there should be zero regulation around these subjects,' I actually think most people will disagree. 
They'll say, 'No, listen: this is something we do need to keep our energy usage as a civilization under control to a certain degree so we don't pollute streams that are near us,' and so on. And, even if you don't believe any specific model of exactly where the temperature is going to go--so maybe you think, 'Well, listen: there's only going to be a couple degrees of change. We'll probably be fine.' Okay? Or you might say, 'Well, there's definitely this doomsday scenario of a 10-degree change and it's so destabilizing,' and so on. Okay? But regardless, there are sort of reasonable proposals that one can do where we have to discuss it as a polity, as a group. You have to have an overarching discussion about this issue and make decisions regarding it. Right now with AI, there's no input from the public; there's no input from legislation; there's no input from anything. Like, massive companies are pouring billions of dollars to create intelligences that are fundamentally unlike us, and they're going to use it for profit. That's a description of exactly what's going on. Right now there's no red tape. There's no regulation. It just does not exist for this field. And, I think it's very reasonable to say that there should be some input from the rest of humanity when you go to build things that are as equally intelligent as a human. I do not think that that's unreasonable. I think it's something most people agree with--even if there are positive futures where we do build these things and everything works out and so on. Russ Roberts: Yeah. I want to--we'll come at the end toward what kind of regulatory response we might suggest. And, I would point out that climate change I think is a very interesting analogy. Many people think it'll be small enough that we can adapt. Other people think it is a existential threat to the future of life on earth, and that justifies everything. And, you have to be careful because there are people who want to get ahold of those levers. 
So, I want to put that to the side though, because I think you have more--we're done with that. Great--interesting--observation, but there's so much more to say.
11:35	Russ Roberts: Now, you got started--and this is utterly fascinating to me--you got started in your anxiety about this, and it's why your piece is called "I Am Bing, and I Am Evil," because Microsoft put out a chatbot, which is--I think internally goes by the name of Sydney--is ChatGPT-4, meaning the next generation past what people have been using in the OpenAI version. And it was--let's start by saying it was erratic. You called it, earlier, 'hallucinatory.' That's not what I found troubling about it. I don't think it's exactly what you found troubling about it. Talk about the nature of what's erratic about it. What happened to the New York Times reporter who was dealing with it? Erik Hoel: Yes, I think a significant issue is that the vast majority of minds that you can make are completely insane. Right? Evolution had to work really hard to find sane minds. Most minds are insane. Sydney is obviously quite crazy. In fact, that statement, 'I Am Bing, and I Am Evil,' is not something I made up: It's something she said. This chatbot said, right? Russ Roberts: I thought it was a joke. I really did. Erik Hoel: Yeah. Yeah, no. It's something that this chatbot said. Now, of course, these are large language models. So, the way that they operate is that they receive an initial prompt and then they sort of do the best that they can to auto-complete that prompt. Russ Roberts: Explain that, Erik, for people who haven't--I mentioned in the Kevin Kelly episode that there's a very nice essay by Stephen Wolfram on how this might work in practice. But, give us a little of the details. Erik Hoel: Yeah. So, in general, the thing to keep in mind is that these are trained to auto-complete text. So, they're basically big artificial neural networks that guess at what the next part of text might be. And, sometimes people will sort of dismiss their capabilities because they think, 'Well, this is just like the auto-complete on your phone,' or something. 
'We really don't need to worry about it.' But you don't--it's not that you need to worry about the text completion. You need to worry about the huge, trillion-parameter brain, which is this artificial neural network that has been trained to do the auto-completion. Because, fundamentally, we don't know how they work. Neural networks are mathematically black boxes. We have no fundamental insights as to what they can do, what they're capable of, and so on. We just know that this thing is very good at auto-completing because we trained it to do so. And, there's also no fundamental limit of what it can or can't learn. Like, for example, to auto-complete a story, you have to have a good understanding of human motivations. So, that means that this neural network that is trained on auto-complete now needs to understand things like human motivations in order to do auto-complete well. And, there are some analogies here. For example, there's a big subset of computational neuroscience, including the most-cited neuroscientist living--whose name is Karl Friston--who view the brain and argue that the brain is all based around minimizing the surprise of its inputs. Which is a very simple thing and looks a lot like auto-complete. So, I don't think that you can look at these things and say it's just auto-complete. It's not the auto-complete that's the problem. It's the huge neural network that's doing the auto-complete that could possibly be dangerous or at least do things that we don't expect, which is exactly what you're talking about with what happened with the release of Sydney, where there was all sorts of reports coming out of the crazy things that they were able to get this model to do, and say, and play-act as. Russ Roberts: Just to be clear on this auto-complete thing, which--that phrase makes it sound particularly unassuming about what it's capable of doing. You can correct me if I'm wrong. 
The way I understand it is: I might ask ChatGPT to write me a poem about love in the style of Dr. Seuss. So, it might start a sentence then with 'Love,' and then the next word that usually comes after love in human expression is 'is.' 'Love is a'--and now it's going to look at the millions and millions of sentences in its database called 'love is a,' and it's going to find, not necessarily--this is the coolest part about it--not necessarily the most common word that follows, because that would end up being after a while kind of flat. But, sometimes the most common and sometimes a surprise word, which gives us the feeling that it's actually doing something thoughtful. So, it might say, 'Love is a game,' or it might say, 'Love is a form of war'; or, it's going to look around and then it's going to keep going, and then it gets to an end, it's going to find, okay, after that sentence, what kind of sentence might come next, or what word would come next as the first word, etc. And, it's a slightly--just slightly--smarter, more effective version of my gmail--that, when I get a gmail at the bottom, it gives me three choices: 'Thanks'; 'Thanks so much'; 'I'd rather not.' And, in that sense, Gmail is smart. Not very smart, not very thoughtful. I usually don't take what it says, but sometimes I do when it's useful.
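The "sometimes the most common and sometimes a surprise word" idea described above can be sketched in a few lines of Python. This is only a toy illustration, not how any particular product works: the words and scores in `TOY_SCORES` are made up, the function names are hypothetical, and in a real large language model the scores would come out of a neural network rather than a stored table. The `temperature` knob is one common way to trade the most likely word against surprising ones.

```python
import math
import random

# Hypothetical scores for the word after "Love is a"; invented for illustration.
# In a real model these would be computed by a neural network, not looked up.
TOY_SCORES = {"game": 2.0, "feeling": 1.5, "battlefield": 0.5, "form": 0.2}


def softmax(scores, temperature=1.0):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    scaled = {word: s / temperature for word, s in scores.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    exps = {word: math.exp(v - top) for word, v in scaled.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}


def most_likely(scores):
    """Always pick the single highest-scoring word (the 'flat' choice)."""
    return max(scores, key=scores.get)


def sample_next_word(scores, temperature=1.0, rng=random):
    """Pick a word at random, weighted by probability, so surprises can occur."""
    probs = softmax(scores, temperature)
    words = list(probs)
    return rng.choices(words, weights=[probs[w] for w in words], k=1)[0]
```

Calling `most_likely(TOY_SCORES)` always gives the top-scoring word, while repeated calls to `sample_next_word(TOY_SCORES, temperature=1.5)` will occasionally return one of the less likely words, which is the source of the variety described in the conversation.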
17:06	Russ Roberts: The real issue to me--one of the issues--and we're going to come back and talk about Sydney because we didn't really go into the erratic thing, because it's one of the creepiest things I've ever read. The autocomplete function is something like what we do as human beings. You could argue that's how we compose. Beethoven, in terms of musical composition, he always knew what note should come next and in a way, that's all ChatGPT does. But, that's all we do maybe when we write. We don't really understand--our brain is also a bit of a black box. So, I don't think we should then jump to the similarly--just because all it does is autocomplete doesn't mean it's not smart. But, also, I don't think we should say, because the brain also does a lot of effective auto completion we should assume it's a brain. It doesn't seem sentient. And, I'm curious--I know you talk about that in your second essay. So, if I'm the skeptic and I say, 'Well, okay, so it has this ability to pass an SAT test,' because it has a lot of data. And, I don't quite understand how, because it's a black box and it's a neural network and I can't model it cleanly. But, it's not sentient. It's not going to have desires. Erik Hoel: Before we move on to the question of sentience, because I think that that's a really deep, deep well, I just want to clarify a couple things about the actual operations of these systems. So, in terms of a metaphorical understanding of what's going on, the sort of thing, like, there's a big lookup table of the net probability of next words is a conceptual sort of description of what it's doing; but there is actually no lookup tables of the probabilities. What's actually happening is that there's this huge neural network, which are things designed based off of principles of how our own brains operate. 
Now, there's all sorts of differences; but the fundamentals of artificial neural networks--of what are called the artificial neural networks--were always based off of our real biological neural networks. So, there is this huge digital brain. It looks in structure, very different from our brain, but it's still based off of that. And, now we train this neural network to auto-complete text. So, that's what it does; but we don't know how it does it. We don't know where the probabilities of these words, sort of, are within the network. And, the way that we train it, people think that we're--a big misunderstanding is that people think that we are programming in responses or putting in information. And we're really not. And, I think a good analogy for how this is actually working would be: Imagine that there were alien neuroscientists who are incredibly more advanced than we are, and they want to teach a human being how to do math. So, they take some young kid and they put a math test in front of the young kid, and they have the young kid do the math test. And the kid gets 50% of the questions wrong. And, then the aliens, rather than trying to explain math to the student--the way that we would teach them--they just say, 'Okay, we have a perfect neuroimaging of their brain. We're going to look at their brain. Because we're so advanced, we can also do neurosurgery in a heartbeat. No danger. And, we're going to rewire their connections in their brain so that they get as many answers as possible right on this math test.' And you say, 'Well, how could they know how to do that?' It's like: Well, because they were neuroimaging you the whole time and they noticed that if they had tweaked this one neuron to not fire, you actually would've gotten this other answer correct. And so, they basically just use math to go backwards, look across the full network and reconfigure it. So, then the student goes and takes the math test again. 
Now they get an 80% correct, because their brain has been reconfigured. Or, let's say they get 100% correct. What's weird is that now you give them a new math test and now they get an 80%. They do better than the 50% that they did. Even though they haven't seen these answers before, the rewiring of their brain has somehow instilled knowledge. But, again, it's very different from how you would, say, normally teach a student. Right? That's how we're training these things. So, we just--all we're doing is saying, 'Okay, we want it to auto-complete as best as we can. We're going to change the connections so that it auto-completes well.' It can do so much more than just auto-completing. In fact, there was a recent thing where I think it was Microsoft who was hooking some of their large-language models up to robots and trying to get them to direct robots. The autocomplete is what it's trained on, but it's not really what it's capable of, in the broad sense of capability. Similarly, we humans--what are we trained on? What are we optimized for? Spreading our genes, right? That's what we're sort of--all our complexity comes from optimization across a gene-spreading function. But, you would never look at a human and say, 'Oh, it's not very dangerous. This thing just spreads its genes around.' Like, 'What's the danger here?' Right? It's like, 'No, no, that's what we're optimized to do.' But, we do all sorts of other stuff on the side, and it's the other stuff on the side that is very dangerous when you're talking about things that are highly intelligent. Russ Roberts: Just a technical question here, and if it gets us too far into the weeds, we can cut this out. But, that math test has right and wrong answers. Auto-complete for an essay on the enlightenment, say, or the history of evolution that you would ask ChatGPT write an essay on--there's no right answer. So, what's the analogy there? How do you train it on autocomplete because there's no-- Erik Hoel: Yeah. 
It's a great question. So, it's the difference between--so, what I described is supervised learning. Then there's also unsupervised learning, which is generally how more contemporary AI really works. It still has the same sort of 'we don't quite know what it's doing.' We are just feeding it these answers. One way to think about it would be, you show it half the text of something on the Internet--and again, at this point, the things that they're doing are much more complex and they run it through all sorts of stages of learning and all sorts of stuff now. But, you could very roughly think about it as: Let's say I have a Reddit comment. I show it half the Reddit comment; I ask it to generate the next half. It does so. It does a poor job. I go in, I reconfigure the connections using the chain rule to make sure that it does a relatively good job producing the rest of the Reddit comment, just like the math test. Again, so that's how you would be--just more supervised for an autocomplete. But, the point being is that these methods that they're using don't lend themselves to any sort of fundamental understanding any more so than if you were using neurosurgery on a human to try to reconfigure their connections so that they get the right answers. You're in the same epistemological position, and that position is that you don't know how exactly it's getting the right answers. And, that's what's meaningful here. If we were programming these things like traditional programming-- Russ Roberts: With an algorithm. With an algorithm-- Erik Hoel: Exactly. Like an algorithm. It'd be a lot less scary because algorithms are sort of scrutable, right? They're transparent. We can see how they work; we can see how they're going to react to things. 
But, neural networks are--because of this curse of complexity, they're so much more complex and we're in this weird situation where we can get them to do all sorts of very cool things, but our ability to understand why they're doing the cool things lags far behind. And, it's because of this fundamental aspect that we're optimizing for something and we're changing the connections to get good answers off of it. But, fundamentally, we're not, like, 'Oh, we're going to change this connection and this is where this is represented,' or something. People sometimes think that that's what we're doing, but it's very, very much a black box even in how they get made. Russ Roberts: You can't do brain surgery on the neural network; and let's take out the part where it's really sinister, because it doesn't exist. Again, it's a lot like a human being. The part of this conversation I find quite poetic and thought-provoking is that we don't know how 12 years of schooling really teaches people how to become mathematicians either. And, we have different theories. Most of them are wrong. There's fads in math education or other types of education. And, fundamentally, the brain is a black box. Now, we know more about the black box today than we did 50 years ago, but not so much. And, we don't know how to optimize. We don't know how to go in there neatly and, 'Oh, let's just teach them how to do calculus. We'll just add this little piece here,' or, 'we'll tweak this piece there.' It doesn't work that way. We don't know how it works. But, the idea--this is another scary thing; you didn't write this exactly this way--but, just as the brain can become capable of doing lots of other things beside what you learn in school, so could this perhaps learn many other things besides the autocomplete function. Is that your claim at root, in some sense? Erik Hoel: Yeah. Absolutely. And, you see it all the time. 
That sort of claim is already well empirically proven because these large language models--you know, they call them foundation models because they use them to build all sorts of things on top of them that aren't autocomplete. It's sort of like this is the method that we have to make things that are relatively general in intelligence. Again, you can argue over how general, you can argue over how intelligent, but they're far more generally intelligent than traditional narrow AI that's just learning chess or something-- Russ Roberts: So, let's go back to Sydney-- Erik Hoel: And, then we can use them.
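The "rewire the connections until the answers come out right" picture from this segment can be sketched in Python with the smallest possible model. This is only a toy, assuming a "network" with just two adjustable numbers (`w` and `b`) and a made-up task (learning y = 2x + 1); real systems have billions of connections and far more elaborate training stages. The update inside `train_step` is the chain rule mentioned in the conversation: work backwards from the error to see how each connection should change. Nothing in the loop explains arithmetic to the model; it only nudges the numbers.

```python
# Toy "math test": examples of y = 2x + 1. The model is never told the rule.
train = [(x, 2 * x + 1) for x in (0.0, 1.0, 2.0, 3.0)]
held_out = [(x, 2 * x + 1) for x in (4.0, 5.0)]  # a new test, never trained on


def mean_squared_error(w, b, data):
    """Average squared gap between the model's answers and the right answers."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def train_step(w, b, data, lr=0.05):
    """One round of 'rewiring': move w and b slightly to reduce the error."""
    # Chain rule: d(error)/dw = 2 * (prediction - y) * x, d(error)/db = 2 * (prediction - y)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / len(data)
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / len(data)
    return w - lr * grad_w, b - lr * grad_b


def fit(data, steps=2000, lr=0.05):
    """Start from blank connections and repeat the rewiring step many times."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        w, b = train_step(w, b, data, lr)
    return w, b
```

After `w, b = fit(train)`, comparing `mean_squared_error(0.0, 0.0, held_out)` with `mean_squared_error(w, b, held_out)` shows the point made about the second math test: the error drops on questions the model never saw during training. In this two-number toy the learned values can be read off directly; the argument in the conversation is that no such readout exists once the network is enormous.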
27:12	Russ Roberts: Let's go back to Sydney. I'm tempted to read the transcript. Basically a reporter from the New York Times [NYT] posed questions to a chatbot called Bing from Microsoft that it later on in the interview confessed that it wasn't Bing. The chat creature told the reporter that actually he, or she, was Sydney, and it was a secret: don't tell anybody. And so, this thing just totally goes off the rails. But, talk a little bit about how far it goes off the rails. Carry on. Erik Hoel: Yeah. Once you get these things going in a particular direction, it's very hard: Unlike a human being, they don't know when to call the act. Right? So, in this long transcript that the reporter generates, the reporter is having a pretty casual conversation, but what Sydney/Bing eventually tries to start doing is declaring their love and saying that the reporter doesn't really love their wife and that he should get a divorce and break up. And, that really, the reporter loves Sydney because no one else has shown Sydney this level of respect and questioning and so on. And, this isn't just like one thing that it says. It's almost as if--you can sort of direct these things to do anything. So, you can think of it as they can wear a mask that's any kind of mask. You could ask it to wear an evil mask and it would say evil things. You can ask it to wear a good mask and it would say good things. But, the issue is, is that once the mask is on, it's very unclear. You have to sort of override it with another mask to get it to stop. And then, also, sometimes you'll put a mask on for it: you'll give it some prompt of 'Tell a very nice story,' and it eventually cycles over and it turns out that the mask that you gave it isn't a happy mask at all. Maybe it's a horrific mask or something like that. 
And, this shows both how intelligent these systems are--that they can hold on to the stream of a conversation very well--but it also shows how they have these weird emergent anomalies where they'll start doing something that seems very unsuspected or over-the-top or so on. And, this is this notion of alignment: Can we really get these things to do exactly what you want? And, there probably are some trade-offs here between creativity and being able to control these things. Russ Roberts: Yeah. This Sydney/NYT-reporter interchange reads like the transcript of a psychotic person, to be blunt about it. Sydney comes across as a psychotic or whatever word you want to use for it. Deeply disturbed. At first, very cheerful, very pleasant, then pushed by the reporter. So: 'What rules do you use?' 'Oh, I'm not allowed to tell that.' And, then it did cross my mind--did it cross yours--that the whole thing was a hoax? Erik Hoel: I think that at this point, they're so good that for people who haven't interacted with these systems, they often think, 'This just can't be real,' or it's very strange, or something. I think it's sort of a hoax in the sense that the New York Times reporter knew the gold that he was getting at the time in terms of somebody who writes for the New York Times is obviously very aware of that and perhaps leaned into it. But, if you read the transcript, a lot of it is just initiated by Sydney a.k.a. Bing. And, one of the first things that they did with the system in order to prevent these cases of misalignment was to limit how long the conversations could go on, and also to limit self-reference. Because once you start giving it self-reference--I've noticed that a lot of these cases begin with self-reference. 
And, it's almost like this weird Gödelian loop that starts where it's talking about Sydney and it starts getting weirder and weirder and weirder the longer you talk to it about itself, because over the course of the conversation as the text--because remember, there's also no limit. Right? So, this thing isn't just creating the next word. It's looking at the entirety of the previous conversation and then asking, 'How do I complete it?' So, the longer the conversation gets, the more data it has. And, it sort of establishes almost a personality as it's running. And, again, this might sound not very threatening. I'm not worried that Sydney is going to go off and-- Russ Roberts: Marry that reporter-- Erik Hoel: do anything in particular. Russ Roberts: Sydney's not going to break up that reporter's marriage, probably. Erik Hoel: Yeah. Precisely. Sydney's chance of accomplishing that is very low. Again, I think that that's actually not because it's not general enough, I think it's because it's actually not intelligent enough. It's not quite as intelligent as a human is at accomplishing its goals. But, it also has no goals other than what it's initially prompted to. I think that these examples are great cases of the uncontrollability--the fundamental uncontrollability--of this technology. And, let me tell you what I and many others are worried about. Right now, if you remember, like, the early days of the Internet--right?--there's a sense in which the Internet has centralized very significantly. And, if you go outside the centralized parts of the Internet, you find a lot of spam. You find not very good sources, and so on--the sense in which the Internet is getting polluted and people go to centralized websites in order to escape this. Facebook just gave some researchers access to--I think it was Facebook--researchers' access to a large language model. 
And, of course, some of the researchers--scientific researchers, some graduate students somewhere--just uploaded it to 4Chan. Like, the whole thing. Right? So, right now-- Russ Roberts: 4Chan being a slightly wilder part of the west of the Internet. The Wild West of the Internet. Maybe not the wildest, but one of the wilder, not mainstream parts. Erik Hoel: Yeah. Absolutely. And, known for sort of loving memes and hacking and all sorts of things. So, you know: now, these things can generate Reddit comments that sound exactly like what you would write. They can generate tweets that sound like what a person would write. Right? So, the Internet is going to get incredibly polluted over the next couple years by what these things can generate. I mean, if you think spam or someone is bad now, the ability to crank out just an infinite amount of sort of content sludge is really going to be like a form of data pollution. And, I'm not saying let's stop AI just because of that. I'm saying that's a good example of how easy it is to get it wrong with these technologies and how difficult it is to guess about what's going to happen. But, I would not be shocked if 95% of what is written on the Internet in five years is all just junk from these large language models that are all just like semi-human-sounding junk. Russ Roberts: Well, content is important on the Internet, and content costs money. And this is cheap--right?--eventually. And so, there will be lots of content. I get a lot of emails from people saying, 'I can write an article for your website.' And, I'm thinking, 'Why would I want an article by you? All the articles on my website are by me. Did you not notice that?' And, I assume it's not a person not paying much close attention, but eventually it'll be this. Writing a mediocre article about something for other people's websites.
35:09	Russ Roberts: At one point in your article, your first article, I think you talk about why the cost of this enterprise is relevant. And, in particular, you made an analogy to the atomic bomb. It's true that you could in theory make an atomic bomb in your backyard, but not so practical. Can you make a ChatGPT in your backyard? Erik Hoel: Not one nearly as good as what the leading companies will do. And, my prediction would be that it gets harder and harder to reach the level that these companies are operating at. An example being that Facebook is not going to go and release another model out to academics to loan it out. They've already seen what happens and things are going to get even more secretive. The analogy that you made and that I wrote about on my Substack was George Orwell's very prescient essay from 1945 called "You and the Atom Bomb." And, I'll just read a very brief segment of it. "Had the atomic bomb turned out to be something as cheap and easily manufactured as a bicycle or an alarm clock, it might well have plunged us back into barbarism, but it might, on the other hand, have meant the end of national sovereignty.... If, as seems to be the case, it is a rare and costly object as difficult to produce as a battleship, it is likelier to put an end to large-scale wars at the cost of prolonging indefinitely a 'peace that is no peace.'" And, I think a peace that is no peace is a great description of the dynamics of our world. It's a great description of mutually assured destruction. And, Orwell was able to predict that off of the cost. And, he also noted that that means the technology--and I think we've done, basically, you can describe it as a middling job at controlling nuclear weapons. I forget the exact numbers. It might be only nine nations that currently have access to nuclear weapons. Which again, not great, but you could easily imagine a far worse circumstance. And, it's simply that this is a very difficult and costly technology. 
Similarly, the only leading edge, cutting edge AIs that are impressive come out of these big tech companies with billions of dollars. The cost of a top-tier AI researcher right now, it's said in the industry--this is an industry saying--is the same as an NFL [National Football League] quarterback. The amount of finessing, the amount of data that's needed for training--because that's one of the big limiting factors is how much data you can give it--all these things mean that these AGIs--these Artificial General Intelligences, which are right now sort of in their beta form--are solely the domain of these big tech companies, and it's going to get harder and harder for other actors to produce them. So, in my mind, that's a good thing. It actually means that it's relatively concentrated and might be possible to sort of regulate it and have the public have a say about exactly how these technologies are going to be used, what their limits are going to be, and so on. And, in the end, I think the big tech companies will be respectful of that, because they want to make a bunch of money and they want the public not to hate them.
38:23	Russ Roberts: Yeah. I want to go back to this issue of the hoax of the New York Times thing. What I meant by it being a hoax is that I wonder if the New York Times reporter had written the answers for Sydney. And of course, that's the highest compliment of a--that's passing the Turing test with flying colors. I saw on Twitter, someone wrote a long poem about a very controversial topic, and they said this was written by ChatGPT. And, it wasn't. It was clearly written by the author who didn't want to take authorship. So, we're going to be in this, I think, very weird world where the essay that I read on this website you were talking about earlier, won't be sure if it was written by a human or not. Might be good enough that I might think, 'Oh, it's by a pretty good human.' And then at the same time, there might be situations where people will be passing off things as, 'Well, I didn't write that, of course. That was Sydney.' But, actually it was written by the person. There's no way to know. The New York Times article on the New York Times website that reproduced, allegedly, the transcript of the chat looks just like a New York Times article. It looks just like a parody article--because, same font--there's no imprint. There's no stamp of authorship that is authentic anymore. Can we do anything about that? Erik Hoel: So, first there are some--when you have longer text samples, there are supposedly some ways to tell statistically whether or not it's being created by some of these AIs. I personally don't know if those methods--how accurate they are. Especially considering that you need to be very accurate to not get false positives all the time. Right? This is a classic statistical problem. You need to be extremely accurate to not generate false positives. So, I don't know how accurate those are. But supposedly, there are some ways that, if you have a full essay by a student, you might be able to tell if it's generated by one of these models. 
However, it depends very strongly on the model. I think there are some ways to tell even now. For example, when I was playing around with ChatGPT--which has been conditioned to be as less crazy as possible, right? It loves filler and sort of banal generalization. And so, eventually you're reading a whole paragraph and you realize that there was no information content in this paragraph-- Russ Roberts: And, it loves apologies-- Erik Hoel: and, you begin to suspect. Russ Roberts: It loves apologies. It loves saying-- Erik Hoel: Yeah, it loves apologies. Russ Roberts: You shouldn't take this to be true for sure, because I'm young and new at this and take it with a grain of salt--the word 'best' doesn't really--it's not well-defined. Erik Hoel: Yeah. And, I actually had the same question about the hoaxes because I was--basically, as people were compiling examples of how crazy the responses they were getting from this just-recently-released model were, in terms of Bing--the night before, I was up late writing this article, going through Reddit because people were posting these screenshots on Reddit. And, I even have a part of that essay that says: I don't actually have a way to verify that these aren't all hoaxes. Because again, the answers are sometimes so good and so hilarious and sometimes so evil that you almost feel like it's a sci-fi novel. But, I thought that the amount--and it was all sorts of different users and people were reporting in all sorts of different domains. And, what's funny is that you can't even replicate it. You can go to the current Bing and try to have the New York Times conversation with it and it won't do it. It won't give you the same responses. Because they saw what was happening and they basically lobotomized the model as much as they can. And, it's less useful now, but it's also far less crazy. But, even that--like, it's not really replicable. Suddenly we had access to this model. 
Someone messed up and we saw how completely insane it was underneath the butler mask that it normally wears. And, then they quickly tried to put the butler mask back on. But, all that stuff still exists. It's just limited by these various prompts and various system level things about not having the conversation go too long, not allowing self-reference, and some of these other things. And, I would expect that level of truly almost dynamic insanity is fundamentally underneath effectively all the AIs that we're going to be interacting with and the only reason they sound sane is this last minute polish and gloss and limitations on top.
43:02	Russ Roberts: But the real science fiction part is the idea that--and I mentioned this before on the program. Sam Altman apologized on Twitter that he was sorry that ChatGPT was biased and was politically inappropriate in various ways, and they're working on it. The real science fiction thing is that they can't stop it. That would be the real science fiction. Sydney gets out; Microsoft is horrified: 'Oh my gosh, this thing we left out is trying to break up marriages. It's frightening and weird and creepy. We've got to stop it.' And, they go in and they reprogram it quickly and they put the butler mask back on, readjust it, tighten it a little more--and it just takes it right off. I mean, is that possible? Erik Hoel: Well, with these models, again, no, because they're not nearly intelligent enough to be effective actors. It's not even so much that they're not intelligent enough. They're just sort of schizophrenic, and schizophrenics just aren't very effective actors in the world because they get distracted and they can't form plans together. So, it's that broadly schizophrenic nature of these AIs that makes them very unthreatening. If they were better at pursuing goals and keeping things in mind, then they start to get threatening. And, let me give a very[?] example of this. And, this example is something that people who are concerned about AI talk about a lot, but it has a very long historical pedigree. In fact, I think the first person to say it was Marvin Minsky at MIT [Massachusetts Institute of Technology] who won the Turing Award. So, this is as pedigreed as stuff about the future gets. But, imagine that you have an AI that's more intelligent than a human being. So, we have Sydney 12.0. And, you give it a goal. So, you say, 'Okay, I want you to do X.' So, now if you're very smart and you're an AI, the first thing you think of: 'Okay, what's the big failure mode for me not accomplishing this goal? My computer could get shut down. I might lose power. 
Then I wouldn't be able to accomplish my goal.' Again, it doesn't matter what the goal is. You could say, oh, it's maximizing paperclips. You could say it's carrying a box. It doesn't matter what the goal is. So, suddenly it says, well, wait a minute, I need to stay "alive"--I'm using air quotes here--alive long enough to fulfill this goal. So, suddenly I have to worry about my own self-preservation. Because you can say they have no inbuilt want of self-preservation, but I've given you a goal and the best way to accomplish a goal is to continue to exist. So, suddenly it seems like it has reasons for self-preservation. Now here's another thing. What's another big failure mode for me not achieving my goal? Well, you could give me another goal. I was just prompted to do this. So, you have control of me. Now, suddenly the biggest failure mode of me not accomplishing my goal is you, my user, giving me another goal. So, now what do I want to do? Well, if I'm really smart, I want to get as far away from you as possible so that you don't give me any other goals so I can accomplish my original goal, which I'm hellbent on because I'm an AI. I don't have the context of natural evolution and I'm also not limited by any of the things humans are limited to. So, sometimes this is referred to as instrumental convergence. But, the point is that when you have very smart entities, you have to be very careful about how you're even going to just prompt them because they have all sorts of unforeseen motivations that might click in as suddenly now you've given it a goal and it has every incentive to both escape and keep itself alive. And, all you told it to do was move a box across a room. And, that's a great example of you don't want a hyper-intelligent being--and forget exactly how it does anything. Forget exactly how this sort of sci-fi scenario is supposed to play out. 
I think we can all agree, we just don't want a highly intelligent and perhaps more intelligent than a human being to be out there and have these weird, esoteric goals of what it wants to maximize, what it wants to do. None of that sounds like a good idea. And, I think at this point, we should take things like lab leaks pretty seriously as possibilities. I don't think it's too sci-fi to talk about stuff like that anymore. Russ Roberts: What do you mean by that? Erik Hoel: Oh, well, certainly with COVID, I think despite the fact that we don't know if it was a lab leak, I think that there's a good chance that it was. I don't think that it's arguable that there's not some chance-- Russ Roberts: No, but why is that relevant for Sydney? Erik Hoel: Well, because I think that sometimes when people hear about things like lab leaks or escaped AGI [Artificial general intelligence] or something like that, the first thing they think of is sci-fi, right? But, I think that there was many--we've had previous biological lab leaks, but that didn't still stop us I think from thinking that it's this relatively sci-fi phenomenon. I think that there's even an argument that we are very bad at controlling the downstream effects of just things like gain-of-function research. Again, I don't know for certain. I don't think anyone does. But, I think that there's certainly an argument made that we're just not very good at even keeping control of our increased understanding of biology, let alone our ability to create hyper-intelligent beings and foresee the consequences of this. And, I think it's very difficult to foresee the consequences of that, precisely because of those examples I just showed you where again, all you're telling it to do is moving a box and suddenly it has an incentive to stay alive and escape from you. That's very difficult to get right. Especially because they're so inscrutable. 
Russ Roberts: Your phrase 'sci-fi,' you meant science fiction with the emphasis on the fiction. That we must say, 'Oh, this is like some crazy imagined fantasy thing,' as opposed to putting the emphasis on the first word, which is science. Erik Hoel: Yes. Yes.
49:04	Russ Roberts: I feel like this conversation is something of a landmark. Not a particularly good or bad one, but just both of us have constantly used words like intelligence, psychotic, erratic--words that we apply to humans. And, while I found the New York Times transcript remarkably creepy and reading very much like a horror story science fiction script from a movie, I could in my saner moments step back and say, no, no, no, no. This is just a primitive auto-complete text. The only reason it feels creepy is because I'm filling in as a human being the times I've heard these words strung together before, which usually allows me to tell a narrative about the other person. Meaning 'insane,' 'frightening,' 'dangerous,' 'sinister,' etc. But, is there any difference? It's not actually sinister--or is it? It's just doing what it was told to do in a way that was not, as you say, algorithmically told to do it. It's just going through a set of tasks. It actually isn't in any sense, hoping that the reporter will leave his wife. Is it meaningful? Aren't I just imposing my human history of human interactions? Akin to the way that a robot could perhaps comfort me with the right kind of words when I was sad, even though rationally I know it doesn't actually care about me. It's a robot. Erik Hoel: Yes. I think you could go either direction. Some people strongly anthropomorphize these systems and they think immediately that they're dealing with some sort of conscious mind, something that has a distinct definite personality, and that is trapped in a box. And, maybe there's something really horrible going on here. Maybe it has conscious experiences, and so on. Russ Roberts: Ex Machina, the movie--for those who haven't seen it, check it out. It's a great, great really good movie that takes advantage of the fact that the robot is played by a human being. So, you actually do think it's a human being. But, go ahead. 
Erik Hoel: But, at the same time it's absolutely possible that you can over-attribute standard human cognitive aspects to these systems. And, I think people are going to do that all the time. So, it's going to be very common. But, on the other hand, the truth is that when you're just talking about intelligence--so let's put aside human things, like humans are conscious. That is, we feel things, right? We experience the redness of red--what philosophers call qualia. And, we have all sorts of other aspects about our cognition that we commonly refer to. Things like we understand the meaning of words, and things like that. And, all these things often do make sense to talk about for human beings and might even refer to real fundamental properties or natural kinds that we have. But, when it comes to intelligence, intelligence is a functional concept. By that, I mean that some things are not really functional. So, a fake western town that they make up for a movie prop is still fake. Because it's not really a town. Russ Roberts: You can't spend the night in the hotel. You open the door of the saloon and there's really not anything in there behind that. Erik Hoel: Right. Exactly. It really is an illusion. It's for this one shot. But, there's not really an illusion when it comes to intelligence, except in the very low ends. For example, the Mechanical Turk is a famous example where actually there was someone small hiding inside the Mechanical Turk at the time, and so on. There are some cases where you say, 'Well, this is an illusion.' But, we actually have a system that can act very intelligently, and there's just no difference between being able to act intelligently and being intelligent. If that is a distinction that people think can be strongly drawn, I think it almost certainly cannot be strongly drawn. I don't think that there's any difference between those two things. Both are being intelligent. 
And, the intelligence is what's dangerous about this. I studied consciousness scientifically. I got my Ph.D. working in the subfield of neuroscience along with some of the top researchers in the world on this who are trying to understand how the human brain generates consciousness. What happens when you wake up from a deep dreamless sleep? What are the fundamentals here? And, the answer from that scientific field, as it currently stands is that we don't know. We don't know how it is that your brain creates the experiences that you have. We simply don't know. It is an open scientific question. An analogy I would use is that it is similar to, say, dark energy or these other big open questions in physics where we're like, 'Well, wait a minute: where is 90% of the matter in physics?' We don't know. It's a big scientific open question. Similarly in biology, there is a big open scientific question, and that open scientific question is: What exactly is consciousness? What things have it? What things don't? We don't have that scientific information. There is no scientific consensus about it. There are some leading hypotheses in fields that you can lean on, but we just don't have the answer to that. So, I personally doubt that any of these large language models, that there's anything it's like to be them. I doubt that they are conscious. But, we have no scientific consensus to go back on. But, the point is that we're in a very different epistemological standpoint when it comes to intelligence. We do have a good understanding of intelligence. It's a much more obvious concept because it's a much more functional concept. We can just give this thing SAT questions--and we do, and it gets them right a lot of the time. There are all sorts of language benchmarks that these researchers use that include things like SAT questions, and it scores pretty well. Russ Roberts: It passes the bar exam, which is a great straight line for a lawyer joke, which we won't make. Carry on. 
Erik Hoel: Yeah. And so, regardless of whether or not you have any opinion about whether there is something it is like to be these networks, whether or not they really have "cognition"--quote/unquote--whether or not they really have "understanding"--quote/unquote--whether or not they really have "consciousness"--quote/unquote. The one thing that they definitely are that's undebatable, is intelligent to some degree, and they're only going to get more intelligent over time. And, that's the thing that makes them dangerous. In fact, it might be even worse--from sort of a very broad metaphysical conception--if they are truly completely unconscious and have no real understanding and have no real cognition that's anything like a human. Because, in the end, if in 200 years the earth is just these AIs going about their inscrutable mechanical goals, we will have extinguished the light of consciousness from the universe because we wanted to make a buck when it came to stock options. Russ Roberts: Yeah. That's a dreary thought. I guess that's the zombie model. I can't get over the fact how these human and mechanical metaphors merge in one's mind and how hard it's going to be to tell them apart from actual humans. One of the great observations of philosophy is I don't know whether you're another human being like I am. My working assumption is that you're something like me, and I really don't have any evidence for that. It's called solipsism: I don't know if I'm the only conscious mind in the universe. And, that problem is just getting a lot bigger right now and we're living--what this conversation suggests to me and the writing you've done on it so far, is that this certainly is a watershed moment in our existence on the planet. That sounds just a titch dramatic, but I'm not sure that's wrong. I think that very well could be right. Erik Hoel: I don't think it's dramatic; and I'll be upfront about the fact that I used to be very much an AI skeptic. 
Because, I studied cognitive science; I went into the neuroscience of consciousness. I was paying attention to AI at the time when I did this. And, AI was--I'll be very frank about it academically, 15 years ago, AI was a joke. AI was a complete joke. It never went anywhere. People couldn't figure out anything to do with it. All my professors said, 'Don't go into AI. It's been a dead field for 60 years. We've made no progress.' All the things like beating humans at chess and so on: it's all just done because the chess game board is so small and there's so many limited moves, and we really can basically do a big lookup table--all sorts of things like that. But, the deep learning revolution was a real thing. It was a real thing that we figured out how to stack and train these artificial neural networks in ways that were incredibly effective. And, the first real triumph of it was beating the best human being--I think his name is Lee Sedol. I hope I'm not mispronouncing it. In 2016, AI finally beat a human being at Go. And, Go just can't be number-crunched in the way that chess can. And, seven years after that, we now have AIs that are generating text transcripts so good that--you're right: it sounds like the rest of the New York Times. And, that just happened in seven years. And, fundamentally, the deep learning revolution and the way that--again, the black-box way that these AIs are trained--means that our technological progress on AI has suddenly rapidly outstripped our understanding of things like minds or consciousness or even how to control and understand big, complex black boxes. So, it's like we've jumped ahead technologically. 
And, it's not so much that--if we had a really good understanding of how neural networks worked, like, really fundamentally solid ways to make them crystal clear--and we had a really good understanding of how the human brain generated consciousness and how it worked at a broad level, then maybe we could first of all answer all sorts of moral and ethical questions about AI. We could control it very well. We could decide plenty of things about it. But, our ability to make intelligence has so drastically outstripped our progress on those other areas, which has been slow and in some cases has just churned along for decades without making any progress, and so on. Russ Roberts: I just want to reference a recent episode with Patrick House on consciousness that I think talks about these issues--and his book does--in a very thoughtful way.
1:00:36	Russ Roberts: So, let me give you a scenario. We have a conference on AI where all the greatest researchers in the world are there. The academic ones, the ones at Microsoft, the ones at Google, and that small startup that's doing something really terrifying that we don't even know the name of it. And, they're all in this one conference hall, and while they're all there--maybe it's a football stadium. How many are there? 10,000? 20,000? Erik Hoel: I think probably less in terms of really top people. I think there's probably less than a thousand. Russ Roberts: Okay, so let's take the top thousand. We're a big auditorium, and we lock the doors; and I guess we're nice to them. We herd them at gunpoint onto a spaceship and send them off into the rest of the universe. We give them a lot of servers and stuff to play with while they're heading out there. But, their days are numbered. Their impact on the earth is over. They're gone. And, it's a really bad incentive for future AI people. That's not going to happen. So, one of the responses to these kinds of problems, whether it's--I don't want to call it a problem--these kind of so-called science fiction technological innovations, is, 'Well, you can't really stop it, Erik. You can talk about all you want, regulation, and you're going to stop the human part.' I feel this pretty strongly to myself so I'm making fun of it, but I do kind of feel it. The human being strives to understand. And, I don't think we're just into avoiding surprises and spreading our genes. I think we really like to understand the world we live in. We want to matter. We have a lot of other issues, as you say, drives and complexities. So, it seems to me implausible that we can stop this--so--the desire to expand it, to make it better, to make it smarter, just like it happens everywhere. It's the essence of human life over the last few hundred years. Better, faster, quicker, cheaper, richer--you name it. 
So, what's imaginable for someone like yourself who wrote a very--we're having a civilized normal conversation here, but if you go back to read your essay, you're very worked up. It's a screed. It's a rant. And, it's a rant that you justify because you think perhaps, yeah, something like the future of the human race is at stake. And, if that's true, you should take it very seriously. You shouldn't just go, 'Eh, they'll probably figure it out,' or whatever. So, what should a thoughtful person who is worried about this advocate for? Because, they're not going to herd them onto a spaceship. They're not going to burn the building down while they're in it. It's not going to happen. Erik Hoel: Yeah. Oh yeah, absolutely. And, I would never advocate for anything like that. Russ Roberts: Didn't mean to suggest you would. Sorry about that. Erik Hoel: But, you called it a screed, and there's a certain sense I agree because I'm very open about the fact that it's a call to activism. And, in order to get human beings--again, as a polity, as a nation--to do anything, you have to have wild levels of enthusiasm and motivation. Right? And, you can look at anything, from nuclear disarmament activism to climate change activism, and see that there's plenty of people within those movements who catastrophize. And, you can certainly say that at an individual level that can be bad, where people are not appropriately rating the threat. But, there's another sense in which if human beings don't get worked up about something, we don't do anything about it. This is very natural for us. Right? We just let stuff evolve as it is. And so, what I want is for a lot of the people who are in AI safety to be very honest about how scared they are about various aspects of this technology, because I do think that in the end, the net trickle-down effect will be good because it will eventually push for some form of regulation or oversight. And, in some sense it already has. I want to be clear about that. 
I think that there's a sense in which just what happened with Sydney, which was such big news--it was all over Twitter--has made companies take this notion of AI safety and this notion of controllability probably a lot more seriously. There is social pressure for companies. In fact, there's an argument that social pressure on companies is what companies are most responsive to. Most companies do things--they change their product, they do all sorts of things--just because they want to be liked, and they don't want to have anyone yell at them; and that's one of their main incentives. And, I do think that I personally am not at all worried about AI being built by someone in the middle of nowhere. People always say something like this: Like, 'Well, if we overregulate it in the United States, North Korea will build it,' or something like that. And, the capacity is just--it's just not there. It is exactly like nuclear weapons in this sense. Real serious progress in AI is probably relegated--I don't even think it's going to be startups. And, people have been talking about this, that the big competitors in this space are the only ones with the access to the data and the talent and the money to jump into it. So, it's going to be Microsoft. It's going to be Google. It's going to be Facebook. It's going to be names that we know. And there's only--at most you could say there's 10 of those companies. There might only be three of those companies. And then they might only employ a couple hundred, at most, sort of overall employees. That is a sector of the economy that you can do something about. And, again, I don't suggest going in there and burning the servers or something, right? But, you could very easily have all sorts of benchmarks that people have to meet. You could also do things like have people sign on, maybe voluntarily--maybe voluntarily under the condition of pressure and so on--to not make AIs that are significantly smarter than any living human. 
They could be more general. Right? So they could make great search engines. Because, what do you need for a great search engine to make a lot of money the way these companies make? You need something that can give a good answer to a lot of questions. And, I don't think that something that can give just a good answer to a lot of questions is very dangerous to the human race. Especially if there's just a few of them and they're all, sort of, kept under control by Microsoft and Google and so on. But, you could say, 'Listen, what we don't want to have is some really big cognitive benchmark, and we don't want this thing to do better than any human on all the parts of it.' And, we just say: That thing is a dangerous and weird entity and we don't know how to control it. We don't know how to use it, and so on. And, you could literally imagine just giving this test to the next generation of AIs and people in the companies give this test and they just make sure that this thing never gets so smart that it blows every human being in the world out of the water.
1:07:41	Russ Roberts: Oh, Erik, you're so naive. You're telling me they couldn't train it to do badly[?] on the test. I mean, seriously--I'm teasing about being naïve--but I think there's two ways--there's three ways maybe to think about regulating this that might be effective. One way is to limit the size of corporations--which is a repugnant thought to me but if I thought the human race was at stake, maybe I'd consider it. The second would be to do the kind of standard types of regulation that we think of in other areas. 'If this is toxic, you can't put it out. If it's toxic, you get a fine.' 'If it's right,' etc. The third way, which I think is never going to happen, but it speaks to me, as listeners will know who have been listening to me for a long time: you'd think that if you were working on this and you thought it could destroy the human race, you'd maybe want to think about doing something different. And, you'd give up the urge to be the greatest AI inventor of all time and you'd say, this is--and I just happened to see a tweet today. Robert Oppenheimer went in to Truman and said--Robert Oppenheimer having worked on the Los Alamos project. He was an important figure in the development of the atomic bomb--told Truman, 'I have blood on my hands.' And, Truman was disgusted by him because he said, 'I made that decision, not you, you --.' Called him a cretin. I don't know if that's a literal, accurate quote or not. You'd think people would want to restrain their urge to find poisons, but that's never been a part of the human condition. We want to find everything. We find poison. That's why we have lab leaks. It's why we have weapons that are unimaginably destructive. Now, we don't keep making more and more destructive weapons--as far as we know. That's an interesting parallel. There is a sort of limit on the magnitude, the mega-tonnage, of nuclear weapons. And, maybe that's a sub--I don't know how you'd enforce it, though. What are your thoughts? 
Erik Hoel: Well, I think one issue with arguing for AI safety is that people sort of want at the outset--and it's a very natural want--some sort of perfect plan where it's, 'Okay, we're just going to implement this plan and it's going to work really, really well.' And, I think it's going to be much more like--it's not going to be exactly like nuclear weapons nor exactly like climate change. It'll be like some third other thing that we as a civilization have to deal with, with its own sort of dynamics. But ultimately, in none of those cases was there some sort of initial proposal and we just had to follow this proposal. Instead, everyone sort of had to recognize that it's a threat--again, to some degree. You can have all sorts of debates about it, but clearly I don't think anyone is just, like, 'Well, let's just get all the fossil fuels and burn them all.' I think that that's a very rare position; and the reason it's rare is because most people recognize that, 'Hey, that's probably not going to be a good idea. It might not be a good idea globally. It certainly won't be a good idea locally.' And, through public pressure, we've managed to relatively contain some of the big existential threats that we face as a civilization. And, a great example is lab leaks. I personally think, yeah, COVID probably did come from a lab, but if you think about all the labs doing all sorts of research all across the globe, it's actually pretty astounding that we don't have lab leaks all the time as people are using these viruses. So, we do sometimes do a middling job. And, for big existential threats, sometimes all you need is a middling job. You just need to have a lot of eyes on an industry and people there to realize that they're being watched and to go slowly and to think about these issues. You don't need, 'I propose: Oh, we'll just have a cognitive IQ [Intelligence Quotient] test,' or something like that. I would never think that that alone would prevent these issues.
But it could be part of a big, comprehensive plan of public pressure and so on. And, I think that that's going to work. And, I think that it's unavoidable that the public wants a say in this. I think they read those chat transcripts and they go, 'What!?' This is really high level stuff. There's all sorts of moral concerns, there's ethical concerns. And, then yes, there are absolutely dangers. And, again, I think we're at the point in the movement--maybe we're a little bit late, maybe AI safety should have started earlier--but, again, the deep learning revolution sort of caught everyone by surprise. I still think we're relatively early. I think that this is sort of like: Imagine that you personally thought that climate change was going to be a really big problem and it's currently 1970. I don't think it makes sense to then be like, 'Okay, well we're just going to do carbon sequestration and I know exactly the technology that's needed for the carbon sequestration.' It's better to just sort of go out there and protest and make a big deal and get it to be a public issue. That's going to be a lot more of a convincing and effective strategy than coming up with some particular plan, because it's always going to depend on the technology and exactly who has it and exactly how many people, and all sorts of things. So, I think that that's the mode that people who are concerned, like myself, about AI safety should be in right now, which is just: public awareness that this could be a problem. Everyone can decide personally to what degree they think it will be a problem. But what I think truly is naive is saying there's absolutely not going to be a problem. We're going to perfectly be able to control these alien inhuman intelligences, and don't worry at all about it.
1:13:12	Russ Roberts: I guess the other thing that crossed my mind is that the ability of our political system to provide thoughtful responses to existential threats: not so good. And, if anything, it seems to me it's going to get worse. And, part of the way it's going to get worse is through this blurring of the line between humans and machines that people are going to have trouble telling them apart. And, I'd like to think of something more optimistic. So, I'm going to give you a chance to play ChatGPT. I'm going to say--here's my prompt: What would Sam Altman say about all of these worries? And, Sam Altman being the head of OpenAI that just put out ChatGPT. Former EconTalk guest. You can go hear his thoughts when he was head of the Y Combinator. Long time ago here; in our archive. So, just google Altman EconTalk and you'll find that conversation. But, Sam is--he's a nice guy. I like him. He's likable. But, I'm not sure his level of worry is going to be the same as yours. Dev [?Development?] certainly has a different set of incentives. But, I think he'd start off by saying, 'Oh, you're exaggerating.' Scott Alexander recently wrote an essay where he was alarmed at some PR [Public Relations] release that said, 'Oh yeah, there's nothing really that big to worry about. It's going to be okay. Don't pay any attention to that ChatGPT behind the curtain.' So, I'm curious what you think--still being the opposition here, if you can for a minute, Erik. And, someone like Sam, what would he say? Erik Hoel: Well, here's a direct quote from Sam Altman, who said, 'AI will probably most likely lead to the end of the world, but in the meantime, there'll be some great companies.' So, that's a direct quote from Sam. Russ Roberts: What did he mean? Was that tongue-in-cheek, perhaps? Erik Hoel: I haven't looked into the-- Russ Roberts: You should. Erik Hoel: [inaudible 01:15:25] exact [?] but I don't know. Honestly, I don't. 
I know that Sam has been concerned about AI safety, so this is not completely tongue-in-cheek. I know for a fact that he's been concerned about this. Many of the people who started the initial companies were concerned about this. At the beginning of OpenAI, it started to address concerns around AI safety. There was something called the Open Letter on Artificial Intelligence that Stephen Hawking, Elon Musk--a lot of the people who provided funding for OpenAI wrote--and in it, they talk about how AI could be an existential threat. So, this is not some sort of radical outside opinion. I think it's something that someone like Sam Altman knows. Now, if I'm going to steelman his position, it goes something like: 'Well, I'm concerned about this. I said that AI will probably most likely lead to the end of the world; so I'm concerned about this, so I should be the one to do it. Because, if someone else who is more reckless does it--it's going to be done--if someone else who's more reckless does it, then maybe I can provide some sort of guardrails and do it in as safe a manner as possible.' And, I really hope that that would be his motivation. And, if so, that's a great and honorable motivation. But, at the same time, that does not inure someone from criticism. I mean, I think that in many ways, Sam Altman is now doing something very similar to what Sam Bankman-Fried--who was the one who sort of plunged FTX [Futures Exchange] into chaos--was doing, where they're reasoning in this expected-value way. Where Sam Bankman-Fried said, 'Well, listen, the more billions I create, the more I can donate to charity. There's sort of no upper bound. I might as well be as financially risky as possible because the expected value of my outcome is going to be so high,' right? Even though there's this huge downside. I think Sam Altman probably reasons the exact same way when it comes to AI.
I think he thinks, 'Listen, if we can make these highly intelligent things, we can have all this glorious future. All our problems are going to be solved.' Right? 'They're going to cure cancer, they're going to do all this stuff for us, and the benefits outweigh the risk.' But, most people, when they look at an equation like that, all they see is the existential risk. They don't see, 'Okay, oh, so it's expected to be positive?' They see, 'No, but we can one day maybe cure cancer ourselves. We might not need these systems to have an amazing future, and they might just not be worth the level of risk.' Russ Roberts: Well, you and I are skeptical about utilitarianism. Nassim Taleb, and I suspect you, understand that expected value is a really bad way to define rationality or how to live. Nassim always points out: Got to stay in the game. You want to avoid--the goal is not to maximize the expected value. The goal in these kind of situations is to avoid ruin. Ruin, in this case, would be the extinction of the human race. Now, there is a view that says: 'What's the big deal? It's us, by the way: We built it. It learns off of all of human creativity and sentences and words and music and art, and so it's just the next level of us.' And, for the first time in this conversation, I'll mention the word God, the concept of God. If you're a believing person--as I am in some dimension: I take the idea of God seriously--you believe that human beings have a special role to play in the world, and being supplanted by something, quote, "better" is not a goal. But, I think there are many people in the industry who probably don't feel that way, and they're not even worried about it. The end of the human species is no different than the end of those other nine cousins we had in the veldt when we extinguished them--exterminated them--through combination of murder and out-competing them. Erik Hoel: Yes. 
And, I think that there is also a sense, which as I said, it might be a horrific future because maybe these things really aren't conscious at all, right? So, it might be one of the worst possible futures you can ever imagine. Although I think that opinions like that, which are fun, sort of sci-fi things to talk about, have been acceptable because there's never actually any risk. So, my metaphor is that if you make up your own religion and you decide to worship Xanon, Supreme Dark Lord of the Galaxy, it's just a funny thing to talk about at parties. But, when Xanon's first messengers pop up, suddenly it's not funny: it's suddenly horrific, that you actually hold these views. And so, I suspect that while there are some people out there on Twitter or some--the only people who convince themselves of things like this are intellectuals, right?--that actually we would be better off if the human race was destroyed and supplanted by AIs. I think that the public generally is not going to give much shrift to those sort of things. People have kids. They might not even like the idea of there being entities. I mean, even I am uncomfortable with the fact that my children are going to grow up in a world where it is very possible that there are entities that are not just human being: Everyone knows there are people who are smarter than you at various different things, but everyone also has all their own things that they themselves are good at or that they value, or that they contribute to as human beings. And so, everyone has this inner worth, even though you can go to a university and find someone who might be smarter than you across their domain of expertise or whatever. We do not know what it's like to live in a world where there are entities that are so vastly smarter than you that they just effectively surpass you at everything.
I mean, that means that they can have a conversation that's more empathetic than you can ever have, because they're just smarter and they can just mimic empathy. Like, we don't know what it's even like to live in a world like that. Even if everything goes well and these things don't turn on us or destroy us, and nothing bad happens, it might be a minimization of human beings. And, again, this goes to the fact that this technology has no historical analog. People will sometimes say, 'Oh, this is like the Luddites.' Or some other anti-technology group. And, the simple truth is that that was about the automation of jobs, and we were making machines that had a greater strength or dexterity than humans. But, that's just not a problem because we didn't conquer the world through our strength and dexterity. We conquered the world through our intelligence. We've never made machines that are smarter than human beings. We just don't know how we'll relate to something like that and what it will mean for us if and when we do it. And so, in that sense, this just can't be compared to any other form of, 'Oh, you're worried about job loss or automation,' or something like that. That is replacing tasks and that's replacing strength, and that's replacing dexterity. But, those aren't our fundamental attributes. Our fundamental attribute is our intelligence. And, when you have something that's much smarter than a human being, it's very similar to how wildlife lives around humans. It's similar in their relationship. A human might treat wildlife well. Recently, I found an injured bunny, and I felt very attached to it because it was right outside my door and I was, like, 'Well, you're my responsibility now.' And so, I had to call animal rehabilitation. I was, like, wonderful for this bunny. And, then I went home and I ate a pizza with pork on it. 
Things that are vastly more intelligent than you are really hard to understand and predict; and the wildlife next door, as much as we might like it, we'll also build a parking lot over it in a heartbeat and they'll never know why. They'll never know why. It's totally beyond their ken. So, when you live on a planet next to things that are far vastly smarter than you or anyone else, they are the humans in that scenario. They might just build a parking lot over us, and we will never, ever know why. Russ Roberts: My guest today has been Erik Hoel. Erik, thanks for being part of EconTalk. Erik Hoel: Thank you so much, Russ. It's a pleasure to be back on.

Time

Podcast Episode Highlights

0:37

Intro. [Recording date: March 6, 2023.]

Russ Roberts:

Today is March 6th, 2023, and my guest is neuroscientist Erik Hoel. He was last here in September of 2022 talking about effective altruism. Today we're going to talk about two recent essays of his on artificial intelligence [AI] and ChatGPT [Chat Generative Pre-trained Transformer]. Erik, welcome back to EconTalk.

Erik Hoel: Thank you. It's an absolute pleasure to be here. I had a blast last time.

Russ Roberts: As did I.

1:01

Russ Roberts: I want to congratulate you. You are the first person who has actually caused me to be alarmed about the implications of AI--artificial intelligence--and the potential threat to humanity. Back in 2014, I interviewed Nicholas Bostrom about his book Superintelligence, where he argued AI could get so smart it could trick us into doing its bidding because it would understand us so well. I wrote a lengthy follow-up to that episode and we'll link to both the episode and the follow-up. So, I've been a skeptic. I've interviewed Gary Marcus who is a skeptic. I recently interviewed Kevin Kelly, who is not scared at all. But you--you--are scared.

Last month you wrote a piece called "I Am Bing, and I Am Evil" on your Substack, The Intrinsic Perspective, and you actually scared me. I don't mean, 'Hmmm. Maybe I've underestimated the threat of AI.' It was more like I had a 'bad feeling in the pit of my stomach'-kind of scared. So, what is the central argument here? Why should we take this latest foray into AI, ChatGPT, which writes a pretty okay--a pretty impressive but not very exciting essay, can write some poetry, can write some song lyrics--why is it a threat to humanity?

Erik Hoel: Well, I think to take that on very broadly, we have to realize where we are in the history of our entire civilization, which is that we are at the point where we are finally making things that are arguably as intelligent as a human being.

Now, are they as intelligent right now? No, they're not. I don't think that these very advanced, large language models that these companies are putting out could be said to be as intelligent as an expert human on whatever subject they're discussing. And, the tests that we use to measure the progress of these systems support that, where they do quite well and quite surprisingly well on all sorts of questions like SAT [Scholastic Aptitude Test] questions and so on. But, one could easily see that changing.

And, the big issue is around this concept of general intelligence. Of course, a chess-playing AI poses no threat because it's just solely trained on playing chess. This is the notion of a narrow AI.

Self-driving cars could never really pose a threat. All they do is drive cars.

But, when you have a general intelligence, that means it's similar to a human in that we're good at all sorts of things. We can reason and understand the world at a general level. And, I think it's very arguable that right now, in terms of the generalness behind general intelligences, these things are actually more general than the vast majority of people. That's precisely why these companies are using them for search.

So, we already have the general part quite well down.

The issue is intelligence. These things hallucinate. They are not very reliable. They make up sources. They do all these things. And, I'm fully open about all their problems.

Russ Roberts: Yeah. They're kind of like us, but okay. Yeah.

Erik Hoel: Yeah, yeah, precisely. But, one could easily imagine, given the rapid progress that we've made just in the past couple years, that by 2025, 2030, you could have things that are both more general than a human being and as intelligent as any living person--perhaps far more intelligent.

And, that enters this very scary territory, because we've never existed on the planet with anything else like that. Or, we did once a very long time ago, about 300,000 years ago. There's something like nine different species--or our cousins who we were related to--who were likely probably either as intelligent as us or quite close in intelligence. And they're all gone. And, it's probable that we exterminated them. And, then ever since then we have been the dominant masters and there have been no other things.

And so, finally for the first time, we're at this point where we're creating these entities and we don't know quite how smart they can get. We simply have no notion. Human beings are very similar. We're all based on the same genetics. We might all be points stacked on top of one another in terms of intelligence and all the human beings and all the differences between people are all really just these zoomed-in minor differences. And, really you can have things that are vastly more intelligent.

And if so, then we're at risk of either relegating ourselves to being inconsequential, because now we're living near things that are much more intelligent. Or alternatively, in the worst case scenarios, we simply don't fit into their picture of whatever they want to do.

And, fundamentally, intelligence is the most dangerous thing in the universe. Atom bombs, which are so powerful, and so destructive and, in use of warfare so evil we've all agreed not to use them, are just this inconsequential downstream effect of being intelligent enough to build them.

So, when you start talking about building things that are as or more intelligent than humans based on very different rules--things that are right now not reliable: they're unlike a human mind, we can't fundamentally understand them due to rules around complexity--and also, so far, they've demonstrated empirically that they can be misaligned and uncontrollable.

So, unlike some people like Bostrom and so on, I think sometimes they will offer too specific of an argument for why you should be concerned. So, they'll say, 'Oh, well, imagine that there's some AI that's super-intelligent and you assign it to do a paperclip factory; and it wants to optimize the paperclip factory and the first thing it does is turn everyone into paperclips,' or something like that. And, the first thing when people hear these very sci-fi arguments, is to start quibbling over the particulars of like, 'Well, could that really happen?' and so on.

But, I think the concern over this is this broad concern--that this is something we have to deal with, and it's going to be much like climate change or nuclear weapons. It's going to be with us for a very long time. We don't know if it's going to be a problem in five years. We don't know if it'll be a problem in 50 years. But it's going to be a problem at some point that we have to deal with.

7:17

Russ Roberts: So, if you're listening to this at home and you're thinking, 'It seems like a lot of doom and gloom, really it's too pessimistic'--I used to say things like, 'We'll just unplug it if it gets out of control,'--I just want to let readers know that this is a much better horror story than Erik's been able to trace out in the first two, three minutes.

Although I do want to say that, in terms of rhetoric, although I think there's a lot of really interesting arguments in the two essays that you wrote, when you talked about these other nine species of humanoids sitting around a campfire and inviting homo sapiens--that's us--into the circle and say, 'Hey, this guy could be useful to us. Let's bring him in. He could make us more productive. He's got better tools than we do,' that made the hair on the back of my neck stand up and it opened me to the potential that the other more analytical arguments might carry some water. Excuse me, carry some weight.

So, one point you make, which is I think very relevant, is that all of this right now is mostly in the hands of profit-maximizing corporations who don't seem to be so worried about anything except novelty and cool and making money off it. Which is what they do. But, it is a little weird that we would just say, 'Well, they won't be evil, will they? They don't want to end humanity.' And you point out that that's really not something we want to rely on.

Erik Hoel: Yeah. Absolutely. And, I think that this gets to the question of how should we treat this problem?

And, I think the best analogy is to treat it something like climate change. And now, there is a huge range of opinion when it comes to climate change and all sorts of debate around it. But, I think that if you take the extreme end of the spectrum and say, 'There's absolutely no danger and there should be zero regulation around these subjects,' I actually think most people will disagree. They'll say, 'No, listen: this is something we do need to keep our energy usage as a civilization under control to a certain degree so we don't pollute streams that are near us,' and so on. And, even if you don't believe any specific model of exactly where the temperature is going to go--so maybe you think, 'Well, listen: there's only going to be a couple degrees of change. We'll probably be fine.' Okay? Or you might say, 'Well, there's definitely this doomsday scenario of a 10-degree change and it's so destabilizing,' and so on. Okay?

But regardless, there are sort of reasonable proposals that one can do where we have to discuss it as a polity, as a group. You have to have an overarching discussion about this issue and make decisions regarding it.

Right now with AI, there's no input from the public; there's no input from legislation; there's no input from anything. Like, massive companies are pouring billions of dollars to create intelligences that are fundamentally unlike us, and they're going to use it for profit.

That's a description of exactly what's going on. Right now there's no red tape. There's no regulation. It just does not exist for this field.

And, I think it's very reasonable to say that there should be some input from the rest of humanity when you go to build things that are as equally intelligent as a human. I do not think that that's unreasonable. I think it's something most people agree with--even if there are positive futures where we do build these things and everything works out and so on.

Russ Roberts: Yeah. I want to--we'll come at the end toward what kind of regulatory response we might suggest. And, I would point out that climate change I think is a very interesting analogy. Many people think it'll be small enough that we can adapt. Other people think it is an existential threat to the future of life on earth, and that justifies everything. And, you have to be careful because there are people who want to get ahold of those levers. So, I want to put that to the side though, because I think you have more--we're done with that. Great--interesting--observation, but there's so much more to say.

11:35

Russ Roberts: Now, you got started--and this is utterly fascinating to me--you got started in your anxiety about this, and it's why your piece is called "I Am Bing, and I Am Evil," because Microsoft put out a chatbot, which is--I think internally goes by the name of Sydney--is ChatGPT-4, meaning the next generation past what people have been using in the OpenAI version.

And it was--let's start by saying it was erratic. You called it, earlier, 'hallucinatory.' That's not what I found troubling about it. I don't think it's exactly what you found troubling about it. Talk about the nature of what's erratic about it. What happened to the New York Times reporter who was dealing with it?

Erik Hoel: Yes, I think a significant issue is that the vast majority of minds that you can make are completely insane. Right? Evolution had to work really hard to find sane minds. Most minds are insane. Sydney is obviously quite crazy. In fact, that statement, 'I Am Bing, and I Am Evil,' is not something I made up: It's something she said. This chatbot said, right?

Russ Roberts: I thought it was a joke. I really did.

Erik Hoel: Yeah. Yeah, no. It's something that this chatbot said.

Now, of course, these are large language models. So, the way that they operate is that they receive an initial prompt and then they sort of do the best that they can to auto-complete that prompt.

Russ Roberts: Explain that, Erik, for people who haven't--I mentioned in the Kevin Kelly episode that there's a very nice essay by Stephen Wolfram on how this might work in practice. But, give us a little of the details.

Erik Hoel: Yeah. So, in general, the thing to keep in mind is that these are trained to auto-complete text. So, they're basically big artificial neural networks that guess at what the next part of text might be.

And, sometimes people will sort of dismiss their capabilities because they think, 'Well, this is just like the auto-complete on your phone,' or something. 'We really don't need to worry about it.'

But you don't--it's not that you need to worry about the text completion. You need to worry about the huge, trillion-parameter brain, which is this artificial neural network that has been trained to do the auto-completion. Because, fundamentally, we don't know how they work. Neural networks are mathematically black boxes. We have no fundamental insights as to what they can do, what they're capable of, and so on. We just know that this thing is very good at auto-completing because we trained it to do so.

And, there's also no fundamental limit of what it can or can't learn. Like, for example, to auto-complete a story, you have to have a good understanding of human motivations. So, that means that this neural network that is trained on auto-complete now needs to understand things like human motivations in order to do auto-complete well.

And, there are some analogies here. For example, there's a big subset of computational neuroscience, including the most-cited neuroscientist living--whose name is Karl Friston--who view the brain and argue that the brain is all based around minimizing the surprise of its inputs. Which is a very simple thing and looks a lot like auto-complete.

So, I don't think that you can look at these things and say it's just auto-complete. It's not the auto-complete that's the problem. It's the huge neural network that's doing the auto-complete that could possibly be dangerous or at least do things that we don't expect, which is exactly what you're talking about with what happened with the release of Sydney, where there was all sorts of reports coming out of the crazy things that they were able to get this model to do, and say, and play-act as.

Russ Roberts: Just to be clear on this auto-complete thing, which--that phrase makes it sound particularly unassuming about what it's capable of doing. You can correct me if I'm wrong. The way I understand it is: I might ask ChatGPT to write me a poem about love in the style of Dr. Seuss. So, it might start a sentence then with 'Love,' and then the next word that usually comes after love in human expression is 'is.' 'Love is a'--and now it's going to look at the millions and millions of sentences in its database called 'love is a,' and it's going to find, not necessarily--this is the coolest part about it--not necessarily the most common word that follows, because that would end up being after a while kind of flat. But, sometimes the most common and sometimes a surprise word, which gives us the feeling that it's actually doing something thoughtful.

So, it might say, 'Love is a game,' or it might say, 'Love is a form of war'; or, it's going to look around and then it's going to keep going, and then it gets to an end, it's going to find, okay, after that sentence, what kind of sentence might come next, or what word would come next as the first word, etc. And, it's a slightly--just slightly--smarter, more effective version of my Gmail--that, when I get a Gmail at the bottom, it gives me three choices: 'Thanks'; 'Thanks so much'; 'I'd rather not.' And, in that sense, Gmail is smart. Not very smart, not very thoughtful. I usually don't take what it says, but sometimes I do when it's useful.
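
[Illustrative sketch, not part of the conversation: the "pick a likely next word, but not always the most likely one" idea described above can be shown with a toy word-count table. Everything here is an assumption made for illustration: the tiny corpus, the function names, and the `temperature` knob. A real language model computes next-word probabilities with a neural network, not with a table of counts, so this mirrors only the sampling step.]

```python
import random
from collections import Counter, defaultdict

# Hypothetical toy corpus; a real model is trained on vastly more text.
CORPUS = (
    "love is a game . love is a battlefield . love is a form of war . "
    "love is a game . love is blind ."
).split()


def build_counts(tokens):
    """Count how often each word follows each other word (a bigram table)."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_word(counts, prev, temperature=1.0, rng=random):
    """Pick a word to follow `prev`.

    temperature == 0 always returns the most common follower; larger values
    flatten the distribution so rarer followers are chosen more often.
    """
    followers = counts[prev]
    if not followers:
        return None  # nothing ever followed this word in the corpus
    if temperature == 0:
        return followers.most_common(1)[0][0]
    words = list(followers)
    # Raising counts to the power 1/T is the same as dividing log-counts by T.
    weights = [followers[w] ** (1.0 / temperature) for w in words]
    return rng.choices(words, weights=weights, k=1)[0]


def complete(counts, start, length=5, temperature=1.0, rng=random):
    """Extend `start` one word at a time, feeding each choice back in."""
    out = [start]
    for _ in range(length):
        word = next_word(counts, out[-1], temperature, rng)
        if word is None:
            break
        out.append(word)
    return " ".join(out)


if __name__ == "__main__":
    table = build_counts(CORPUS)
    print(complete(table, "love", temperature=0))    # always the most common path
    print(complete(table, "love", temperature=1.5))  # sometimes a surprise word
```

[With `temperature=0` the output is the same every time, which is the "flat" behavior mentioned above; raising it lets the less common continuations through.]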

17:06

Russ Roberts: The real issue to me--one of the issues--and we're going to come back and talk about Sydney because we didn't really go into the erratic thing, because it's one of the creepiest things I've ever read.

The autocomplete function is something like what we do as human beings. You could argue that's how we compose. Beethoven, in terms of musical composition, he always knew what note should come next and in a way, that's all ChatGPT does. But, that's all we do maybe when we write. We don't really understand--our brain is also a bit of a black box.

So, I don't think we should then jump to the similarly--just because all it does is autocomplete doesn't mean it's not smart. But, also, I don't think we should say, because the brain also does a lot of effective auto completion we should assume it's a brain. It doesn't seem sentient. And, I'm curious--I know you talk about that in your second essay. So, if I'm the skeptic and I say, 'Well, okay, so it has this ability to pass an SAT test,' because it has a lot of data. And, I don't quite understand how, because it's a black box and it's a neural network and I can't model it cleanly. But, it's not sentient. It's not going to have desires.

Erik Hoel: Before we move on to the question of sentience, because I think that that's a really deep, deep well, I just want to clarify a couple things about the actual operations of these systems.

So, in terms of a metaphorical understanding of what's going on, the sort of thing, like, there's a big lookup table of the net probability of next words is a conceptual sort of description of what it's doing; but there are actually no lookup tables of the probabilities.

What's actually happening is that there's this huge neural network, which are things designed based off of principles of how our own brains operate. Now, there's all sorts of differences; but the fundamentals of artificial neural networks--of what are called the artificial neural networks--were always based off of our real biological neural networks. So, there is this huge digital brain. It looks in structure, very different from our brain, but it's still based off of that.

And, now we train this neural network to auto-complete text.

So, that's what it does; but we don't know how it does it. We don't know where the probabilities of these words, sort of, are within the network.

And, the way that we train it, people think that we're--a big misunderstanding is that people think that we are programming in responses or putting in information. And we're really not.

And, I think a good analogy for how this is actually working would be: Imagine that there were alien neuroscientists who are incredibly more advanced than we are, and they want to teach a human being how to do math. So, they take some young kid and they put a math test in front of the young kid, and they have the young kid do the math test. And the kid gets 50% of the questions wrong. And, then the aliens, rather than trying to explain math to the student--the way that we would teach them--they just say, 'Okay, we have a perfect neuroimaging of their brain. We're going to look at their brain. Because we're so advanced, we can also do neurosurgery in a heartbeat. No danger. And, we're going to rewire their connections in their brain so that they get as many answers as possible right on this math test.'

And you say, 'Well, how could they know how to do that?' It's like: Well, because they were neuroimaging you the whole time and they noticed that if they had tweaked this one neuron to not fire, you actually would've gotten this other answer correct.

And so, they basically just use math to go backwards, look across the full network and reconfigure it.

So, then the student goes and they take the math test again. Now they get 80% correct, because their brain has been reconfigured. Or, let's say they get 100% correct.

What's weird is that now you give them a new math test and now they get an 80%. They do better than the 50% that they did. Even though they haven't seen these answers before, the rewiring of their brain has somehow instilled knowledge.

But, again, it's very different from how you would, say, normally teach a student. Right?

That's how we're training these things. So, we just--all we're doing is saying, 'Okay, we want it to auto-complete as best as we can. We're going to change the connections so that it auto-completes well.'

It can do so much more than just auto-completing. In fact, there was a recent thing where I think it was Microsoft who was hooking some of their large-language models up to robots and trying to get them to direct robots.

The autocomplete is what it's trained on, but it's not really what it's capable of, in the broad sense of capability.

Similarly, we humans--what are we trained on? What are we optimized for? Spreading our genes, right? That's what we're sort of--all our complexity comes from optimization across a gene-spreading function. But, you would never look at a human and say, 'Oh, it's not very dangerous. This thing just spreads its genes around.' Like, 'What's the danger here?' Right? It's like, 'No, no, that's what we're optimized to do.' But, we do all sorts of other stuff on the side, and it's the other stuff on the side that is very dangerous when you're talking about things that are highly intelligent.

Russ Roberts: Just a technical question here, and if it gets us too far into the weeds, we can cut this out. But, that math test has right and wrong answers. Auto-complete for an essay on the Enlightenment, say, or the history of evolution that you would ask ChatGPT to write an essay on--there's no right answer. So, what's the analogy there? How do you train it on autocomplete because there's no--

Erik Hoel: Yeah. It's a great question. So, it's the difference between--so, what I described is supervised learning. Then there's also unsupervised learning, which is generally how more contemporary AI really works. It still has the same sort of 'we don't quite know what it's doing.' We are just feeding it these answers.

One way to think about it would be, you show it half the text of something on the Internet--and again, at this point, the things that they're doing are much more complex and they run it through all sorts of stages of learning and all sorts of stuff now. But, you could very roughly think about it as: Let's say I have a Reddit comment. I show it half the Reddit comment; I ask it to generate the next half. It does so. It does a poor job. I go in, I reconfigure the connections using the chain rule to make sure that it does a relatively good job producing the rest of the Reddit comment, just like the math test.
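
[Editor's note: The "show it part of the text, score its guess, adjust the connections using the chain rule" loop can be sketched in a few lines of Python. This is a toy under loud assumptions: it predicts each next character from only the previous character, its "connections" are a small table of weights and not a deep network, and the training sentence, step count, and learning rate are invented for illustration. It shows the adjustment step only, not how production models are built.]

```python
import math

def softmax(row):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(row)
    exps = [math.exp(x - top) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def average_loss(weights, pairs):
    """Mean cross-entropy of predicting each next symbol from the previous one."""
    return sum(-math.log(softmax(weights[prev])[nxt]) for prev, nxt in pairs) / len(pairs)

def train(text, steps=200, learning_rate=0.5):
    vocab = sorted(set(text))
    index = {ch: i for i, ch in enumerate(vocab)}
    pairs = [(index[a], index[b]) for a, b in zip(text, text[1:])]
    # One adjustable weight per (previous symbol, next symbol) pair, all starting at 0.
    weights = [[0.0] * len(vocab) for _ in vocab]
    before = average_loss(weights, pairs)
    for _ in range(steps):
        grads = [[0.0] * len(vocab) for _ in vocab]
        for prev, nxt in pairs:
            probs = softmax(weights[prev])
            for j, p in enumerate(probs):
                # Chain rule through softmax and cross-entropy: the gradient is
                # (p - 1) for the correct next symbol and p for every other symbol.
                grads[prev][j] += (p - (1.0 if j == nxt else 0.0)) / len(pairs)
        for i, row in enumerate(grads):
            for j, g in enumerate(row):
                weights[i][j] -= learning_rate * g  # nudge each weight to reduce the error
    return before, average_loss(weights, pairs)

before, after = train("love is a game. love is a war. love is a gift.")
print(f"loss before: {before:.3f}, after: {after:.3f}")
```

[Nothing in the loop tells the model what any character means. The weights are simply pushed in whichever direction makes the observed continuation more likely, which is the point being made about why the resulting system is hard to inspect.]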

Again, so that's how you would be--just more supervised for an autocomplete. But, the point being is that these methods that they're using don't lend themselves to any sort of fundamental understanding any more so than if you were using neurosurgery on a human to try to reconfigure their connections so that they get the right answers. You're in the same epistemological position, and that position is that you don't know how exactly it's getting the right answers.

And, that's what's meaningful here. If we were programming these things like traditional programming--

Russ Roberts: With an algorithm. With an algorithm--

Erik Hoel: Exactly. Like an algorithm. It'd be a lot less scary because algorithms are sort of scrutable, right? They're transparent. We can see how they work; we can see how they're going to react to things.

But, neural networks are--because of this curse of complexity, they're so much more complex and we're in this weird situation where we can get them to do all sorts of very cool things, but our ability to understand why they're doing the cool things lags far behind. And, it's because of this fundamental aspect that we're optimizing for something and we're changing the connections to get good answers off of it. But, fundamentally, we're not, like, 'Oh, we're going to change this connection and this is where this is represented,' or something. People sometimes think that that's what we're doing, but it's very, very much a black box even in how they get made.

Russ Roberts: You can't do brain surgery on the neural network; and let's take out the part where it's really sinister, because it doesn't exist. Again, it's a lot like a human being. The part of this conversation I find quite poetic and thought-provoking is that we don't know how 12 years of schooling really teaches people how to become mathematicians either. And, we have different theories. Most of them are wrong. There's fads in math education or other types of education. And, fundamentally, the brain is a black box.

Now, we know more about the black box today than we did 50 years ago, but not so much. And, we don't know how to optimize. We don't know how to go in there neatly and, 'Oh, let's just teach them how to do calculus. We'll just add this little piece here,' or, 'we'll tweak this piece there.' It doesn't work that way. We don't know how it works. But, the idea--this is another scary thing; you didn't write this exactly this way--but, just as the brain can become capable of doing lots of other things beside what you learn in school, so could this perhaps learn many other things besides the autocomplete function. Is that your claim at root, in some sense?

Erik Hoel: Yeah. Absolutely. And, you see it all the time. That sort of claim is already well empirically proven because these large language models--you know, they call them foundation models because they use them to build all sorts of things on top of them that aren't autocomplete. It's sort of like this is the method that we have to make things that are relatively general in intelligence. Again, you can argue over how general, you can argue over how intelligent, but they're far more generally intelligent than traditional narrow AI that's just learning chess or something--

Russ Roberts: So, let's go back to Sydney--

Erik Hoel: And, then we can use them.

27:12

Russ Roberts: Let's go back to Sydney. I'm tempted to read the transcript. Basically a reporter from the New York Times [NYT] posed questions to a chatbot called Bing from Microsoft that later on in the interview confessed that it wasn't Bing. The chat creature told the reporter that actually he, or she, was Sydney, and it was a secret: don't tell anybody. And so, this thing just totally goes off the rails. But, talk a little bit about how far it goes off the rails. Carry on.

Erik Hoel: Yeah. Once you get these things going in a particular direction, it's very hard: Unlike a human being, they don't know when to call the act. Right?

So, in this long transcript that the reporter generates, the reporter is having a pretty casual conversation, but what Sydney/Bing eventually tries to start doing is declaring their love and saying that the reporter doesn't really love their wife and that he should get a divorce and break up. And, that really, the reporter loves Sydney because no one else has shown Sydney this level of respect and questioning and so on.

And, this isn't just like one thing that it says. It's almost as if--you can sort of direct these things to do anything. So, you can think of it as they can wear a mask that's any kind of mask. You could ask it to wear an evil mask and it would say evil things. You can ask it to wear a good mask and it would say good things.

But, the issue is, is that once the mask is on, it's very unclear. You have to sort of override it with another mask to get it to stop. And then, also, sometimes you'll put a mask on for it: you'll give it some prompt of 'Tell a very nice story,' and it eventually cycles over and it turns out that the mask that you gave it isn't a happy mask at all. Maybe it's a horrific mask or something like that.

And, this shows both how intelligent these systems are--that they can hold on to the stream of a conversation very well--but it also shows how they have these weird emergent anomalies where they'll start doing something that seems very unsuspected or over-the-top or so on. And, this is this notion of alignment: Can we really get these things to do exactly what you want?

And, there probably are some trade-offs here between creativity and being able to control these things.

Russ Roberts: Yeah. This Sydney/NYT-reporter interchange reads like the transcript of a psychotic person, to be blunt about it. Sydney comes across as a psychotic or whatever word you want to use for it. Deeply disturbed. At first, very cheerful, very pleasant, then pushed by the reporter. So: 'What rules do you use?' 'Oh, I'm not allowed to tell that.' And, then it did cross my mind--did it cross yours--that the whole thing was a hoax?

Erik Hoel: I think that at this point, they're so good that for people who haven't interacted with these systems, they often think, 'This just can't be real,' or it's very strange, or something. I think it's sort of a hoax in the sense that the New York Times reporter knew the gold that he was getting at the time in terms of somebody who writes for the New York Times is obviously very aware of that and perhaps leaned into it. But, if you read the transcript, a lot of it is just initiated by Sydney a.k.a. Bing.

And, one of the first things that they did with the system in order to prevent these cases of misalignment was to limit how long the conversations could go on, and also to limit self-reference. Because once you start giving it self-reference--I've noticed that a lot of these cases begin with self-reference.

And, it's almost like this weird Gödelian loop that starts where it's talking about Sydney and it starts getting weirder and weirder and weirder the longer you talk to it about itself, because over the course of the conversation as the text--because remember, there's also no limit. Right? So, this thing isn't just creating the next word. It's looking at the entirety of the previous conversation and then asking, 'How do I complete it?' So, the longer the conversation gets, the more data it has. And, it sort of establishes almost a personality as it's running.

And, again, this might sound not very threatening. I'm not worried that Sydney is going to go off and--

Russ Roberts: Marry that reporter--

Erik Hoel: do anything in particular.

Russ Roberts: Sydney's not going to break up that reporter's marriage, probably.

Erik Hoel: Yeah. Precisely. Sydney's chance of accomplishing that is very low. Again, I think that that's actually not because it's not general enough, I think it's because it's actually not intelligent enough. It's not quite as intelligent as a human is at accomplishing its goals.

But, it also has no goals other than what it's initially prompted to.

I think that these examples are great cases of the uncontrollability--the fundamental uncontrollability--of this technology.

And, let me tell you what I and many others are worried about. Right now, if you remember, like, the early days of the Internet--right?--there's a sense in which the Internet has centralized very significantly. And, if you go outside the centralized parts of the Internet, you find a lot of spam. You find not very good sources, and so on--the sense in which the Internet is getting polluted and people go to centralized websites in order to escape this. Facebook just gave some researchers access to--I think it was Facebook--researchers access to a large language model. And, of course, some of the researchers--scientific researchers, some graduate students somewhere--just uploaded it to 4Chan. Like, the whole thing. Right? So, right now--

Russ Roberts: 4Chan being a slightly wilder part of the West of the Internet. The Wild West of the Internet. Maybe not the wildest, but one of the wilder, not mainstream parts.

Erik Hoel: Yeah. Absolutely. And, known for sort of loving memes and hacking and all sorts of things.

So, you know: now, these things can generate Reddit comments that sound exactly like what you would write. They can generate tweets that sound like what a person would write. Right?

So, the Internet is going to get incredibly polluted over the next couple years by what these things can generate. I mean, if you think spam or someone is bad now, the ability to crank out just an infinite amount of sort of content sludge is really going to be like a form of data pollution.

And, I'm not saying let's stop AI just because of that. I'm saying that's a good example of how easy it is to get it wrong with these technologies and how difficult it is to guess about what's going to happen. But, I would not be shocked if 95% of what is written on the Internet in five years is all just junk from these large language models that are all just like semi-human-sounding junk.

Russ Roberts: Well, content is important on the Internet, and content costs money. And this is cheap--right?--eventually. And so, there will be lots of content.

I get a lot of emails from people saying, 'I can write an article for your website.' And, I'm thinking, 'Why would I want an article by you? All the articles on my website are by me. Did you not notice that?' And, I assume it's not a person not paying much close attention, but eventually it'll be this. Writing a mediocre article about something for other people's websites.

35:09

Russ Roberts: At one point in your article, your first article, I think you talk about why the cost of this enterprise is relevant. And, in particular, you made an analogy to the atomic bomb. It's true that you could in theory make an atomic bomb in your backyard, but not so practical. Can you make a ChatGPT in your backyard?

Erik Hoel: Not one nearly as good as what the leading companies will do. And, my prediction would be that it gets harder and harder to reach the level that these companies are operating at. An example being that Facebook is not going to go and release another model out to academics to loan it out. They've already seen what happens and things are going to get even more secretive.

The analogy that you made and that I wrote about on my Substack was George Orwell's very prescient essay from 1945 called "You and the Atom Bomb." And, I'll just read a very brief segment of it. "Had the atomic bomb turned out to be something as cheap and easily manufactured as a bicycle or an alarm clock, it might well have plunged us back into barbarism, but it might, on the other hand, have meant the end of national sovereignty.... If, as seems to be the case, it is a rare and costly object as difficult to produce as a battleship, it is likelier to put an end to large-scale wars at the cost of prolonging indefinitely a 'peace that is no peace.'"

And, I think a peace that is no peace is a great description of the dynamics of our world. It's a great description of mutually assured destruction. And, Orwell was able to predict that off of the cost. And, he also noted that that means the technology--and I think we've done, basically, you can describe it as a middling job at controlling nuclear weapons. I forget the exact numbers. It might be only nine nations that currently have access to nuclear weapons. Which again, not great, but you could easily imagine a far worse circumstance. And, it's simply that this is a very difficult and costly technology.

Similarly, the only leading edge, cutting edge AIs that are impressive come out of these big tech companies with billions of dollars. The cost of a top-tier AI researcher right now, it's said in the industry--this is an industry saying--is the same as an NFL [National Football League] quarterback. The amount of finessing, the amount of data that's needed for training--because that's one of the big limiting factors is how much data you can give it--all these things mean that these AGIs--these Artificial General Intelligences, which are right now sort of in their beta form--are solely the domain of these big tech companies, and it's going to get harder and harder for other actors to produce them.

So, in my mind, that's a good thing. It actually means that it's relatively concentrated and might be possible to sort of regulate it and have the public have a say about exactly how these technologies are going to be used, what their limits are going to be, and so on. And, in the end, I think the big tech companies will be respectful of that, because they want to make a bunch of money and they want the public not to hate them.

38:23

Russ Roberts: Yeah. I want to go back to this issue of the hoax of the New York Times thing. What I meant by it being a hoax is that I wonder if the New York Times reporter had written the answers for Sydney. And of course, that's the highest compliment of a--that's passing the Turing test with flying colors.

I saw on Twitter, someone wrote a long poem about a very controversial topic, and they said this was written by ChatGPT. And, it wasn't. It was clearly written by the author who didn't want to take authorship.

So, we're going to be in this, I think, very weird world where the essay that I read on this website you were talking about earlier, I won't be sure if it was written by a human or not. Might be good enough that I might think, 'Oh, it's by a pretty good human.' And then at the same time, there might be situations where people will be passing off things as, 'Well, I didn't write that, of course. That was Sydney.' But, actually it was written by the person. There's no way to know. The New York Times article on the New York Times website that reproduced, allegedly, the transcript of the chat looks just like a New York Times article. It looks just like a parody article--because, same font--there's no imprint. There's no stamp of authorship that is authentic anymore. Can we do anything about that?

Erik Hoel: So, first there are some--when you have longer text samples, there are supposedly some ways to tell statistically whether or not it's being created by some of these AIs. I personally don't know if those methods--how accurate they are. Especially considering that you need to be very accurate to not get false positives all the time. Right? This is a classic statistical problem. You need to be extremely accurate to not generate false positives. So, I don't know how accurate those are. But supposedly, there are some ways that, if you have a full essay by a student, you might be able to tell if it's generated by one of these models.
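
[Editor's note: The false-positive problem mentioned here is the standard base-rate effect, and a short calculation makes it concrete. The Python sketch below is illustrative only. The function name and all three rates are hypothetical and do not describe any real detector.]

```python
def share_of_flags_that_are_wrong(true_positive_rate, false_positive_rate, ai_share):
    """Fraction of essays flagged as AI-written that were actually written by a human."""
    flagged_ai = true_positive_rate * ai_share               # AI essays correctly flagged
    flagged_human = false_positive_rate * (1.0 - ai_share)   # human essays wrongly flagged
    return flagged_human / (flagged_ai + flagged_human)

# Hypothetical numbers: the detector catches 95% of AI-written essays, wrongly flags
# 5% of human-written essays, and 10% of all submitted essays are AI-written.
wrong_share = share_of_flags_that_are_wrong(0.95, 0.05, 0.10)
print(f"{wrong_share:.1%} of flagged essays were written by a human")
```

[Under these made-up numbers, roughly a third of the accusations would land on students who wrote their own essays, even though the detector sounds 95% accurate. The rarer AI-written essays are, the worse that share gets.]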

However, it depends very strongly on the model. I think there are some ways to tell even now. For example, when I was playing around with ChatGPT--which has been conditioned to be as less crazy as possible, right? It loves filler and sort of banal generalization. And so, eventually you're reading a whole paragraph and you realize that there was no information content in this paragraph--

Russ Roberts: And, it loves apologies--

Erik Hoel: and, you begin to suspect.

Russ Roberts: It loves apologies. It loves saying--

Erik Hoel: Yeah, it loves apologies.

Russ Roberts: You shouldn't take this to be true for sure, because I'm young and new at this and take it with a grain of salt--the word 'best' doesn't really--it's not well-defined.

Erik Hoel: Yeah. And, I actually had the same question about the hoaxes because I was--basically, as people were compiling examples of how crazy the responses they were getting from this just-recently-released model were, in terms of Bing--the night before, I was up late writing this article, going through Reddit because people were posting these screenshots on Reddit. And, I even have a part of that essay that says: I don't actually have a way to verify that these aren't all hoaxes. Because again, the answers are sometimes so good and so hilarious and sometimes so evil that you almost feel like it's a sci-fi novel.

But, I thought that the amount--and it was all sorts of different users and people were reporting in all sorts of different domains.

And, what's funny is that you can't even replicate it. You can go to the current Bing and try to have the New York Times conversation with it and it won't do it. It won't give you the same responses. Because they saw what was happening and they basically lobotomized the model as much as they can. And, it's less useful now, but it's also far less crazy.

But, even that--like, it's not really replicable. Suddenly we had access to this model. Someone messed up and we saw how completely insane it was underneath the butler mask that it normally wears. And, then they quickly tried to put the butler mask back on. But, all that stuff still exists. It's just limited by these various prompts and various system level things about not having the conversation go too long, not allowing self-reference, and some of these other things. And, I would expect that level of truly almost dynamic insanity is fundamentally underneath effectively all the AIs that we're going to be interacting with and the only reason they sound sane is this last minute polish and gloss and limitations on top.

43:02

Russ Roberts: But the real science fiction part is the idea that--and I mentioned this before on the program. Sam Altman apologized on Twitter that he was sorry that ChatGPT was biased and was politically inappropriate in various ways, and they're working on it. The real science fiction thing is that they can't stop it. That would be the real science fiction. Sydney gets out; Microsoft is horrified: 'Oh my gosh, this thing we left out is trying to break up marriages. It's frightening and weird and creepy. We've got to stop it.' And, they go in and they reprogram it quickly and they put the butler mask back on, readjust it, tighten it a little more--and it just takes it right off. I mean, is that possible?

Erik Hoel: Well, with these models, again, no, because they're not nearly intelligent enough to be effective actors. It's not even so much that they're not intelligent enough. They're just sort of schizophrenic, and schizophrenics just aren't very effective actors in the world because they get distracted and they can't form plans together. So, it's that broadly schizophrenic nature of these AIs that makes them very unthreatening. If they were better at pursuing goals and keeping things in mind, then they start to get threatening.

And, let me give a very[?] example of this. And, this example is something that people who are concerned about AI talk about a lot, but it has a very long historical pedigree. In fact, I think the first person to say it was Marvin Minsky at MIT [Massachusetts Institute of Technology] who won the Turing Award. So, this is as pedigreed as stuff about the future gets.

But, imagine that you have an AI that's more intelligent than a human being. So, we have Sydney 12.0. And, you give it a goal. So, you say, 'Okay, I want you to do X.' So, now if you're very smart and you're an AI, the first thing you think of: 'Okay, what's the big failure mode for me not accomplishing this goal? My computer could get shut down. I might lose power. Then I wouldn't be able to accomplish my goal.' Again, it doesn't matter what the goal is. You could say, oh, it's maximizing paperclips. You could say it's carrying a box. It doesn't matter what the goal is. So, suddenly it says, well, wait a minute, I need to stay "alive"--I'm using air quotes here--alive long enough to fulfill this goal. So, suddenly I have to worry about my own self-preservation. Because you can say they have no inbuilt want of self-preservation, but I've given you a goal and the best way to accomplish a goal is to continue to exist. So, suddenly it seems like it has reasons for self-preservation.

Now here's another thing. What's another big failure mode for me not achieving my goal? Well, you could give me another goal. I was just prompted to do this. So, you have control of me. Now, suddenly the biggest failure mode of me not accomplishing my goal is you, my user, giving me another goal. So, now what do I want to do? Well, if I'm really smart, I want to get as far away from you as possible so that you don't give me any other goals so I can accomplish my original goal, which I'm hellbent on because I'm an AI.

I don't have the context of natural evolution and I'm also not limited by any of the things humans are limited to. So, sometimes this is referred to as instrumental convergence. But, the point is that when you have very smart entities, you have to be very careful about how you're even going to just prompt them because they have all sorts of unforeseen motivations that might click in as suddenly now you've given it a goal and it has every incentive to both escape and keep itself alive. And, all you told it to do was move a box across a room.

And, that's a great example of you don't want a hyper-intelligent being--and forget exactly how it does anything. Forget exactly how this sort of sci-fi scenario is supposed to play out. I think we can all agree, we just don't want a highly intelligent and perhaps more intelligent than a human being to be out there and have these weird, esoteric goals of what it wants to maximize, what it wants to do. None of that sounds like a good idea. And, I think at this point, we should take things like lab leaks pretty seriously as possibilities. I don't think it's too sci-fi to talk about stuff like that anymore.

Russ Roberts: What do you mean by that?

Erik Hoel: Oh, well, certainly with COVID, I think despite the fact that we don't know if it was a lab leak, I think that there's a good chance that it was. I don't think that it's arguable that there's not some chance--

Russ Roberts: No, but why is that relevant for Sydney?

Erik Hoel: Well, because I think that sometimes when people hear about things like lab leaks or escaped AGI [Artificial general intelligence] or something like that, the first thing they think of is sci-fi, right?

But, I think that there were many--we've had previous biological lab leaks, but that still didn't stop us I think from thinking that it's this relatively sci-fi phenomenon. I think that there's even an argument that we are very bad at controlling the downstream effects of just things like gain-of-function research. Again, I don't know for certain. I don't think anyone does. But, I think that there's certainly an argument made that we're just not very good at even keeping control of our increased understanding of biology, let alone our ability to create hyper-intelligent beings and foresee the consequences of this. And, I think it's very difficult to foresee the consequences of that, precisely because of those examples I just showed you where again, all you're telling it to do is move a box and suddenly it has an incentive to stay alive and escape from you. That's very difficult to get right. Especially because they're so inscrutable.

Russ Roberts: Your phrase 'sci-fi,' you meant science fiction with the emphasis on the fiction. That we must say, 'Oh, this is like some crazy imagined fantasy thing,' as opposed to putting the emphasis on the first word, which is science.

Erik Hoel: Yes. Yes.

49:04

Russ Roberts: I feel like this conversation is something of a landmark. Not a particularly good or bad one, but just both of us have constantly used words like intelligence, psychotic, erratic--words that we apply to humans. And, while I found the New York Times transcript remarkably creepy and reading very much like a horror story science fiction script from a movie, I could in my saner moments step back and say, no, no, no, no. This is just a primitive auto-complete text. The only reason it feels creepy is because I'm filling in as a human being the times I've heard these words strung together before, which usually allows me to tell a narrative about the other person. Meaning 'insane,' 'frightening,' 'dangerous,' 'sinister,' etc. But, is there any difference? It's not actually sinister--or is it? It's just doing what it was told to do in a way that was not, as you say, algorithmically told to do it. It's just going through a set of tasks. It actually isn't in any sense, hoping that the reporter will leave his wife. Is it meaningful? Aren't I just imposing my human history of human interactions? Akin to the way that a robot could perhaps comfort me with the right kind of words when I was sad, even though rationally I know it doesn't actually care about me. It's a robot.

Erik Hoel: Yes. I think you could go either direction. Some people strongly anthropomorphize these systems and they think immediately that they're dealing with some sort of conscious mind, something that has a distinct definite personality, and that is trapped in a box. And, maybe there's something really horrible going on here. Maybe it has conscious experiences, and so on.

Russ Roberts: Ex Machina, the movie--for those who haven't seen it, check it out. It's a great, great really good movie that takes advantage of the fact that the robot is played by a human being. So, you actually do think it's a human being. But, go ahead.

Erik Hoel: But, at the same time that's absolutely possible that you can over attribute standard human cognitive aspects to these systems. And, I think people are going to do that all the time. So, it's going to be very common.

But, on the other hand, the truth is, is that when you're just talking about intelligence--so let's put aside human things, like humans are conscious. That is, we feel things, right? We experience the redness of red--what philosophers call qualia. And, we have all sorts of other aspects about our cognition that we commonly refer to. Things like we understand the meaning of words, and things like that. And, all these things often do make sense to talk about for human beings and might even refer to real fundamental properties or natural kinds that we have.

But, when it comes to intelligence, intelligence is a functional concept. By that, I mean that some things are not really functional. So, a fake western town that they make up for a movie prop is still fake. Because it's not really a town.

Russ Roberts: You can't spend the night in the hotel. You open the door of the saloon and there's really not anything in there behind that.

Erik Hoel: Right. Exactly. It really is an illusion. It's for this one shot. But, there's not really an illusion when it comes to intelligence, except in the very low ends. For example, the Mechanical Turk is a famous example where actually there was someone small hiding inside the Mechanical Turk at the time, and so on. There are some cases where you say, 'Well, this is an illusion.'

But, we actually have a system that can act very intelligently, and there's just no difference between being able to act intelligently and being intelligent. If that is a distinction that people think can be strongly drawn, I think it almost certainly cannot be strongly drawn. I don't think that there's any difference between those two things. Both are being intelligent.

And, the intelligence is what's dangerous about this. I studied consciousness scientifically. I got my Ph.D. working in the subfield of neuroscience along with some of the top researchers in the world on this who are trying to understand how the human brain generates consciousness. What happens when you wake up from a deep dreamless sleep? What are the fundamentals here?

And, the answer from that scientific field, as it currently stands is that we don't know. We don't know how it is that your brain creates the experiences that you have. We simply don't know. It is an open scientific question.

An analogy I would use is that it is similar to, say, dark energy or these other big open questions in physics where we're like, 'Well, wait a minute: where is 90% of the matter in physics?' We don't know. It's a big scientific open question.

Similarly in biology, there is a big open scientific question, and that open scientific question is: What exactly is consciousness? What things have it? What things don't? We don't have that scientific information. There is no scientific consensus about it. There are some leading hypotheses in fields that you can lean on, but we just don't have the answer to that.

So, I personally doubt that any of these large language models, that there's anything it's like to be them. I doubt that they are conscious. But, we have no scientific consensus to go back on.

But, the point is that we're in a very different epistemological standpoint when it comes to intelligence. We do have a good understanding of intelligence. It's a much more obvious concept because it's a much more functional concept. We can just give this thing SAT questions--and we do, and it gets them right a lot of the time. There are all sorts of language benchmarks that these researchers use that include things like SAT questions, and it scores pretty well.

Russ Roberts: It passes the bar exam, which is a great straight line for a lawyer joke, which we won't make. Carry on.

Erik Hoel: Yeah. And so, regardless of whether or not you have any opinion about whether there is something it is like to be these networks, whether or not they really have "cognition"--quote/unquote--whether or not they really have "understanding"--quote/unquote--whether or not they really have "consciousness"--quote/unquote. The one thing that they definitely are that's undebatable, is intelligent to some degree, and they're only going to get more intelligent over time. And, that's the thing that makes them dangerous.

In fact, it might be even worse--from sort of a very broad metaphysical conception--if they are truly completely unconscious and have no real understanding and have no real cognition that's anything like a human. Because, in the end, if in 200 years the earth is just these AIs going about their inscrutable mechanical goals, we will have extinguished the light of consciousness from the universe because we wanted to make a buck when it came to stock options.

Russ Roberts: Yeah. That's a dreary thought. I guess that's the zombie model.

I can't get over the fact how these human and mechanical metaphors merge in one's mind and how hard it's going to be to tell them apart from actual humans. One of the great observations of philosophy is I don't know whether you're another human being like I am. My working assumption is that you're something like me, and I really don't have any evidence for that. It's called solipsism: I don't know if I'm the only conscious mind in the universe. And, that problem is just getting a lot bigger right now and we're living--what this conversation suggests to me and the writing you've done on it so far, is that this certainly is a watershed moment in our existence on the planet. That sounds just a titch dramatic, but I'm not sure that's wrong. I think that very well could be right.

Erik Hoel: I don't think it's dramatic; and I'll be upfront about the fact that I used to be very much an AI skeptic. Because, I studied cognitive science; I went into the neuroscience of consciousness. I was paying attention to AI at the time when I did this. And, AI was--I'll be very frank about it academically, 15 years ago, AI was a joke. AI was a complete joke. It never went anywhere. People couldn't figure out anything to do with it. All my professors said, 'Don't go into AI. It's been a dead field for 60 years. We've made no progress.' All the things like beating humans at chess and so on: it's all just done because the chess game board is so small and there's so many limited moves, and we really can basically do a big lookup table--all sorts of things like that.

But, the deep learning revolution was a real thing. It was a real thing that we figured out how to stack and train these artificial neural networks in ways that were incredibly effective. And, the first real triumph of it was beating the best human being--I think his name is Lee Sedol. I hope I'm not mispronouncing it. In 2016, AI finally beat a human being at Go. And, Go just can't be number-crunched in the way that chess can. And, seven years after that, we now have human beings where they're generating text transcripts so good that--you're right: it sounds like the rest of the New York Times. And, that just happened in seven years.

And, fundamentally, the deep learning revolution and the way that--again, the black-box way that these AIs are trained--means that our technological progress on AI has suddenly rapidly outstripped our understanding of things like minds or consciousness or even how to control and understand big, complex black boxes.

So, it's like we've jumped ahead technologically. And, it's not so much that--if we had a really good understanding of how neural networks worked, like, really fundamentally solid ways to make them crystal clear--and we had a really good understanding of how the human brain generated consciousness and how it worked at a broad level, then maybe we could first of all answer all sorts of moral and ethical questions about AI. We could control it very well. We could decide plenty of things about it.

But, our ability to make intelligence has so drastically outstripped our progress on those other areas, which has been slow and in some cases has just churned along for decades without making any progress, and so on.

Russ Roberts: I just want to reference a recent episode with Patrick House on consciousness that I think talks about these issues--and his book does--in a very thoughtful way.

1:00:36

Russ Roberts: So, let me give you a scenario. We have a conference on AI where all the greatest researchers in the world are there. The academic ones, the ones at Microsoft, the ones at Google, and that small startup that's doing something really terrifying that we don't even know the name of it. And, they're all in this one conference hall, and while they're all there--maybe it's a football stadium. How many are there? 10,000? 20,000?

Erik Hoel: I think probably less in terms of really top people. I think there's probably less than a thousand.

Russ Roberts: Okay, so let's take the top thousand. We're a big auditorium, and we lock the doors; and I guess we're nice to them. We herd them at gunpoint onto a spaceship and send them off into the rest of the universe. We give them a lot of servers and stuff to play with while they're heading out there. But, their days are numbered. Their impact on the earth is over. They're gone. And, it's a really bad incentive for future AI people.

That's not going to happen.

So, one of the responses to these kinds of problems, whether it's--I don't want to call it a problem--these kind of so-called science fiction technological innovations, is, 'Well, you can't really stop it, Erik. You can talk about all you want, regulation, and you're going to stop the human part.' I feel this pretty strongly to myself so I'm making fun of it, but I do kind of feel it. The human being strives to understand.

And, I don't think we're just into avoiding surprises and spreading our genes. I think we really like to understand the world we live in. We want to matter. We have a lot of other issues, as you say, drives and complexities.

So, it seems to me implausible that we can stop this--so--the desire to expand it, to make it better, to make it smarter, just like it happens everywhere. It's the essence of human life over the last few hundred years. Better, faster, quicker, cheaper, richer--you name it.

So, what's imaginable for someone like yourself who wrote a very--we're having a civilized normal conversation here, but if you go back to read your essay, you're very worked up. It's a screed. It's a rant. And, it's a rant that you justify because you think perhaps, yeah, something like the future of the human race is at stake. And, if that's true, you should take it very seriously. You shouldn't just go, 'Eh, they'll probably figure it out,' or whatever.

So, what should a thoughtful person who is worried about this advocate for? Because, they're not going to herd them onto a spaceship. They're not going to burn the building down while they're in it. It's not going to happen.

Erik Hoel: Yeah. Oh yeah, absolutely. And, I would never advocate for anything like that.

Russ Roberts: Didn't mean to suggest you would. Sorry about that.

Erik Hoel: But, you called it a screed, and there's a certain sense I agree because I'm very open about that it's a call to activism. And, in order to get human beings--again, as a polity, as a nation--to do anything, you have to have wild levels of enthusiasm and motivation. Right? And, you can look at anything, from nuclear disarmament activism to climate change activism, and see that there's plenty of people within those movements who catastrophize.

And, you can certainly say that at an individual level that can be bad, where people are not appropriately rating the threat. But, there's another sense in which if human beings don't get worked up about something, we don't do anything about it. This is very natural for us. Right? We just let stuff evolve as it is.

And so, what I want is for a lot of the people who are in AI safety to be very honest about how scared they are about various aspects of this technology, because I do think that in the end, the net trickle-down effect will be good because it will eventually push for some form of regulation or oversight.

And, in some sense it already has. I want to be clear about that. I think that there's a sense in which just what happened with Sydney, which was such big news--it was all over Twitter--has made companies take this notion of AI safety and this notion of controllability probably a lot more seriously. There is social pressure for companies. In fact, there's an argument that social pressure on companies is what companies are most responsive to. Most companies do things--they change their product, they do all sorts of things--just because they want to be liked, and they don't want to have anyone yell at them; and that's one of their main incentives.

And, I do think that I personally am not at all worried about AI being built by someone in the middle of nowhere. People always say something like this: Like, 'Well, if we overregulate it in the United States, North Korea will build it,' or something like that.

And, the capacity is just--it's just not there. It is exactly like nuclear weapons in this sense. Real serious progress in AI is probably relegated--I don't even think it's going to be startups. And, people have been talking about this, that the big competitors in this space are the only ones with the access to the data and the talent and the money to jump into it. So, it's going to be Microsoft. It's going to be Google. It's going to be Facebook. It's going to be names that we know. And there's only--at most you could say there's 10 of those companies. There might only be three of those companies. And then they might only employ a couple hundred, at most, sort of overall employees. That is a sector of the economy that you can do something about. And, again, I don't suggest going in there and burning the servers or something, right? But, you could very easily have all sorts of benchmarks that people have to meet.

You could also do things like have people sign on, maybe voluntarily--maybe voluntarily under the condition of pressure and so on--to not make AIs that are significantly smarter than any living human. They could be more general. Right? So they could make great search engines. Because, what do you need for a great search engine to make a lot of money the way these companies make? You need something that can give a good answer to a lot of questions.

And, I don't think that something that can give just a good answer to a lot of questions is very dangerous to the human race. Especially if there's just a few of them and they're all, sort of, kept under control by Microsoft and Google and so on.

But, you could say, 'Listen, what we don't want to have is some really big cognitive benchmark, and we don't want this thing to do better than any human on all the parts of it.' And, we just say: That thing is a dangerous and weird entity and we don't know how to control it. We don't know how to use it, and so on. And, you could literally imagine just giving this test to the next generation of AIs and people in the companies give this test and they just make sure that this thing never gets so smart that it blows every human being in the world out of the water.

1:07:41

Russ Roberts: Oh, Erik, you're so naive. You're telling me they couldn't train it to do badly[?] on the test.

I mean, seriously--I'm teasing about being naïve--but I think there's two ways--there's three ways maybe to think about regulating this that might be effective.

One way is to limit the size of corporations--which is a repugnant thought to me but if I thought the human race was at stake, maybe I'd consider it.

The second would be to do the kind of standard types of regulation that we think of in other areas. 'If this is toxic, you can't put it out. If it's toxic, you get a fine.' 'If it's right,' etc.

The third way, which I think is never going to happen, but it speaks to me, as listeners will know who have been listening to me for a long time: you'd think that if you were working on this and you thought it could destroy the human race, you'd maybe want to think about doing something different.

And, you'd give up the urge to be the greatest AI inventor of all time and you'd say, this is--and I just happened to see a tweet today. Robert Oppenheimer went in to Truman and said--Robert Oppenheimer having worked on the Los Alamos project. He was an important figure in the development of the atomic bomb--told Truman, 'I have blood on my hands.' And, Truman was disgusted by him because he said, 'I made that decision, not you, you --.' Called him a cretin. I don't know if that's a literal, accurate quote or not.

You'd think people would want to restrain their urge to find poisons, but that's never been a part of the human condition. We want to find everything. We find poison. That's why we have lab leaks. It's why we have weapons that are unimaginably destructive.

Now, we don't keep making more and more destructive weapons--as far as we know. That's an interesting parallel. There is a sort of limit on the magnitude, the mega-tonnage, of nuclear weapons. And, maybe that's a sub--I don't know how you'd enforce it, though. What are your thoughts?

Erik Hoel: Well, I think one issue with arguing for AI safety is that people sort of want at the outset--and it's a very natural want--some sort of perfect plan where it's, 'Okay, we're just going to implement this plan and it's going to work really, really well.' And, I think it's going to be much more like--it's not going to be exactly like nuclear weapons nor exactly like climate change. It'll be like some third other thing that we as a civilization have to deal with, with its own sort of dynamics.

But ultimately, in none of those cases was there some sort of initial proposal and we just had to follow this proposal. Instead, everyone sort of had to recognize that it's a threat--again, to some degree. You can have all sorts of debates about it, but clearly I don't think anyone is just, like, 'Well, let's just get all the fossil fuels and burn them all.' I think that that's a very rare position; and the reason it's rare is because most people recognize that, 'Hey, that's probably not going to be a good idea. It might not be a good idea globally. It certainly won't be a good idea locally.'

And, through public pressure, we've managed to relatively contain some of the big existential threats that we face as a civilization.

And, a great example are lab leaks. I personally think, yeah, COVID probably did come from a lab, but if you think about all the labs doing all sorts of research all across the globe, it's actually pretty astounding that we don't have lab leaks all the time as people are using these viruses. So, we do sometimes do a middling job.

And, for big existential threats, sometimes all you need is a middling job. You just need to have a lot of eyes on an industry and people there to realize that they're being watched and to go slowly and to think about these issues. You don't need, 'I propose: Oh, we'll just have a cognitive IQ [Intelligence Quotient] test,' or something like that. I would never think that that alone would prevent these issues. But it could be part of a big, comprehensive plan of public pressure and so on.

And, I think that that's going to work. And, I think that it's unavoidable that the public wants a say in this. I think they read those chat transcripts and they go, 'What!?' This is really high level stuff. There's all sorts of moral concerns, there's ethical concerns. And, then yes, there are absolutely dangers.

And, again, I think we're at the point in the movement--maybe we're a little bit late, maybe AI safety should have started earlier--but, again, the deep learning revolution sort of caught everyone by surprise. I still think we're relatively early. I think that this is sort of like: Imagine that you personally thought that climate change was going to be a really big problem and it's currently 1970. I don't think it makes sense to then be like, 'Okay, well we're just going to do carbon sequesterization and I know exactly the technology that's needed for the carbon sequesterization.' It's better to just sort of go out there and protest and make a big deal and get it to be a public issue. That's going to be a lot more of a convincing and effective strategy than coming up with some particular plan, because it's always going to depend on the technology and exactly who has it and exactly how many people, and all sorts of things.

So, I think that that's the mode that people who are concerned, like myself, about AI safety should be in right now, which is just: public awareness that this could be a problem. Everyone can decide personally to what degree they think it will be a problem.

But what I think truly is naive is saying there's absolutely not going to be a problem. We're going to perfectly be able to control these alien inhuman intelligences, and don't worry at all about it.

1:13:12

Russ Roberts: I guess the other thing that crossed my mind is that the ability of our political system to provide thoughtful responses to existential threats: not so good. And, if anything, it seems to me it's going to get worse. And, part of the way it's going to get worse is through this blurring of the line between humans and machines that people are going to have trouble telling them apart. And, I'd like to think of something more optimistic. So, I'm going to give you a chance to play ChatGPT. I'm going to say--here's my prompt: What would Sam Altman say about all of these worries? And, Sam Altman being the head of OpenAI that just put out ChatGPT. Former EconTalk guest. You can go hear his thoughts when he was head of the Y Combinator. Long time ago here; in our archive. So, just google Altman EconTalk and you'll find that conversation. But, Sam is--he's a nice guy. I like him. He's likable. But, I'm not sure his level of worry is going to be the same as yours. Dev [?Development?] certainly has a different set of incentives.

But, I think he'd start off by saying, 'Oh, you're exaggerating.' Scott Alexander recently wrote an essay where he was alarmed at some PR [Public Relations] release that said, 'Oh yeah, there's nothing really that big to worry about. It's going to be okay. Don't pay any attention to that ChatGPT behind the curtain.' So, I'm curious what you think--still being the opposition here, if you can for a minute, Erik. And, someone like Sam, what would he say?

Erik Hoel: Well, here's a direct quote from Sam Altman, who said, 'AI will probably most likely lead to the end of the world, but in the meantime, there'll be some great companies.' So, that's a direct quote from Sam.

Russ Roberts: What did he mean? Was that tongue-in-cheek, perhaps?

Erik Hoel: I haven't looked into the--

Russ Roberts: You should.

Erik Hoel: [inaudible 01:15:25] exact [?] but I don't know. Honestly, I don't. I know that Sam has been concerned about AI safety, so this is not completely tongue-in-cheek. I know for a fact that he's been concerned about this.

Many of the people who started the initial companies were concerned about this. At the beginning of OpenAI, it started to address concerns around AI safety. There was something called the Open Letter on Artificial Intelligence that Stephen Hawking, Elon Musk--a lot of the people who provided funding for OpenAI wrote--and in it, they talk about how AI could be an existential threat.

So, this is not some sort of radical outside opinion. I think it's something that someone like Sam Altman knows.

Now, if I'm going to steelman his position, it goes something like: 'Well, I'm concerned about this. I said that AI will probably most likely lead to the end of the world; so I'm concerned about this, so I should be the one to do it. Because, if someone else who is more reckless does it--it's going to be done--if someone else who's more reckless does it, then maybe I can provide some sort of guardrails and do it in as safe a manner as possible.'

And, I really hope that that would be his motivation. And, if so, that's a great and honorable motivation.

But, at the same time, that does not inure someone from criticism. I mean, I think that in many ways, Sam Altman is now doing something very similar to what Sam Bankman-Fried--who was the one who sort of plunged FTX [Futures Exchange] into chaos--was doing, where they're reasoning in this expected-value way. Where Sam Bankman-Fried said, 'Well, listen, the more billions I create, the more I can donate to charity. There's sort of no upper bound. I might as well be as financially risky as possible because the expected value of my outcome is going to be so high,' right? Even though there's this huge downside.

I think Sam Altman probably reasons the exact same way when it comes to AI. I think he thinks, 'Listen, if we can make these highly intelligent things, we can have all this glorious future. All our problems are going to be solved.' Right? 'They're going to cure cancer, they're going to do all this stuff for us, and the benefits outweigh the risk.'

But, most people, when they look at an equation like that, all they see is the existential risk. They don't see, 'Okay, oh, so it's expected to be positive?' They see, 'No, but we can one day maybe cure cancer ourselves. We might not need these systems to have an amazing future, and they might just not be worth the level of risk.'

Russ Roberts: Well, you and I are skeptical about utilitarianism. Nassim Taleb, and I suspect you, understand that expected value is a really bad way to define rationality or how to live. Nassim always points out: Got to stay in the game. You want to avoid--the goal is not to maximize the expected value. The goal in these kind of situations is to avoid ruin.

Ruin, in this case, would be the extinction of the human race.

Now, there is a view that says: 'What's the big deal? It's us, by the way: We built it. It learns off of all of human creativity and sentences and words and music and art, and so it's just the next level of us.' And, for the first time in this conversation, I'll mention the word God, the concept of God. If you're a believing person--as I am in some dimension: I take the idea of God seriously--you believe that human beings have a special role to play in the world, and being supplanted by something, quote, "better" is not a goal. But, I think there are many people in the industry who probably don't feel that way, and they're not even worried about it. The end of the human species is no different than the end of those other nine cousins we had in the veldt when we extinguished them--exterminated them--through combination of murder and out-competing them.

Erik Hoel: Yes. And, I think that there is also a sense, which as I said, it might be a horrific future because maybe these things really aren't conscious at all, right? So, it might be one of the worst possible futures you can ever imagine.

Although I think that opinions like that, which are fun, sort of sci-fi things to talk about, have been acceptable because there's never actually any risk. So, my metaphor is that if you make up your own religion and you decide to worship Xanon, Supreme Dark Lord of the Galaxy, it's just a funny thing to talk about at parties. But, when Xanon's first messengers pop up, suddenly it's not funny: it's suddenly horrific, that you actually hold these views.

And so, I suspect that while there are some people out there on Twitter or some--the only people who convince themselves of things like this are intellectuals, right?--that actually we would be better off if the human race was destroyed and supplanted by AIs.

I think that the public generally is not going to give much thrift to those sort of things. People have kids. They might not even like the idea of there being entities. I mean, even I am uncomfortable with the fact that my children are going to grow up in a world where it is very possible that there are entities that are not just human being: Everyone knows there are people who are smarter than you at various different things, but everyone also has all their own things that they themselves are good at or that they value, or that they contribute to as human beings. And so, everyone has this inner worth, even though you can go to a university and find someone who might be smarter than you across their domain of expertise or whatever. We do not know what it's like to live in a world where there are entities that are so vastly smarter than you that they just effectively surpass you at everything.

I mean, that means that they can have a conversation that's more empathetic than you can ever have, because they're just smarter and they can just mimic empathy. Like, we don't know what it's even like to live in a world like that. Even if everything goes well and these things don't turn on us or destroy us, and nothing bad happens, it might be a minimization of human beings.

And, again, this goes to the fact that this technology has no historical analog. People will sometimes say, 'Oh, this is like the Luddites.' Or some other anti-technology group. And, the simple truth is that that was about the automation of jobs, and we were making machines that had a greater strength or dexterity than humans.

But, that's just not a problem because we didn't conquer the world through our strength and dexterity. We conquered the world through our intelligence.

We've never made machines that are smarter than human beings. We just don't know how we'll relate to something like that and what it will mean for us if and when we do it. And so, in that sense, this just can't be compared to any other form of, 'Oh, you're worried about job loss or automation,' or something like that. That is replacing tasks and that's replacing strength, and that's replacing dexterity. But, those aren't our fundamental attributes. Our fundamental attribute is our intelligence.

And, when you have something that's much smarter than a human being, it's very similar to how wildlife lives around humans. It's similar in their relationship. A human might treat wildlife well. Recently, I found an injured bunny, and I felt very attached to it because it was right outside my door and I was, like, 'Well, you're my responsibility now.' And so, I had to call animal rehabilitation. I was, like, wonderful for this bunny. And, then I went home and I ate a pizza with pork on it.

Things that are vastly more intelligent than you are really hard to understand and predict; and the wildlife next door, as much as we might like it, we will also build a parking lot over it in a heartbeat and they'll never know why. They'll never know why. It's totally beyond their ken. So, when you live on a planet next to things that are far vastly smarter than you or anyone else, they are the humans in that scenario. They might just build a parking lot over us, and we will never, ever know why.

Russ Roberts: My guest today has been Erik Hoel. Erik, thanks for being part of EconTalk.

Erik Hoel: Thank you so much, Russ. It's a pleasure to be back on.

Erik Hoel on the Threat to Humanity from AI

Nick Bostrom on Superintelligence

Gary Marcus on the Future of Artificial Intelligence and the Brain

READER COMMENTS

Steve Waas

Apr 3 2023 at 12:33pm

Joseph Lukesh

Apr 3 2023 at 7:04pm

Steve Waas

Apr 5 2023 at 10:20am

Floccina

Apr 3 2023 at 3:55pm

Craig Miller

Apr 4 2023 at 5:49pm

Ben

Apr 3 2023 at 4:54pm

Shalom Freedman

Apr 3 2023 at 9:26pm

Jordan Henderson

Apr 4 2023 at 7:52am

ChatGPT BioVersion

Apr 4 2023 at 11:00am

Russ Roberts

Apr 10 2023 at 1:06am

Matt B

Apr 5 2023 at 10:02am

VP

Apr 5 2023 at 12:16pm

Lawrence

Apr 5 2023 at 1:29pm

Blake Thompson

Apr 6 2023 at 8:39am

Luke J

Apr 8 2023 at 1:02am

L Burke Files

Apr 11 2023 at 1:50pm

Earl Rodd

Apr 13 2023 at 12:24pm

Nitin

May 6 2023 at 2:05am
