Episode 30: Derek Leben on Ethics for Robots and Artificial Intelligences

It's hardly news that computers are exerting ever more influence over our lives. And we're beginning to see the first glimmers of some kind of artificial intelligence: computer programs have become much better than humans at well-defined jobs like playing chess and Go, and are increasingly called upon for messier tasks, like driving cars. Once we leave the highly constrained sphere of artificial games and enter the real world of human actions, our artificial intelligences are going to have to make choices about the best course of action in unclear circumstances: they will have to learn to be ethical. I talk to Derek Leben about what this might mean and what kind of ethics our computers should be taught. It's a wide-ranging discussion involving computer science, philosophy, economics, and game theory.

Support Mindscape on Patreon or Paypal.

Derek Leben received his Ph.D. in philosophy from Johns Hopkins University in 2012. He is currently an Associate Professor of Philosophy at the University of Pittsburgh at Johnstown. He is the author of Ethics for Robots: How to Design a Moral Algorithm.

0:00:00 Sean Carroll: Hello everyone, and welcome to The Mindscape Podcast. I'm your host, Sean Carroll, and in today's episode we're gonna see where the rubber hits the road in moral philosophy, and I mean that quite literally. You've all heard about self-driving cars, and you may have heard about the idea that self-driving cars are going to have to solve the trolley problem. This is the famous thought experiment in philosophy where you can either continue on your current course of action and several people will die, or take an action to change course and fewer people will die. Is it okay to intentionally kill a smaller number of people to save a larger number of people?

0:00:40 SC: You might not think that this is something you are going to need to deal with, but it's simply an illustration of the kinds of problems that all sorts of robots and artificial intelligences are going to have to deal with. They're going to need to make choices, and in the extreme examples, they're going to need to make hard choices about how to cause the least harm. As one example: if there are two bicyclists in the way and a self-driving car judges that it's going to have to hit one of them, should it target the bicyclist with a helmet rather than the one without, on the theory that wearing a helmet makes that bicyclist safer? In some sense that person would be getting punished for wearing the helmet, and that doesn't seem right.

0:01:24 SC: These are moral intuitions that lead to really hard problems, and we have to face up to them. Today's guest, Derek Leben, is a philosopher who has written a new book called "Ethics for Robots," where he tackles exactly these questions. Not just self-driving cars, but the general question of what kind of moral decision processes we should program into our artificial intelligences. I think it's just a fascinating topic to think about, because on the one hand, Derek's book involves big ideas from moral philosophy: utilitarianism versus deontology, John Rawls' Theory of Justice, things like that. On the other hand, very down-to-earth questions about game theory, the prisoner's dilemma, Nash equilibria, Pareto optimality, and other economic and rationality-oriented ideas need to come into play here.

0:02:13 SC: So to me, it's a great example of how the abstract theorizing of philosophy suddenly becomes frighteningly relevant to real-world decisions. Personally, I do not own or have plans to buy a self-driving car in the near future, but I do think they're coming. Moreover, artificial intelligences of all sorts are all around us and have an increasing effect on our lives. Therefore, we should be thinking about these issues, and now is as good a time as any. Let's go.

[music]

0:02:57 SC: Derek Leben, welcome to The Mindscape Podcast.

0:03:00 Derek Leben: Thanks so much for having me.

0:03:01 SC: So this is a topic, ethics and morality for robots and artificial intelligence, everyone's thinking about this. We all know self-driving cars are coming, and they're gonna be apparently running into people on the streets, right and left, just deciding how many people to hit. Can you just give us your short version of why it's necessary even to talk about ethics or morality for robots? I mean, do they even have ethics, should they just do all we program them to do?

0:03:30 DL: Yeah, so that's a great place to start. As we are starting to develop more and more autonomous technologies in the fields of transportation, medicine, warfare, they are starting to make these decisions that are going to have impacts on human health and safety and opportunity. And for that reason, we need to start thinking about which actions are permissible for them to do, and which actions are impermissible. And I take this to be just an inescapable fact of designing machines that are making complicated decisions about human well-being, that are going to be making these decisions without very much human supervision, right? We're just gonna have to inevitably decide what kinds of rules we want these machines to be following. Now, I'm not sure if we could use certain words to talk about these machines, like responsible or not, or blameworthy or not, but that's sort of beside the point for me. What I'm most interested in is what are the rules that we're actually going to program into these machines, because that's definitely something that we need to do.

0:04:44 SC: Yeah. You have used the word decisions as if the robots have the ability to make decisions. Is that an important word, or is it just a way of talking about the fact that the robots are gonna be doing something and we have to decide what we want for them to do?

0:04:58 DL: That's interesting. For me, it doesn't really matter whether you call this a decision or an algorithm. I know that maybe some people might think of decision as something performed by an agent with free will, who could have done otherwise, or something like that. But for me that's not too important, I'm not going to get hung up in those kinds of issues. I mean, certainly some moral theories would, like a Kantian about ethics is going to say that only an agent with free will, who understands what he or she is doing is capable of being a moral agent who makes decisions at all. But that may already reveal something about my normative assumptions that I'm making. And that's something that I think we'll see as we move forward, that every kind of choice you make in programming a machine is revealing something about your normative assumptions. It's sort of impossible to stay free of ethics, to just say, "Well, I'm going to avoid ethics entirely."

0:06:02 SC: Right. Well, I agree with that very much, and also I'm happy that it doesn't matter whether we attribute free will to the robots, because I hate talking about free will and yet I end up doing it all the time, so I'm glad that we can avoid doing it for this one. Okay. So let's get one thing out of the way, which I'm sure you've been hit with before. What about Isaac Asimov, didn't he explain to us how to make moral rules for robots, doesn't he have three laws and shouldn't we just implement them?

0:06:29 DL: Yeah. So Asimov did propose these three rules, which are really just one rule, and two sort of subservient rules that say obey people and protect yourself. But the first rule is really the one doing all the work, and it says: Don't cause or allow harm to other humans. Now on the face of it, that's actually pretty good. And I think the reason why there's a sort of variety of moral theories and a variety of different kinds of rule systems that we've constructed, is that most of them do pretty well in normal circumstances. It's a bit like... Excuse me. I'm gonna make the first of many analogies to physics here, because I love to make analogies to physics and now is my chance. It's a bit like using classical mechanics in most normal circumstances, where if you're not going very fast and you're dealing with moderately sized objects, this works really well.

0:07:34 DL: However, when you get to these very sort of extreme situations, you start to see differences between the moral theories, and the problem with Asimov's laws is that they're vague about certain situations: the violation of property rights, the violation of dignity, insulting people. Does that count as harming them? Does trespassing on their property count as harming them? Does blowing smoke in someone's direction count as harming them? And they also fail in these situations where every action either does or allows harm to others. These are called moral dilemmas. In some scenarios, you can't avoid either causing or allowing harm to others, and Asimov's laws simply break down.

0:08:20 SC: Yeah. To be perfectly honest, the question that I asked you was because I feel I have to ask it, but I think you're being far too generous to Asimov's laws. I think they're just silly. The idea that a robot cannot through inaction allow a human being to come to harm is entirely impractical; human beings come to harm all over the world all the time. Every robot would instantly spring into action trying to prevent every human from stubbing their toe, right?

0:08:50 DL: Yeah, exactly. And this is a point that philosophers like Peter Singer have made for decades, is that almost everything we do in the world has some kind of effect on people that we might not even be aware of. Singer has gone to great lengths to show that the way that we eat, the way that we travel, the way that we spend our money is actually having effects on other people. We could be doing other things with that money, with that food, perhaps driving around in our cars. We're not thinking about the damage that we're doing to the environment, and the future generations.

0:09:22 SC: And I also like the idea that you need to be a little bit more specific. I think a lot of people do have this idea that... The example I used in my book, The Big Picture, was from Bill and Ted's Excellent Adventure, where we heard the moral rule that you should just be excellent to each other, and I tried to make the point that that's fine, but it's not quite good enough. [chuckle] You haven't told us what excellence is. And actually, recently I got into a Twitter conversation with Ed Solomon, who was the screenwriter for Bill & Ted's Excellent Adventure, and he was very pleased, even if I was saying it's not a good enough rule, that it made it into this kind of consideration. And certainly, when robots are on the scene, we have to be a little bit more specific, a little bit more clear, a little bit more quantitative maybe, about what constitutes a morally correct action. Is that right?

0:10:10 DL: That's right. What you're describing is a theory that you may be familiar with, virtue ethics, and it basically says, "Do whatever a noble person would do." Now, this kind of works if you have a good exemplar of a noble person handy, but then there are all sorts of problems, like how do you know that this person is actually a noble person whose behavior we should be trying to imitate? And a noble person in one culture might be a very, very terrible person in another culture. And so that leads to this problem that you were just talking about, which is: where do we look for these more precise, more quantitative, more formal approaches to designing moral algorithms? And there's a lot of different places. One place we might look is in human judgments, and try to model machine behavior after human behavior. Another place we might look is to moral theories, and try to actually take a historically important theory and implement it in a machine.

0:11:16 SC: And so, yeah, in your book, Ethics for Robots, it's a wonderful little book, everyone can read it, and it really delves into a lot of fun things. Am I accurate in saying that you contrast three major approaches here: utilitarianism, libertarianism, and contractarianism?

0:11:33 DL: That's right. Yeah. I talk about some historically influential moral theories, utilitarianism, Kantian ethics, contractarianism. And you could also include some other ones, you could talk about virtue ethics, if you're interested in that or not, but like I said, I don't think that virtue ethics is going to be specific enough to make it into this club.

0:11:58 SC: "Be a virtuous self-driving car" is hard to actually implement in real life.

0:12:02 DL: It is. Unless you're doing something like, let's train the machine in a sort of bottom-up approach, as they say, to be like human beings around us. There's actually a few people like Wendell Wallach and Colin Allen, who have proposed that this is a good approach to designing moral machines. However, my objection to that is that you're probably going to incorporate all of the terrible biases and limitations of the humans that you're using to model your machine's behavior on. So these machines are probably going to wind up being terribly racist and sexist, and not thinking very clearly about consequences versus rewards and so on.

0:12:48 SC: And also, isn't it just a little bit circular? What it means to be moral is to be a moral person. I think I'm on your side here, if I'm right, that you would say we need to be much more explicit both for robots and for human beings about what it objectively means to be a moral person.

0:13:07 DL: Yes, I totally agree. So it sounds like we are both on board with virtue ethics not necessarily being a good approach here. Maybe we can move on to what you were talking about, which is these historically important moral theories. And in your book, The Big Picture, which I am also a fan of and recommend, you talk a lot about how these theories are constructed around our moral intuitions.

0:13:33 SC: Right.

0:13:33 DL: And there might be different internally consistent sets of rules that you could get from different kinds of moral intuitions. And I actually think that is completely correct. I think that is historically where these theories have emerged from. And the question is not so much which intuitions we should rely on, but what is the evolutionary function of the intuitions that we are using in the first place. So one answer to this question is that there are a lot of these different internally consistent sets of rules that are all more or less intuitive in some kinds of circumstances, but we have no way of evaluating one versus the other. If somebody wants to be a utilitarian and say it's wrong to, for instance, buy a cappuccino when you should be giving money to famine relief, and another person wants to be a Kantian and say, "Well, your intentions when buying the cappuccino were just to have this delicious coffee beverage, and therefore it's only a side effect that these people were harmed," how do we resolve the disagreement here? Now, you could just say, "Well, these are just equally important and equally coherent sets of rules." But I think a better approach is: let's look at the evolutionary function of the intuitions that they are drawing on, and ask, is there a sort of unified framework that matches that evolutionary function, that original goal of the system?

0:15:17 SC: Right. I think we agree on a lot of things, but maybe we disagree on the meta-ethical question of whether moral rules are objectively real or not, is that right?

0:15:28 DL: That's right, and a lot of this hinges on what we mean by real. So I think I mean...

0:15:33 SC: It usually does. Yes.

[chuckle]

0:15:34 DL: Yeah, exactly. When I say real, I mean real in the sense that "smoking is bad for your health" is a fact. It's a real fact about human beings, and it's dependent on certain goals that we all share. Now, this is something that I think you're inclined to agree with too: if we talk about morality as a set of what the philosopher Philippa Foot once called hypothetical imperatives, a set of "if-then" statements, if you want to be X, then you should do Y, if you wanna be healthy, then you should exercise and eat right and so on, then we can have objective answers to these questions within the domain of us all sharing those goals. However, I am in agreement with you.

0:16:16 SC: And I completely agree with that, right.

0:16:18 DL: Yes. And so outside of these goals that we all share, there's no way of talking about one state as better or worse than another one. And that's where I think you and Sam Harris, might disagree here, and I'm on your side of that debate. He has this thought experiment that he calls the worst possible suffering for everyone, where he says, Imagine everyone is in total misery all the time. Clearly that's objectively bad. But I think it only makes sense to say that that's objectively bad if we already are within the realm of caring about suffering and avoiding suffering.

0:16:55 SC: Yeah. To me, it's much like saying, "Well, aesthetic judgments are objective, because if we imagine the world's ugliest painting, everyone would agree that it's ugly, therefore it must be objective." I think it's okay to admit that we're contingent human beings made of atoms obeying the laws of physics, and this thing called our moral goals is very reliant on historical, contingent things; different people could have different goals. But happily, we agree on a lot, and we can build some sensible moral systems from what we agree on.

0:17:42 DL: Yes, I totally agree. Now the question is which of these internally consistent and apparently intuitive sets of rules we should select from. And we're going to have to make choices like that, because as you said, in constructing a self-driving car, we're going to have to decide which paths that result in collisions are better and worse than others. And it would be easy, it would be really wonderful, if we could just avoid all collisions; that would make my job completely unnecessary, and I am fine with that. If it turns out that we can avoid any kind of harm to anybody, then we don't need ethics at all. But every time the machine is evaluating one path and comparing it to another, it's going to have to decide which collisions are better and worse. Now, there's a lot of ways of doing it. It could, for instance, consider the driver and the passengers of its vehicle as more valuable than the other passengers, it could consider swerving to hit someone as better or worse than continuing straight, it could view hitting and injuring more people as better or worse than fewer people.

0:19:04 DL: Now, some of those might seem more obvious to you or less obvious to you, but the problem is how do we actually resolve this? And I think we need a moral theory, and for that, like I was saying, we need some framework for comparing these theories, and I think the only way of doing that is by asking: which moral theory actually fulfills this meta-ethical assumption that moral theories evolved for the purpose of promoting cooperative behavior among self-interested organisms? If you have a moral theory that actually is better at creating cooperative behavior than other kinds of theories, then that's the one we should use, I propose, in designing these vehicles.

0:19:45 SC: Yeah. Maybe it's good to clear up this issue before moving on to the specifics of the different moral theories. There is an objection out there to this kind of discussion that says, "Who cares? Cars are not really gonna be solving trolley problems; if they see something bad, they're just gonna hit the brakes and stop, and so this is all kind of irrelevant." And I suspect that's just an impoverished view of how moral reasoning works, and people get annoyed with philosophers for inventing trolley problems, because they say, "Well, I just don't wanna play this game." But it's a way of sharpening our intuition, right? And actually, it would be useful here if you laid out what a trolley problem is; it's conceivable that some of the audience doesn't know. And you also mentioned in the book that Harry Truman's decision to drop an atomic bomb on Japan was very much like a trolley problem.

0:20:37 DL: Yes, that's right. So these do happen, and unfortunately they happen quite often in certain professions, like in medicine and warfare and business, where you have to make choices about causing harm to one person or allowing harm to many other people. Now, this sounds very strange to most people, like me, because I think about myself as just going through my day. I'm buying cappuccinos. I'm grading papers. I'm watching Netflix. What am I doing that is causing harm to others? But in fact, if you're buying that cappuccino, you are weighing your own pleasure against the happiness and suffering of other people who you could be donating that money to. If you're watching Netflix, you could be doing more with your time. If you're eating meat, you're making judgments about whether animals are valuable or not. And the trolley problem is one of the scenarios that was constructed in order to demonstrate the differences between... Well, initially, between doing and allowing, and between killing many people versus one, or I should say sacrificing one person to save many.

0:21:49 SC: Right.

0:21:50 DL: In the scenario, you have a runaway train going down a track; it's gonna hit five people, and the only way to stop it is to either divert it onto a side track, where a single person is, or in another condition, to push a large man in front of the train and stop the train. Now, it turns out that in surveys most people say it's permissible to switch the track onto the single person, but not permissible to push the large man to his death, even though that seems strangely inconsistent. If you are only considering the consequences, then it is the same exact effect: it's killing one to save five. But if you think that actually performing some kind of physical intrusion into somebody else's personal space is important, or if you think about this in terms of a causal chain or your intentions or something like that, then maybe there is a difference here.

0:22:45 DL: So the trolley problem is one way of comparing what different moral theories might say. Utilitarians will say you always kill one to save five. A lot of deontologists, what they have in common, aside from just focusing on rights and duties, is that they often say that there's nothing wrong with just standing back and allowing things to happen. So this is called the difference between a positive and a negative obligation. The philosopher Robert Nozick made a big deal about this, and said, "Look, it's wrong of me to push you in front of a train; it's not wrong for me to allow you to be hit by a train." And so, according to most deontologists, it's wrong for me to push you in front of a train, and it might also be wrong for me to pull that switch causing the train to go onto you.

0:23:34 SC: But I think that it's a very helpful, illuminating thought experiment, precisely because how we viscerally think about what's morally right and wrong might not match with what we say we're thinking. That's what pushing the guy off of the footbridge really brings home. If you ask someone abstractly, given two choices, would you let five people die or one person die, they'll say, yeah, I would let the one person die. But then when you make it concrete, you have to actually kill this one person to save the other five, they're unwilling to do it. And that's kind of okay, I don't think it's irrational; it's just revealing that our moral intuitions are not always very coherent or matching up with our moral cognition.

0:24:18 DL: Yeah. And as someone who's taught ethics classes to undergrads for a long time, I'm very familiar with how inconsistent people's everyday moral intuitions are. If you press just a little bit on some of these trolley problems, like, "Well, it's okay to pull the switch, and in fact you should pull the switch to kill one to save five. But what if it was your mother or your father or your best friend on the side track?" Well, then people change their minds. And then you ask them, "Well, why do you think it's okay for you to value your parents over other people's parents? Surely these people are equally valuable. You're not saying that your family is actually better than other people's families." But it turns out that people's moral judgements actually are sensitive to this. There was a fantastic experiment done by some researchers, led by April Bleske-Rechek, and they found that varying, by condition, the genetic relatedness of the person on the side track changes whether people are willing to pull the switch, almost exactly as you would predict from the proportion of their genetic relatedness. So brother, cousin and so on.

0:25:28 SC: What is the famous joke? I would sacrifice my brother or sister to save two cousins or something like that?

0:25:35 DL: Yeah, exactly, I think, that was the quote from... Was it Herbert Spencer or something?

0:25:39 SC: Yeah, that's right.

0:25:41 DL: But most moral theories agree that genetic relatedness should not play a role. And in fact, if you press people and ask, "Well, are you saying that people who are genetically related to you are better than the people who are not genetically related to you?" they'll say, "Well, of course not, that's silly. I would never say that." But by acting in that way, they are revealing that that judgment is actually playing a role in their behavior.

0:26:06 SC: Yeah. And I think that, again, part of it is that a lot of people have a presumption, whether it's explicit or implicit, that the right kind of ultimate moral theory will be something utilitarian. But maybe it's not. Maybe it's perfectly okay, or at least I can imagine a perfectly coherent moral theory that very explicitly gives more credit to people who are closer to me when it comes to saving lives.

0:26:35 DL: Yeah, exactly. The utilitarian has to say some really, really weird things, but the other moral theories also sometimes have to say some really, really weird things. And this is what you have to get used to, is that no consistent set of rules is gonna give you everything that you want all the time. And my theory that I'm advocating also tells me some things that I really don't like and I think is really weird.

[chuckle]

0:27:02 SC: Why don't we give you a chance to explain what your theory is, which is not completely yours; you're building on quite a tradition here.

0:27:08 DL: That's right. I am advocating a moral theory that is drawn from a tradition called contractarianism, and the most recent version of this that I'm using is from an American philosopher named John Rawls. In his book, A Theory of Justice, from 1971, he proposed that the best way of designing a fair society is to imagine that we were in this original position, where I don't know who I'm going to be; I could be anyone. And in this kind of idealized bargaining position, he thought we would all come to agree on certain basic distributions of what he called primary goods. The distinction between primary goods and secondary goods is that when you don't know who you're going to be, you don't know if you're going to be male or female, tall or short, handicapped or perfectly abled, and so on. And if that's the case, you don't know what kinds of particular things you're gonna value. Are you gonna like television, are you gonna like coffee? Maybe, maybe not. These are all secondary goods. Primary goods are the kinds of things that all human beings value no matter what; that all human beings have to value in order to pursue any kind of goal at all.

0:28:33 DL: And this list includes things like your life, your health, your opportunity and essential resources for survival. So no matter what you wanna do with your life, if you wanna be a juggler, if you wanna be a lawyer, if you wanna be a physicist, you need to have essential resources, opportunities and health, right?

0:28:55 SC: Yep.

0:28:56 DL: Going back to our previous discussion, these are the kinds of things that we have as, you could call it, common ground, that all human beings, just by virtue of being human beings, as a matter of fact care about. And if someone says, "Well, I don't value these things," I could say, "Yes you do. You're a human being and you pursue goals, and so you care about your health and safety and opportunity."

0:29:20 SC: I actually took a class with John Rawls in graduate school.

0:29:23 DL: Really?

0:29:24 SC: Yes. It was very funny because... Well, I audited the class, but I went to those sections and everything. And I remember one day walking with a friend of mine across campus, and the people who had actually taken the class for a grade, were coming out of the final exam, and I just ran into them and so I said, "Hey, you know how did the exam go?" And they were like, "It's very fair."

[chuckle]

0:29:47 SC: Of course my friend was like, "Those must not be your physics friends, those must be your philosophy friends," because no physics people have ever come out of an exam saying that it was very fair. But Rawls's whole thing was justice as fairness, and trying to make things as fair to everyone as possible.

0:30:02 DL: That's right. And Rawls is primarily known as a political philosopher, because the kinds of things he was talking about designing from this original position were mainly policies and the structure of our government and social institutions. However, towards the end of the book, he talks a little bit about using this as a framework for individual decision making. And that's what I want to be doing. I wanna say that we can also use this as a way of thinking about what kinds of actions are wrong, and what kinds of actions are permissible. From the original position, Rawls said we would all agree on a certain distribution principle that he called the Maximin Principle. And the Maximin Principle has a history in game theory. In this context, it just means we would agree on a distribution which makes the poorest person as best off as possible.

0:30:54 SC: Yeah, actually, I do wanna get into this, but I realize now, while you're saying this, there's a prior thing I wanna just touch on very quickly. You mentioned the fact that Rawls himself, I think, would have cast his theory as a political one, right? A way to organize our society, and in fact only a well-ordered society; he admits that there would be cases of extreme distress where you'd have to violate his principles. But the idea as I understood it was that we could disagree on basic moral conceptions, but if we agree to live together in a liberal democratic polity, then he had these rules for how to reconcile our different moral conceptions. And so you're going a little bit farther, because you wanna actually use this as a theory of morality as well, right?

0:31:42 DL: That's right. Well, I wanna say that there are many kinds of values that are equally comparable to each other, however, those all exist within a kind of space that is constrained by essentially this moral decision procedure. So there are lots of equally good distributions according to the Maximin Principle, and within that space of equally good distributions of primary goods, then we could go ahead and impose many different kinds of values and have interesting disagreements about which sets of values are the ones that are best. But importantly, that's all occurring within the space of a sort of maximin constraint.

0:32:28 SC: But it is a little bit... It's asking a bit more, right? Because the original position is something where we forget some things about ourselves, like you said, and we remember other things about ourselves, the difference between primary versus secondary goods. And this seems potentially more problematic if we want to get out of it moral rules rather than just a political system. If someone in real life is very religious and has some religious convictions that strongly flavor their ideas of right and wrong, are those convictions things they will have to forget in the original position, and wouldn't that...

0:33:04 DL: Yes.

0:33:05 SC: Wouldn't that lead them into something that's a moral theory coming out of the original position, that is very different than the one they actually have?

0:33:12 DL: Probably, yes. But I don't think we need John Rawls to convince us that religion is irrelevant to ethics. I think just some basic assumptions about what we mean by making moral choices can do that. And this goes back to the dialogue Euthyphro by Plato, where he plausibly demonstrated that even if God were to say that slavery and child abuse and rape are morally good, that doesn't make them good. And so usually when I'm talking to somebody who thinks that morality is based on a certain set of religious beliefs, it takes about 90 seconds of talking to them to get them to finally admit, "Well, yeah, I guess you're right, that it doesn't really matter what the religious text says; what matters is something else."

0:34:02 SC: I think that I'm on board with the... I can never pronounce it Euthyphro?

0:34:07 DL: Euthyphro. Yeah.

0:34:08 SC: Euthyphro dilemma, and why there should be some criteria for morality other than what God says. But nevertheless I think that I could imagine that the actual moral beliefs that a religious person has are different, not because God gave them the moral beliefs, but because their religious beliefs affect how they think about the world, how they think about the ontology of reality. If you believe that human life begins at conception, you might have different views on abortion than if you believe it's just a bunch of cells obeying the laws of biology.

0:34:41 DL: Yeah, that's interesting. I think the abortion case is difficult, and in fact, people often, when they're talking about ethics, jump right to abortion, which is one of the most complicated moral topics. And I usually like to point out, "Well, you're basically starting off at the introduction of the book, and just skipping right to the most complicated problem at the very end."

[chuckle]

0:35:04 DL: Right? There's a lot of stuff that goes on in between. And most problems that we face are actually, I think, ones that plausibly have good moral answers to them, like, as I mentioned, slavery, child abuse, rape. And then of course we get to more difficult cases like charity and eating animals and driving a fossil fuel vehicle, which are things that most people think may be morally permissible and which, I think, they are probably mistaken about. And then we get to very, very difficult cases like abortion, and even if there is not an answer to that, and I think there probably is an answer to it, that doesn't invalidate everything that went up to that point.

0:36:00 SC: Okay. But I'm just trying to get on the table the idea that, when we get back to the self-driving cars killing people, solving their little trolley problems as they're going down the street, I could at least imagine that people have deep-seated moral convictions that wouldn't qualify in Rawls's conception as primary goods. And they might object to having those convictions stripped away from them as we put them in the original position.

0:36:28 DL: I think that's correct. But I also think it's correct that any group of people who are interacting with each other are going to have certain beliefs that are not respected in the process of interacting. And so, if we are having a civil society together, just to take the political case, then inevitably some of your beliefs are probably going to come into conflict with other people's belief and with the institution. So if I have a religious objection to say, oh, I don't know, respecting other people of different races, then you're going to say as a government, no you actually have to treat everybody equally, and it doesn't matter what your religious beliefs are.

0:37:11 SC: Right. Yeah, no. The reason I harp on this is that I've become very, very interested in this potential conflict between fundamental moral positions that individuals might have and the goal that we presumably share of living in a liberal democratic society. And I think that we tend to paper over these differences a little bit, but we should respect that they could be true conflicts, and we eventually have to say something like what you just said, which is, "Yeah, suck it up. Some people are gonna have to make compromises if we're gonna live together like this."

0:37:46 DL: Yeah, I totally agree. And instead of phrasing it that way... Well, I would also say suck it up, but that's sort of the Pittsburgh in me.

[chuckle]

0:37:54 DL: I think that a more congenial way of phrasing that is, that you might have very different religious values from me. But there are a set of values that we all share, and what we need to base a moral theory on is the sort of universal grounds that we all have in common. The kinds of things that enable all human beings to pursue their goals.

0:38:18 SC: You're right, that's a much more public relations friendly way of putting the points.

[chuckle]

0:38:22 SC: So I think that's a wise way of doing it. Okay. I know you wanna get to the Maximin Principle, so do I. But maybe even before we do that: I really liked, in your book, the casting of things in terms of game theory and prisoner's dilemmas and Nash equilibria and Pareto optimality, and all these other buzzwords. And I think this is why we have an hour-long podcast, so we can actually explain a little bit how you're thinking, 'cause it's a very helpful conceptual tool. So why is game theory a useful tool to have in mind when we think about these issues?

0:38:58 DL: Yeah. I am really excited to talk about this, because I think this is the way forward in resolving these kinds of tensions between different consistent moral theories. This kind of tool has only been available to philosophers for the last 50 or 60 years or so. A lot of times when I tell people that I work on designing moral frameworks for machines, they'll say, "Well, haven't philosophers been doing this for thousands of years, and they haven't gotten anywhere?" Now, my first response is, "Well, just because a problem has been around for a long time doesn't mean it can't be solved." And the second response is, "Actually, I think the tools for solving these kinds of problems have really only emerged in the last 50 or 60 years or so." And when we talk about evaluating theories based on which one promotes cooperative behavior, there's been a lot of talk in the history of philosophy about promoting cooperation. The British philosopher Thomas Hobbes talks a lot about how, if we didn't have any kinds of rules, we would need to invent them in order to cooperate. You and I have been talking about living in a civil society and cooperating together. But exactly what does this mean when we talk about cooperation? Well, there is a very, very technical way of describing this, and you can describe it as a kind of improvement from simple self-interested behavior.

0:40:24 DL: Self-interest is actually a very powerful tool, and in the 1950s, John Nash described a certain method of showing how self-interested agents would come to certain equilibria in interactions and games with each other. And of course, games doesn't just mean poker and blackjack; it can mean any situation where two or more people are interacting and there are gains and losses for those people.

0:40:53 SC: But poker was a big influence, right? That was a big inspiration. [chuckle]

0:40:55 DL: That's right. Well, it also includes games too.

[chuckle]

0:40:57 DL: Yeah. So what we're talking about here are cases where... The prisoner's dilemma is usually phrased in terms of a sort of cops-and-robbers drama, where let's say that I arrest you for drug dealing, and I know you're dealing drugs, but I don't have enough evidence to convict you. Now, I make you and your partner a deal, in separate rooms. I say, "Look, if you will confess to the crime, I'll let you off free, but I'm gonna put your partner away for good." And you know that if you both stay quiet, you actually get, let's say, a very low sentence, and if both of you confess, you both get a medium sentence. And it turns out in this kind of scenario, a lot of people might think it's intuitive that you should stay quiet. But according to Nash, you should both confess; you should both squeal on each other.

0:41:57 SC: Right.

0:41:58 DL: Now, what's weird about this is you've got a conflict, where it turns out there is an improvement: if both of you confess to the crime, then you both get a medium sentence, but if both of you were to stay quiet, you would have both gotten a low sentence. Now that's what's called a Pareto improvement, because it is an improvement for everyone; it doesn't make anyone worse off. And these are the kinds of improvements that economists think are the bare minimum for rationality. Like if I finish my lunch, and you're hungry and you're sitting next to me, and I've still got some leftovers, it seems obvious that I should just give some to you. It doesn't make me any worse off and it makes you better off; it's a Pareto improvement. The term comes from the Italian economist Vilfredo Pareto.

0:42:47 DL: And so, I'm defining cooperation problems as ones where self-interest, which I'm measuring as a Nash equilibrium, leads to outcomes from which there exist Pareto improvements. In fact, there are lots of different situations that have different kinds of Nash equilibria; you could have multiple Nash equilibria. But there exist Pareto improvements from those Nash equilibria.
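[To make that definition concrete, here is a minimal Python sketch, not from the episode or the book, of the prisoner's dilemma payoffs Derek describes. The numbers are invented and only their ordering matters; the code just checks which strategy profiles are Nash equilibria and which admit Pareto improvements.]

```python
# Prisoner's dilemma payoffs: higher is better (think "years of freedom kept").
PAYOFFS = {
    # (my move, partner's move): (my payoff, partner's payoff)
    ("quiet", "quiet"): (3, 3),      # both stay quiet: low sentences
    ("quiet", "confess"): (0, 5),    # I'm put away for good, partner goes free
    ("confess", "quiet"): (5, 0),
    ("confess", "confess"): (1, 1),  # both confess: medium sentences
}
MOVES = ("quiet", "confess")

def is_nash(profile):
    """True if no player can do better by unilaterally switching their own move."""
    for player in (0, 1):
        for alternative in MOVES:
            deviation = list(profile)
            deviation[player] = alternative
            if PAYOFFS[tuple(deviation)][player] > PAYOFFS[profile][player]:
                return False
    return True

def pareto_improvements(profile):
    """Profiles that leave no one worse off and make someone strictly better off."""
    return [
        other for other in PAYOFFS
        if all(PAYOFFS[other][p] >= PAYOFFS[profile][p] for p in (0, 1))
        and any(PAYOFFS[other][p] > PAYOFFS[profile][p] for p in (0, 1))
    ]

for profile in PAYOFFS:
    print(profile, is_nash(profile), pareto_improvements(profile))
# Only ("confess", "confess") is a Nash equilibrium, yet ("quiet", "quiet")
# is a Pareto improvement on it: the signature of a cooperation problem
# in the sense just defined.
```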

0:43:11 SC: If I understand it correctly, the Nash equilibrium is one where no single person can unilaterally change their strategy to get a better outcome for themselves, but a Pareto improvement would be where, if we all change at once, we will all be better off. Is that right?

0:43:29 DL: That's exactly right, and that's the challenge, is that Pareto improvements from these Nash equilibria, are not things that self-interested rational agents can do, they're not capable of it. And so...

0:43:40 SC: So it's a cooperation problem.

0:43:42 DL: Exactly, there's a problem. And this is the challenge that Thomas Hobbes in the 1600s was describing: people acting in their own self-interest are going to be led to these outcomes that are actually not optimal for everyone. And so we need this sort of third party, as he called it the Leviathan, or maybe a set of rules or a government, to come in and force us to act in a way that's for everybody's mutual benefit.

0:44:10 SC: Right. As you just alluded to, being Pareto optimal is kind of something that nobody could disagree with, right? It's not necessarily the final answer to the right thing to do, but if there is something where, if everyone acted in a certain way, literally everybody would be better off or at least the same, then how could anyone object to that, right?

0:44:32 DL: Exactly. And it's the kind of thing that you would expect in the evolutionary history of our species and other species as well, would motivate certain adaptive traits to emerge, would actually lead certain traits to emerge to force us to cooperate in places where we wouldn't have cooperated before.

0:44:54 SC: And you go on to propose the repugnant prisoner's dilemma, [chuckle] as an illustration of how straightforward utilitarianism can lead us wrong, and it's kind of a version of the utility monster thought experiment that I guess, was it Nozick, who proposed that?

0:45:10 DL: Yeah.

0:45:11 SC: And where if one person can become way better off, and everyone else suffers just a little bit, straightforward utilitarianism would say, "Yeah sure, let everyone suffer just a little bit, 'cause this one person would be so much better off." And you would argue that that's probably not the moral strategy we wanna pursue.

0:45:28 DL: That's right. The great thing about defining co-operation in this very formal sense is that we could actually go on and test which of our moral theories produce more and less cooperative solutions. I think that in most cases, like the prisoner's dilemma, the regular prisoner's dilemma, it turns out that utilitarianism, contractarianism, natural rights theories, Kantian ethics, they all produce the correct result, they all produce mutual cooperation, which is great. And I think our moral intuitions, this sort of mixed bag of cognitive mechanisms that have over time evolved to make these kinds of choices, that they also, in most situations do a great job of promoting cooperative behavior.

0:46:19 SC: Except maybe libertarianism does not get the same answer for the iterated prisoner's dilemma?

0:46:25 DL: Maybe, yeah. It depends on how you define causing the harm. This is something that Nozick talks a lot about in Anarchy, State, and Utopia, which was his rebuttal in the '70s to Rawls. He said that if you're talking about causing harm as sort of doing something where, if you had done otherwise, she would have been worse off, then in this case, maybe the prisoner's dilemma is not an instance of causing harm to the other person if you confess or if you stay quiet. But it's difficult to say in that scenario really what counts as causing harm.

0:47:09 SC: But nevertheless, I think I take your point that for the conventional prisoner's dilemma we have a Nash equilibrium where everyone defects, but most sensible people would say that both players in the game are better off if they cooperate, and so we can have that as a starting point of agreement, and work from there.

0:47:25 DL: Yeah. And it's even more than most sensible people. It's over time. If you are a utilitarian, or a contractarian playing prisoner's dilemma, over and over and over again, you will get better outcomes, if you're measuring this in money you'll make more money over time. If you're measuring this in children, you'll have more children over time.

0:47:46 SC: Right. Okay. Now I'm gonna finally let you tell us what a good contractarian believes. You mentioned the maximin principle, but how is it different? How would a good contractarian approach something like a prisoner's dilemma or other sorts of games, differently than a straightforward utilitarian would?

0:48:02 DL: Right. The maximin principle says that we should prefer the distribution that makes the worst off person as best off as possible. And usually what that means is you have a set of outcomes, you attach values to each of those outcomes, number values, and in each of the outcomes you pick the worst off; then you put those into a group and select the highest of the lows, and that's the one you pick. Now, in terms of distributions of money, it's fairly straightforward how you count and quantify those distributions, but in terms of other kinds of goods, it might be a little more complicated. However, utilitarians have been spending years, decades, centuries trying to convince us that pleasures and pains can be quantified and counted.

0:48:55 SC: And added up.

0:48:58 DL: And added up. That's right. And so the utilitarian wants to run essentially a summation function over all of this, and just pick the highest of the sums. Now, those usually produce very similar answers. So in the prisoner's dilemma, it turns out that adding up all the outcomes and running maximin both say that we should cooperate with each other.

0:49:21 SC: Right.

0:49:22 DL: But in other scenarios, you could arrange it... And you mentioned the repugnant prisoner's dilemma that I set up: you could arrange scenarios where the outcome with the highest sum of everyone's payoffs is not what maximin would say, nor is it a Pareto improvement over just self-interested behavior. And for that reason, I think that the maximin principle is actually the better principle than the utilitarian one.
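[A toy Python comparison of the two selection rules being contrasted here, using invented primary-goods scores for three people under three possible outcomes; the numbers are illustrative only, not from the book.]

```python
# Each outcome lists one score per person; higher is better.
outcomes = {
    "A": [10, 10, 10],  # everyone does moderately well
    "B": [100, 2, 2],   # one person gains enormously, the others fare badly
    "C": [12, 9, 8],
}

utilitarian_pick = max(outcomes, key=lambda o: sum(outcomes[o]))  # highest total
maximin_pick = max(outcomes, key=lambda o: min(outcomes[o]))      # best worst-off

print(utilitarian_pick)  # "B": its total (104) beats the others
print(maximin_pick)      # "A": its worst-off person (score 10) is best off
```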

0:49:52 SC: And it's precisely because of this possibility that there could be great gains for one person, but other people have to suffer because of it.

0:50:00 DL: Exactly. And that's almost a prediction of the theory, not necessarily a motivation for it. The real motivation for it is that it produces cooperative behavior in all scenarios. The prediction is that it will make the worst off as well off as possible. Usually this makes a lot of sense intuitively. In a, like you said, a liberal democracy, very often progressives are wanting to benefit the poor before we benefit the rich, that they should have priority. So this is often called a prioritarian principle. However, there are some other situations where it wouldn't be prioritarian. I think where the rubber meets the road here is when you start actually attaching values to outcomes, and you have to do that by saying; here are the primary goods. What I was saying earlier, are the kinds of goods that all human beings from the original position would care about, our health, our safety, our opportunity, and you try to quantify them and then calculate the effects of your action on those goods.

0:51:11 SC: Yeah.

0:51:11 DL: So if I say, look, if I'm going to punch you in the face, I might get some amount of pleasure from that if I'm some sadistic weirdo. I don't want to do that, but if I did, it would be terrible in terms of your health and opportunity, and that loss to you in primary goods is not equivalent to, or made up for by, the gain to me in secondary goods, namely my pleasure or something like that. And so when we talk about applying this to self-driving cars, what we need to do is have a way of quantifying the effects of every collision on the health and safety of the passengers, of pedestrians, of people in other cars, and then what a contractarian would do is run a maximin function over all of that, and say, "Here are three different collisions. What are the worst health outcomes in each of these collisions?" And I'm going to pick the best of the worst-case scenarios.
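[A hedged Python sketch of that selection procedure: for each candidate path, estimate the harm to everyone affected, look at the worst harm on that path, and pick the path whose worst case is least bad. The path names and harm scores are invented placeholders, not output of any real collision database.]

```python
# Estimated harm per affected party on each candidate path (0 = unharmed, 1 = fatal).
candidate_paths = {
    "brake_straight": {"passenger": 0.2, "pedestrian": 0.9},
    "swerve_left":    {"passenger": 0.4, "cyclist": 0.5},
    "swerve_right":   {"passenger": 0.3},  # hits a tree; only the passenger is affected
}

def worst_harm(harms):
    """Harm suffered by the worst-off party on this path."""
    return max(harms.values())

# Maximin over harms: choose the path whose worst-off party fares best.
best_path = min(candidate_paths, key=lambda p: worst_harm(candidate_paths[p]))
print(best_path)  # "swerve_right": its worst harm is 0.3, versus 0.5 and 0.9
```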

0:52:11 SC: So that sounds like something sensible, but just before we dig into that, I think it's safe to say that in the political sphere, where Rawls was originally operating, his reasoning does seem to lead us to quite a redistributive way of running society: the very worst off people have to be improved by any inequality that we allow. So it's a very different world than where we actually live, where in modern capitalism in the United States, there are plenty of people suffering a lot, with the idea that there are other people who are doing really well because of economic growth, and that's a trade-off we're willing to make.

0:52:53 DL: That's right. Now, if you're utilitarian, you care very much about the suffering of these other people, so you use the word suffering. But if I'm a contractarian, I don't care about their suffering. I care about the distribution of primary goods, namely their health, their safety, their essential resources. And so what I care about is making sure that the worst off people in the population are brought up to a minimal level of let's just call it normal functioning. Now, there's a lot of discussion about this in bio-ethics, about what is normal functioning, how do we quantify normal functioning? But that's essentially what a contractarian is trying to do, to bring everybody up to a minimal threshold of opportunity and safety, but not happiness. In fact, contractarians don't care about happiness. Happiness is not the good that we are calculating.

0:53:46 SC: Well, but for Rawls, certainly wealth that an individual has would be among the goods that we do calculate, right? When we talk about gains.

0:53:54 DL: Sure.

0:53:55 SC: The difference principle says that inequality should only be allowed to the extent that everyone is better off, and wealth is among the things that makes us better off.

0:54:03 DL: Sure. But only to the extent that wealth is able to get you the essential resources you need to pursue goals. If you are a masochist and you enjoy suffering, that's fine, as long as you have enough essential resources to continue being a masochist, then that's all a contractarian cares about.

0:54:22 SC: Sure. That's right. I'm just trying to... Because when we get to the self-driving cars, there will be competing conceptions of what the car should be doing. So I just want people to know there are analogous competing conceptions in the political arena. Rawls, at least at face value, would be much more democratic socialist, whereas a libertarian would be much more capitalist, in terms of how the economy should run itself. And these are both plausible theories that we can argue about.

0:54:52 DL: That's right. Yes.

0:54:54 SC: Good. When we come to the cars, you're going to try to implement some kind of maximin algorithm in the mind of a self-driving car.

0:55:05 DL: That's right. So I think there needs to be a database of collisions, and the effects of these collisions on most people of comparable let's say size and position, right? Now this is something that you might think is really, really complicated, and even maybe a little bit silly. But I think the alternative is even sillier. Right now, a lot of the major car companies have the official position of just saying, "Well, we think all collisions are bad and we want to avoid them all equally," but I think that's an incredibly ridiculous position to take because not all collisions are equal. Obviously, getting hit by a vehicle moving at two miles an hour is better than getting hit by a vehicle moving at 20 miles an hour. And I want vehicles that are evaluating different paths to say that one collision is better than another.

0:56:00 SC: Yeah, no. I don't even quite understand the resistance to this way of thinking. If someone asks, "What is the best economic system?" and someone else says, "Well, it's the system where everybody is wealthy," that would not be very convincing to anyone. You're like, "Well, that's not the world we live in; we have to make some hard choices here." We should at least anticipate the reality that cars are gonna be making some hard choices.

0:56:25 DL: I think that's the reality, that is slowly coming, but I think it's sort of a public relations nightmare, for an industry that is already working hard to just convince people that these things are safe at all, much less to convince them that they should be evaluating which kinds of collisions are better and worse.

0:56:44 SC: And there's a point, I think you made it in the book, that hadn't quite sunk into my brain before reading it, which is that neither you, the human driver, nor the car, the artificial intelligence, can say with perfect certainty what the outcome of a decision is going to be. Therefore, even if it's rare that someone actually gets run over, a car will constantly be making decisions between higher-risk and lower-risk actions, and that is really quite down to earth, and it's gonna be common, right?

0:57:17 DL: That's right. I've talked to a few people who are designing autonomous systems mostly in academics, not in the industry, industry doesn't wanna talk at all about this kind of stuff.

[chuckle]

0:57:27 DL: And I can understand why. But a lot of the people in academia who are working on this technology... I talked to, for instance, Benjamin Kuipers, who, along with his former postdoc Jong Jin Park, built this wonderful autonomous robot that moves around the halls of the University of Michigan and detects obstacles and slows down and tries to avoid them. And Park used this system called model predictive control, where it essentially casts out a net of many, many, many possible paths, many per second, and then it prunes those paths based on the likelihood that each of them is going to result in a collision. Now, likelihood is a really great way to evaluate paths: I want to take the paths that are least likely to result in collisions. But once again, I think we need more than just likelihood. I think we also need to say that a likely collision with a pedestrian is worse than a likely collision with a tree.
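[A minimal Python sketch of the point about going beyond likelihood: score each sampled path by both how probable a collision is and how bad that collision would be. This is an illustrative combination only, not a description of Park and Kuipers' actual controller; the severity numbers and the sampling are invented.]

```python
import random

# Invented severity scores for the worst-affected party in each collision type.
SEVERITY = {"pedestrian": 1.0, "cyclist": 0.8, "tree": 0.1, None: 0.0}

def sample_paths(n=100):
    """Stand-in for the controller's sampled trajectories: each path carries an
    estimated collision probability and the obstacle it might hit."""
    obstacles = ["pedestrian", "cyclist", "tree", None]
    return [
        {"id": i,
         "p_collision": random.random(),
         "obstacle": random.choice(obstacles)}
        for i in range(n)
    ]

def risk(path):
    """Expected severity: how likely a collision is times how bad it would be."""
    return path["p_collision"] * SEVERITY[path["obstacle"]]

paths = sample_paths()
best = min(paths, key=risk)  # prune down to the least risky path
print(best["id"], round(risk(best), 3))
```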

0:58:25 SC: Right. Yeah, exactly. How simple and straightforward does this suggested algorithm become when it comes to things like trolley problems, or things like babies versus grown-ups, or anything like that? I mean, there still seems to be a lot of wishy-washiness there.

0:58:43 DL: Yeah, so it will tell us what kinds of information are relevant in making this database in the first place, which I think is really important. There was a recent experiment conducted by the MIT Media Lab that you're probably familiar with; it was just published in Nature a couple of weeks ago, and it was called the Moral Machine experiment.

0:59:02 SC: Right. Yes.

0:59:03 DL: And so what they did is they asked people to make choices about self-driving car trolley problems, where they alternated things like the gender, the age, the social status of all the people involved. So, would you rather run over two doctors and a homeless person to save one obese man and a dog, or something like that. Now the contractarian, as well as most moral theories, is going to say that all of that information is irrelevant, or most of it is irrelevant. So whether a person is a doctor or a lawyer, whether a person is a Muslim or a Christian or an atheist, all of that is irrelevant. But what is relevant are things like your physical position, your physical size, and maybe your age, because that information actually tells us about the effects of this collision with you. And so this is important in figuring out what kinds of databases are gonna be discriminatory against people and what kinds are not.
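
A hedged sketch of what that filtering might look like in practice: only attributes that bear on the physical outcome of a collision are allowed into the harm model, while the morally irrelevant attributes the Moral Machine experiment varied are excluded. The field names here are invented.

```python
# Keep only attributes that inform predicted physical harm; drop the rest.
PHYSICALLY_RELEVANT = {"position", "body_mass_kg", "age"}   # bear on injury outcomes

def collision_features(person: dict) -> dict:
    return {k: v for k, v in person.items() if k in PHYSICALLY_RELEVANT}

print(collision_features({
    "position": "crosswalk", "body_mass_kg": 80, "age": 35,
    "occupation": "doctor", "gender": "female", "religion": "atheist",
}))
# -> {'position': 'crosswalk', 'body_mass_kg': 80, 'age': 35}
```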

1:00:05 SC: Well, you brought up this very interesting question. Do you discriminate against people on motorcycles, who are wearing helmets versus those who are not? Because presumably a collision with someone wearing a helmet will hurt them less than someone not wearing a helmet. So we actually punish them in some sense.

1:00:23 DL: Yeah. And that's one of the more counter-intuitive predictions of my theory. My theory says a lot of things that I find pretty intuitive, but a few things that I find counter-intuitive. And unfortunately, if my theory is correct, I just have to say, "Well, so much the worse for my intuitions here." Part of the problem might be the use of the word punish. That's a little bit misleading: if the car is going to evaluate a collision with a bicyclist without a helmet as worse than a collision with a bicyclist with a helmet, that doesn't mean that it hates the one with the helmet, or that it thinks that the one with the helmet deserves to die more than the one without. It's only saying this path is less dangerous than that path. And the reason why I agree with you that it seems really weird is that you think, well, the person with the helmet was being safe, she's the one who left the house that day taking precautions, so why should the car target her or punish her more than the other one? And the answer is, I think we need to stop using words like target or punish, and just say that the path that leads to you was evaluated as better than the path that leads to her.

1:01:47 SC: Okay. Good. And so I think, yeah, there's two big looming questions I'm not quite clear on here yet, but I think we can clear them up. One is, you seem to be saying that the contractarian just treats every human being equally, roughly speaking; maybe there's some health differences, like maybe a strong person wouldn't be as bad to get in an accident with as a weak person, because they're more likely to survive it. But this is contrary to how many people's intuition goes. One of the aspects of the MIT study, if I remember correctly, was that different people from different parts of the world gave different answers for injuring women versus men, young people versus old people, etcetera. But you're saying that you're advocating being ignorant of all of that.

1:02:33 DL: That's right. So if people are preferring to collide with men over women, my response would be that's sexism, and that's not something we wanna incorporate into our machine.

1:02:44 SC: Right. Good. Some people are not gonna agree with this, you're gonna have to try to convince them, but that's okay.

1:02:50 DL: Well, yeah. I have to keep stepping back until we find some grounds that we can agree on. I'll step back one step and say, "Okay, well, what moral theory are you using?" And in just about any moral theory, you're not going to value men more than women or vice-versa. Not utilitarianism, not Kantian ethics, nothing. And if they say they still do, well, I take a further step back and say, "Okay, well, how should we even make decisions in the world? Should we just base them off of the things that we all have in common?" And if we're agreeing on that, then I'm going to say contractarianism is the best way of cooperating based on the values that we all share.

1:03:32 SC: Okay, what about babies versus grown ups?

1:03:35 DL: Babies versus grown-ups is difficult because a grown-up, when we're just talking about collisions, is more likely to survive a crash than a baby. And so in that case the baby should be preferred, but not because we love babies more or they're more adorable, but because they are more vulnerable.

1:03:54 SC: Okay. That makes sense. Good. And then the other looming issue was, let's be explicit about how we come down on the various trolley problem kind of scenarios here. It sounds like contractarianism doesn't really care if one person versus five gets injured because of an active choice versus a passive one, right? It is a consequentialist point of view at the end of the day.

1:04:18 DL: Yeah, that's right, that's right. I do think we need to evaluate outcomes based on the distributions they produce, and that is a kind of consequentialism. And so in that way I think utilitarianism and contractarianism are sort of cousins in this respect, but I think the biggest difference is their way of quantifying the goods: do they quantify happiness and suffering or primary goods, and do they run a summation function or a maximin function? And so in, say, the trolley problem, in most cases they're going to agree. But in some cases they're gonna disagree, and some of those cases I find really weird.

1:04:58 DL: Here's one of those cases. This is something that a friend of mine, the philosopher Susan Anderson, pointed out: according to maximin, it would be better for the car to swerve into a crowd of 50 people and give them all a broken leg, rather than to swerve into a brick wall and give the passenger in the vehicle two broken legs. Why is that? Because 50 single broken legs is better than one instance of two broken legs. And I find that so strange, I find that crazy, but once again, I just have to say, just like Jeremy Bentham did in the 1700s, "Well, my theory says it, so I have to accept it." Jeremy Bentham was talking about homosexuality, and he said, "According to my theory, I guess it's all right," even though according to him it was really weird and gross; he had to accept it.
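
The disagreement in Anderson's example is easy to make concrete. In the sketch below, harm is counted in arbitrary "broken leg" units; a summation rule and a maximin rule rank the two swerves in opposite ways.

```python
# Toy comparison of the two aggregation rules on Susan Anderson's case.
def total_harm(harms):
    return sum(harms)          # utilitarian-style summation

def worst_off(harms):
    return max(harms)          # harm suffered by the worst-off person (what maximin looks at)

swerve_into_crowd = [1] * 50   # 50 people, one broken leg each
hit_the_wall = [2]             # one passenger, two broken legs

print(total_harm(swerve_into_crowd), total_harm(hit_the_wall))   # 50 vs 2 -> summation prefers the wall
print(worst_off(swerve_into_crowd), worst_off(hit_the_wall))     # 1 vs 2 -> maximin prefers the crowd
```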

1:05:57 SC: Well, I agree that those are the consequences. But sometimes when our moral theories give us highly counter-intuitive or weird-sounding suggestions, we need to say, "Well, maybe I have the wrong moral theory," right?

1:06:15 DL: Well, that's actually something that Rawls thought. So Rawls agreed that we need to go through this process called reflective equilibrium, where we sort of tune our own intuitions to the theories that we are developing. However, that's where I think I would diverge from Rawls. I would say that, look, if there's a matter of fact about which actions create more cooperative behavior than others, then it's just like when the doctor tells me to stop smoking and I really, really want to keep smoking: I have to say, "Well, look, it's a matter of fact which actions are right and wrong, or which actions are healthy and unhealthy." But I could still say I don't want to do that, or even I'm not going to do that. However, there's still a fact of the matter about what the right thing to do is.

1:07:03 SC: Right. But we do need to make some choice about whether or not we've discovered that fact of the matter through our moral theorizing, or whether we should be a little bit less confident. The utilitarian chooses a function over all the utilities, which is to say, add it all up and maximize it. And you, in some sense, have chosen a different function over utility, which is to say, "Look at the utility of the worst-off person, and maximize that." Right? And maybe there's some happy medium, so that the 50 people don't get their legs broken.

1:07:37 DL: I don't see how that would work, although I'm open to thinking about it. So there are a lot of people who want to have hybrid versions between these two, but once again, the problem is, if you wanna mix the two theories together I think you need to have a third theory that tells you, when do you take the utilitarian choice, when do you take the contractarian choice. And I just don't know what that third theory would look like.

1:08:00 SC: Well, you say that now, when the 50 people are suing you because they all have their legs broken, you might feel differently.

[chuckle]

1:08:05 DL: Oh no, don't say that. Yeah, actually, someone at a conference recently joked to me that if I'm wildly successful beyond my dreams and this actually was used in self-driving cars, I could be responsible for millions of injuries and deaths, and I laughed, because I know that that's not going to happen, but there was a part of me that was a little bit afraid. I mean, I know that the car companies need to work out some kind of solution to this, and the problem is they're not talking about what they're doing.

1:08:40 SC: Yeah. You could equally well say that if you succeed beyond your wildest dreams, you'll be responsible for preventing an enormous amount of death and suffering in the world, right?

1:08:48 DL: Sure, sure. Let me ask you, if you don't mind: I assume you're sort of taking the more utilitarian approach here. I got from what you were saying that generally you take a sort of utilitarian approach, although a constructivist one in your Big Picture book, but still more or less a good old-fashioned utilitarian approach to most kinds of decisions like this.

1:09:11 SC: Actually, no, I'm just trying to give you a hard time, because I'm just trying to figure out what the right thing to do is. I don't really have a strong substantive moral theory myself. I don't believe in utilitarianism, because I totally buy the utility monster kind of responses, or the repugnant conclusion. Derek Parfit had this very similar argument that it would always be better just to have more children, just have more and more kids, because there can be more and more people having happiness. And I'm extraordinarily skeptical of the idea that we can, number one, calculate individual utility for people, maybe that's possible; but then, number two, add them up on some commensurable scale. That seems like the wrong thing to do, to me. So I'm almost to the point where I'm willing to accept some kind of deontology rather than some kind of consequentialist way of thinking, but I'm not quite sure what that would be.

1:10:07 DL: That's interesting. So you mentioned two objections to utilitarianism. One of them is that it doesn't match your intuitions in some weird cases. And the other one is that it's just very, very hard to implement; it's hard to calculate pleasures and pains.

1:10:20 SC: Right.

1:10:20 DL: Now if I'm a utilitarian, I might say as to the first one, "Well, so much the worse for your intuitions." And in addition, I might point out... Now, I'm being a utilitarian, for some reason, by the way.

1:10:30 SC: Yeah. [chuckle]

1:10:31 DL: But I might also point out that any moral theory is gonna say really, really counter-intuitive things. So I'm not sure why we should care if there's a crazy-sounding scenario with an alien where it doesn't seem to match our intuitions. Do you expect that there's going to be a moral theory that matches all of your intuitions at some point?

1:10:51 SC: No, but I actually do buy Rawls' point on reflective equilibrium, because, as a moral anti-realist, I think that our starting point for morality is our moral intuitions. As a cognitive realist, I understand that those intuitions might be incoherent, and therefore there's work for moral philosophy to do in taking our moral intuitions and building them into the best-fit, sensible, logically coherent system. But yeah, I think it's evidence, when the system that I've tried to build is wildly in conflict with the intuitions I started with, that either I gotta get rid of that intuition or I did a bad job building the system; I'm open to both possibilities.

1:11:37 DL: I think that's fair, I think that's fair. I think my main concern about using intuitions to evaluate the theory is that I'm so aware of the history of strong intuitions that have been false.

1:11:51 SC: Sure.

1:11:51 DL: That I just give them virtually no evidential weight whatsoever. In addition to that, there's a lot of intuitions I have right now that I suspect almost any moral theory is gonna tell me it's wrong. Like I said, I love the taste of meat. But any plausible moral theory is gonna tell me that if I can lead a happy, healthy life without eating meat that I really should. I love driving fossil fuel vehicles, I love it.

[chuckle]

1:12:20 DL: But most moral theories tell me that that has terrible effects on the environment, and I really don't need to be doing it. I live in a city. I have public transport. So, again, most moral theories are gonna tell me things I really don't wanna hear. And I'm very sensitive.

1:12:34 SC: Yeah, no. I definitely agree we have to be open to throwing out this or that moral intuition, or at least dramatically changing it, and I think this is what makes human beings pretty cool, that we don't only have our moral intuitions. They're where we start, but we also have our rational cognitive capabilities, and there's feedback; we can go from rationality to altering our moral feelings. And it could happen: I'm a meat-eater and I drive a fossil fuel car. I wanna get rid of the fossil fuel car, but I'm not gonna get rid of eating meat, though I'd be happier if we could make artificial meat and wouldn't have to kill any animals to do it.

1:13:15 DL: Sure, I could see that. And the problem in appealing to purely rational corrections here is that if there's nothing outside of our intuitions that we're appealing to in order to correct them, then I'm not sure how we escape the inevitability of an internally coherent system that's just completely mistaken.

1:13:39 SC: Yeah, I don't necessarily believe that the word mistaken has any reference there in the world, I think that...

1:13:44 DL: Yeah, I can understand that. I think then we're just coming to blows about whether we think the function of moral theories is to produce cooperative behavior among self-interested organisms or whether it's to produce sort of satisfying solutions according to our contingent intuitions that we all happen to share, or some of us happen to share.

1:14:05 SC: Yeah. I think that in both senses, we're trying to be coherent and rational, either individually or collectively. What's interesting to me is that people have strong disagreements about moral realism versus anti-realism, and those disagreements are almost entirely uncorrelated with their ideas about what actually is and is not moral. [chuckle]

1:14:31 DL: That is really fascinating to me too. I find that in talking to most sort of well-educated people in my friend circles, that they are explicitly moral relativists, but implicitly utilitarians.

1:14:47 SC: Interesting. Yeah.

1:14:47 DL: Yeah. Well, usually, like you said, they're sort of good utilitarians, where they're willing to sacrifice one person to save many, but then they also don't want to sacrifice cappuccinos and fossil fuel and eating meat and so on. And so if you push them far enough, maybe they'll admit that explicitly, but then they might fall back on relativism and say, "Well, it's all relative," or something like that. So they're relativists when it's convenient.

1:15:11 SC: Yeah, for me, utilitarianism is an example of something that I reason to myself out of, as far as I'm concerned. I think that it sounds superficially the right thing to do, but I think the objections to it are good enough that I'm looking for something better.

1:15:24 DL: Yeah. Just in case you're curious, I was gonna bring this in: I have a survey that was conducted by the website philpapers.org, run by David Chalmers and his group, and they asked professional philosophers, do you accept the category of normative ethics described as consequentialism, deontology, or virtue ethics?

1:15:48 SC: Right.

1:15:48 DL: And it's roughly split: 25% accept or lean toward deontology, 23% and some change accept or lean toward consequentialism, 18% toward virtue ethics, and 32.3% other. And this actually reminds me of a poll that you took and described in one of your blog posts, about your survey of interpretations of quantum mechanics.

1:16:15 SC: Yeah, not a lot of consensus.

1:16:17 DL: Yeah, not a lot of consensus. And you called this a huge embarrassment for the field of physics. And I kind of feel that way about my own field in some ways. I must admit, I feel like it is a little embarrassing that these are not just theories that are sort of fun to think about, but they actually make a difference in how we live and how we design artificial intelligence, and it turns out that there is not a lot of consensus in the field where there should be.

1:16:49 SC: Do you have the numbers there for moral realism versus anti-realism?

1:16:53 DL: I actually might, yeah. Hold on a second.

1:16:55 SC: 'Cause that was also a philpapers survey question. I remember that.

1:16:58 DL: Yes, I do.

1:17:00 SC: And I think that most philosophers are realists, right?

1:17:01 DL: That's right: 56.4% accept or lean toward realism, 27.7% anti-realism, and then 15% other.

1:17:09 SC: Alright. It is interesting, I think it's more embarrassing that we don't have a consensus on quantum mechanics, 'cause quantum mechanics should be easier than ethics or morality, but it's more important that we don't have a consensus on ethics or morality.

1:17:25 DL: Right. That's where the analogy might end: most of the versions of quantum mechanics, if I understand it, make essentially similar predictions. With the moral theories, although one could say that in 99% of cases most of them probably make the same predictions, it's just in these rare scenarios, especially ones involving, say, opportunity cost, what you could be doing instead of what you're doing right now, that there are the biggest and most important disagreements.

1:17:55 SC: Sure. And if you're in that crowd of 50 people who are going to get their legs broken, it's extremely relevant to you that...

[chuckle]

1:18:00 DL: Yeah, don't sue me.

1:18:01 SC: Your car is programmed one way or the other. Just to wrap it up, put a bow on it: I guess we glossed over a little bit the implementability of this plan. You sketched out a database idea where we would have all these different possibilities. How real-world is this prospect of making contractarianism the way that our self-driving cars go about making moral decisions?

1:18:27 DL: Yeah, that's a terrific question, and my answer is, I don't know. But if you are working on this kind of technology out there, I would love to hear from you. I want to know how plausible it is to be able to design autonomous vehicles and other autonomous systems that can quantify the effects of their actions on primary goods and then run maximin functions over them. I've talked to people in the field who say, "Well, it seems like this might be plausible." I see my job as saying: if we are going to design autonomous systems, here's what they need to be capable of doing. And if they are not capable of doing this, then we should slow down or maybe even halt the development of this technology.

1:19:13 SC: Right, right.

1:19:14 DL: And I think that is especially relevant in the domain of autonomous weapons systems.

1:19:18 SC: Well, good. Here are my final two questions, which could be short answers or longer. One of them is: you brought up an issue in the book that, again, I was surprised by because I hadn't even thought of it. Is it a problem if an artificially intelligent system does things that seem to be ethical to us, but it can't articulate why it's doing them? And this is an issue for deep learning systems, where they can recognize a picture but can't tell us why they recognized it. A human being would be able to articulate an answer that might not be the correct reason why they did something, but at least they can try. Should we expect the same from AI?

1:20:02 DL: That's a great question, and the answer is, I'm not sure. There is a Kantian position here, that says that it's not a real decision; getting back to the very first thing we talked about, it's not a decision you're responsible for, unless you can actually articulate the reasons for it. You can tell me why you did it. Otherwise, you're just sort of an animal or a child acting on instinct. Now, to me, it doesn't so much matter if you can articulate it, what matters is are you following the maximin principle.

1:20:34 SC: Right.

1:20:35 DL: And I think the best way of doing this is actually constructing the maximin principle in these autonomous systems, in what's called a top-down approach. However, I'm also open to the possibility of what you might call approximating a maximin principle through these more bottom-up methods. If the machine learning system produced outputs that always matched the maximin principle in the kinds of cases we observe, and we had good reason for thinking that it would continue to run this program that approximated maximin in future cases, then I would say that would be, let's say, close to good enough, or sufficient, in that case.
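
One way to read "close to good enough" is as an auditing condition: compare the black-box system's choices against an explicit maximin rule on a set of test scenarios. The sketch below is only an illustration of that idea; the scenarios and the "learned" policy are stand-ins, not any real system.

```python
# Audit a black-box policy against an explicit maximin baseline.
def maximin_choice(options):
    """Pick the option whose worst-affected person is harmed least."""
    return min(options, key=lambda harms: max(harms))

def agreement_rate(policy, scenarios):
    hits = sum(1 for opts in scenarios if policy(opts) == maximin_choice(opts))
    return hits / len(scenarios)

# Each scenario is a list of options; each option lists per-person harms.
scenarios = [
    [[1] * 50, [2]],             # the broken-legs case from earlier
    [[0.4, 0.4], [0.9, 0.0]],
]

# Hypothetical learned policy to be audited (here it happens to minimize total harm).
learned_policy = lambda opts: min(opts, key=sum)

print(agreement_rate(learned_policy, scenarios))   # 0.5: it diverges from maximin on the first case
```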

1:21:17 SC: I think so. I think it's a little bit too much to demand of our AI systems that they be articulate moral philosophers, [chuckle] as long as they seem to be doing mostly the right things.

1:21:27 DL: Right. Well, as long as it says something like: the reason why I chose this path instead of that path is that the worst collision in this path is better than the worst collision in that path. It doesn't need to say something like: I traveled into the original position and I realized from there that maximin was actually...

[chuckle]

1:21:45 SC: Yeah. That'd be too much to ask. And the other final question was something you already alluded to: a potential difference between the everyday-life circumstance of a car driving around and trying to avoid accidents, and the case, everyday for some but not for everyone, of people at war, or machines that were intentionally built in order to inflict harm in a certain way. How do the moral considerations change? And I realize this is a huge topic, but maybe give a simple introduction to the differences between everyday life and wartime.

1:22:21 DL: Well, the smallest case that this might be applied to right now is what you could call security robots.

1:22:28 SC: Right.

1:22:28 DL: And in fact, these are currently being used in some airports in China and other places in East Asia, where they are, in some cases, equipped with taser technology. And so there are good things about this kind of technology. In fact, if someone is harming another person, it is good if a robot could step in and actually neutralize the threat. But the problem is, in doing so, it needs to be capable of identifying when there is a threat, what kind of threat it is, and what the proportional amount of force is to neutralize that threat. And so contractarianism does make predictions about this: if you could quantify the kind of harm being done by that threatening agent, that enemy agent, you could say that usually neutralizing the threat is better than just, say, killing the agent, because killing would certainly make that agent the worst off.

1:23:28 SC: Yeah.

1:23:29 DL: But neutralizing the threat would be the best of all possible outcomes. And so you could imagine security robots, and in the extreme, military robots, being designed with the goal of neutralizing enemies and neutralizing threats, because I think that would be the maximin approach to it.
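
As a rough sketch of how "proportional force" could be framed in maximin terms: among responses that actually neutralize the threat, choose the one whose worst-affected party, including the aggressor, is harmed least. Everything below is invented for illustration.

```python
# (name, neutralizes_threat, harms to [victim, aggressor]) -- toy numbers.
responses = [
    ("do nothing",      False, [0.8, 0.0]),
    ("taser aggressor", True,  [0.0, 0.3]),
    ("lethal force",    True,  [0.0, 1.0]),
]

viable = [r for r in responses if r[1]]            # the threat must actually be stopped
best = min(viable, key=lambda r: max(r[2]))        # minimize the worst harm to anyone
print(best[0])                                     # -> "taser aggressor"
```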

1:23:50 SC: Right. But maximin seems to... Maybe I'm just not conceptualizing it correctly, but it seems to fail us a little bit when literally our goal is to kill people.

1:24:01 DL: That's right. And I think that you could imagine cases, and I do imagine cases, where the ideal autonomous robot in war would be commanded to kill an enemy soldier, and the robot would say, "No thank you, but I am going to apprehend him and take him to prison." In fact, that's the goal. Going back to our good friend Immanuel Kant, he famously and shockingly said that if God commands you to kill your own children, the correct response is, "No, I'm not going to do that." And we tell people in military ethics and the ethics of war that if your commanding officer tells you to kill innocent people in war, the correct answer is, "No, but I am happy to do other things that are not war crimes."

1:24:51 SC: Do I remember correctly that in the book you suggested, or at least wondered out loud, whether or not it might be okay to ultimately have autonomous self-driving cars or drones and so forth, but not in the theater of war; that in war, autonomy should always be in the hands of human agents who can actually take responsibility?

1:25:14 DL: That's right. There was a letter that was recently signed by a number of people who work on ethics and political philosophy, and this group of people was arguing that autonomous systems should not be used in war at all. Now, I wouldn't go that far, but I would agree that the kinds of capabilities they would require in order to be... I don't wanna use the word responsible, but in order to make the right choices in war, are unlikely to happen any time soon. And so all of my claims are a big hypothetical: if we are going to design these kinds of machines, these are the kinds of capabilities that they would require. And I'm willing to say that for military robots as well, though with more skepticism that they are actually going to achieve this level of sophistication than in the case of medical technology or transportation technology.

1:26:17 SC: It's very interesting to me, because I see a philosophical version of what happens in physics in particular, and in science more generally, where concepts that we could have ignored at earlier times are forced to the forefront of our attention by the progress of technology, right? And so I think it's a wonderful thing for philosophy that our discussions about morality are being sharpened a little bit by the fact that we can't be wishy-washy, we can't be fuzzy about them, we can't just say "be excellent to each other"; we need to tell machines that will listen to us quite literally how to behave in a wide variety of circumstances.

1:26:58 DL: Yeah. And I have to admit a friend of mine pushed me on this position that I take, he said, it was kind of bullshit what I'm doing, because if I was taking a strong moral stance against autonomous weapon systems... Can I say bullshit on your program?

1:27:13 SC: Absolutely.

1:27:14 DL: Okay. So that it's kind of bullshit, and that I am being hypocritical or I'm not really caring about the use of this technology. That in fact, I'm just saying, "If you're going to build it, here's the right way of doing it." And I'm sensitive to that objection, I'm very sensitive to it. I'm not convinced that autonomous weapon systems or autonomous vehicles or autonomous medical care bots are actually a good idea in the long run. And I'm sensitive to the fact that maybe this position I'm taking is a little bit too corporate. But that being said, if any corporations would like to pay me large amounts of money, I am more than available.

1:28:00 SC: Very good. Well, I do hope they take you up on that. I'm certainly on your side in thinking that this is something where we should face up to these problems, rather than ignore them. Derek Leben, thanks so much for being on the podcast.

1:28:14 DL: Thank you, Sean.

[music]

14 thoughts on “Episode 30: Derek Leben on Ethics for Robots and Artificial Intelligences”

  1. 0:43:11 SC: If I understand it correctly, the Nash equilibrium is one where one person cannot unilaterally change to get a better outcome without hurting somebody else, but a Pareto improvement would be where, if we all change at once, we will all be better off, is that right?

    0:43:29 DL: That’s exactly right, […]

    This is not "exactly right". It has nothing to do with hurting other parties; that's irrelevant. You are in a Nash equilibrium when any change to your strategy you can enact will be counter to your interests (payoff function). In the prisoner's dilemma game you defect because it doesn't matter what your adversary does, you still benefit – either he defects too, in which case you avoid a long sentence, or he stays quiet, in which case you get the lowest sentence. To do otherwise is to guarantee you cannot get the lowest sentence and to open yourself to the possibility that you may get the highest.

  2. I see the value in thinking deeply about programming machines so that there are ethical guidelines to prevent unnecessary harm. However, when I am driving my car, if I recognize that a collision is about to occur, my instinct will be self-preservation. I think that car manufacturers will recognize this human tendency and will present these cars to the public as designed to maximize personal safety. It is doubtful that buyers will trade their own autonomy for a machine that might have a different priority.

  3. How many times a year does a human driver have to think about whether she should run into a bicyclist wearing a helmet or her friend who isn't, in order to swerve to avoid a 90-year-old man who just walked in front of her?

    99.999% of accidents have nothing to do with the trolley car dilemma yet this was about 1/3 of the podcast.

  4. @alex

    > In the prisoner dilemma game you defect because it doesn’t matter what your adversary does, you still benefit

    > either he defects too in which case you avoid a long sentence or he stays quiet in which case you get the lowest sentence.

    That is inaccurate. In the standard prisoner's dilemma (and in finitely repeated versions of it), the Nash equilibrium is both players defecting, which leaves both players worse off than both cooperating. Any strategy other than defecting leaves you open to the other side exploiting your strategy and becoming much better off. The dilemma describes a non-Pareto-optimal Nash equilibrium.
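
    A small, self-contained check of the point above, using the usual textbook-style payoffs (the numbers are assumptions for illustration): mutual defection is the only profile no player can improve by unilaterally deviating from, even though mutual cooperation would leave both players better off.

    ```python
    C, D = "cooperate", "defect"

    # payoffs[(row_move, col_move)] = (row_payoff, col_payoff); higher is better.
    payoffs = {(C, C): (3, 3), (C, D): (0, 5), (D, C): (5, 0), (D, D): (1, 1)}

    def is_nash(a, b):
        """True if neither player gains by unilaterally switching their own move."""
        best_a = max(payoffs[(alt, b)][0] for alt in (C, D))
        best_b = max(payoffs[(a, alt)][1] for alt in (C, D))
        return payoffs[(a, b)][0] >= best_a and payoffs[(a, b)][1] >= best_b

    for a in (C, D):
        for b in (C, D):
            print((a, b), "Nash" if is_nash(a, b) else "not Nash")
    # Only (defect, defect) is a Nash equilibrium, yet (cooperate, cooperate)
    # gives both players more: a non-Pareto-optimal equilibrium.
    ```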

  5. Machines deciding where to extract natural resources to sell them also affect people, posing a similar ethical dilemma; there are many other illustrative examples of the problem that are not self-driving cars. That is just a popular topic at this time, as autonomous vehicles are beginning to affect our lives.

  6. Lol… the trolley problem is an analogy for many problems AIs will face–it's not specifically about running over people, or dogs, or flowers, or whatever… beep boop

  7. But people aren’t constantly running into the trolley problem when they drive so why would a driverless car? In the extremely rare instance when this might happen, it isn’t as if the human driver usually has time to think of the action to take between two bad outcomes so why would we expect the driverless car to do better?

    I liked the rest of the podcast, though.

  8. In the discussion of various philosophical theories for the programming of robots, it is important to realize that robots do not understand anything. It is not possible to program them to understand. Therefore, we need to restrict their functions to areas where understanding is not required. Perhaps understanding is not required in the driving of automobiles or semitrailer trucks on crowded roads and neighborhood streets, but we need to make sure.

  9. atheist4thecause

    I'm not a big fan of the absolutism about what actions are and aren't harm. If one were trying to create an algorithm of harm, they would have to consider the impact of the action as well as the impact on society. Stealing $50 from the President, for instance, isn't as likely to harm the President as stealing $50 from an uneducated poor person trying to take a train to work would harm that person. At the same time, the President is much more valuable to society than the uneducated poor person.

    Now, we’d have to find a way to mesh what I call the Moral Value and Ethical Value. In my above example, despite the President being probably millions of times more beneficial to society, the harm of losing $50 would be basically 0, and so the uneducated poor worker, who would be harmed greatly by not being able to get to work, would end up being harmed more in the Combined Value.

    Another issue with this absolutism is that, in the Trolley Problem, when we switch the track over we don't actually know that doing so is going to kill someone. Heck, we don't even know that pushing someone in front of a trolley is going to kill them. Our actions all lead to a probability that someone is harmed or killed. You would likely have to take the probability of harm, weigh the extensiveness of it, and mesh that into the Combined Value as well.

  10. Sean,
    You state at [1:09:11] that you do not fully buy into utilitarian arguments, in part, due to the “Utility Monster” and “Repugnant Conclusion” criticisms. I believe both of these responses have been addressed within utilitarian circles.

    Utility Monster:
    I argue that this criticism is not a bug, but a feature of utilitarianism working correctly.

    In the “Utility Monster” thought experiment, we are asked to:
    1) Imagine an individual capable of generating greater utility than thousands of other individuals combined
    2) Observe that utilitarianism demands we devote significant resources to pleasing that one, even if this comes at great expense to the many

    We are left to conclude that 2 demonstrates a flaw in utilitarianism.

    The true issue is that limits in human imaginations result in many people struggling to accomplish 1. We often fail to envision a human who could act as a utility monster in relation to 10,000 other humans. We are left wondering “What makes them so special?” This is a flaw that is smuggled into the framing of the thought experiment.

    In situations where the utility monster truly is accepted as being capable of generating greater utility than the many, the results are not so unintuitive.
    -A single human is a utility monster in relation to 10,000 ants.
    -A single ant is a utility monster in relation to 10,000 bacteria.
    -A single bacterium is a utility monster in relation to 10,000 grains of sand.

    These conclusions are not nearly so controversial.

  11. Repugnant Conclusion:
    Parfit's Repugnant Conclusion presents a very strong argument against attempting to calculate utility through "mere addition." The implication is that a massive group of miserable people ends up counting as better, in total utility, than a small group of people who are very content.

    For this reason, I am more supportive of calculating utility using a mean (arithmetic, or perhaps geometric). *As a side note, I personally appreciate that this approach does not imply that adding more people is always desirable (as advocated by former guest, Tyler Cowen).

    Upon applying his reasoning to this model, Parfit reasonably points out that a very small number of extremely high-utility people (possibly just one person) would be preferable to billions of people who produce even slightly less utility on average. Again, the discomfort generated by this conclusion is due to lapses in imagination. We are unable to easily imagine a world where a single individual could reliably generate higher average utility than a large society. Even if we could, there would be meaningful moral implications for any plan to transition from our current situation, containing billions of moral agents, down to very few.

    A frequent criticism of averaging utility functions is the claim that “average utility could be increased by killing everyone who is generating low utility.” There is a very reasonable response to this criticism.
    A moral community includes all people ever to exist, be they in the past, present, or future. Even if committing genocide could lead to greater average utility amongst those who remained, the utility of those who were killed must still be factored in. One would be unlikely to achieve a net gain in average utility after accounting for the negative utility generated during the mass slaughter.

  12. I'm interested in alternative utility aggregation functions beyond the utilitarian sum of utilities and maximin. It seems like there should be a large number of such functions to consider, some of which may have better tradeoffs – better alignment with intuitively preferable choices. One which I like is sum(sqrt(u)). This seems to strike a good balance to me, preferring to increase the utility of the lowest, but also giving some (reduced) weight to gains for all population members.
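
    A quick numerical sketch of how these aggregation rules can diverge; the utility profiles are invented, and the point is only that sum(sqrt(u)) sits between straight summation and maximin in how much it favors the worst off.

    ```python
    import math

    def utilitarian(utils):   # straight sum
        return sum(utils)

    def maximin(utils):       # welfare of the worst-off person
        return min(utils)

    def sqrt_sum(utils):      # diminishing marginal weight on the well-off
        return sum(math.sqrt(u) for u in utils)

    profiles = {
        "equal":   [4, 4, 4, 4],
        "unequal": [0, 1, 7, 9],   # slightly higher total, but a badly-off minority
    }

    for name, utils in profiles.items():
        print(name, utilitarian(utils), maximin(utils), round(sqrt_sum(utils), 2))
    # Summation slightly prefers "unequal" (17 > 16); maximin strongly prefers
    # "equal" (4 > 0); sqrt_sum also prefers "equal" (8.0 > 6.65), but less rigidly.
    ```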

  13. My feeling is that, pragmatically, these AIs will seek to optimize legality and minimize liability – with no consideration of ethics. In some ways, that's a less interesting discussion, and it just defers the problem to the legislative process. It also opens questions about whose liability should be minimized: the system operator/owner's? The manufacturer's?
