Ethics: What is Utility?

The Summatarian Framework

So far, we have developed a framework to talk about ethical theories - a framework I've dubbed "summatarianism". The basic premise is that there is "individual wellbeing" and that, for a fixed population, the ethical thing to do is to maximize the expected sum of individual wellbeings. I've argued that all reasonable people should accept it. Among those who do accept it, discussion of ethics is reduced to two questions:

  1. What is "utility"?
  2. How do we account for decisions that change populations?

Having established this framework, we're going to discuss (1) here and (2) in the next post. I should stress that I consider the rest of this chapter much more tentative than the first six parts. Those first six posts are basically an argument that the vast majority of ideas for ethical systems are at best incomplete and at worst completely incoherent. These last three posts attempt (with mixed success) to pin down a single ethical system.
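As a toy sketch (my illustration, not a formalism from the argument above), the fixed-population summatarian rule amounts to: choose the action that maximizes the probability-weighted sum of individual wellbeings. The actions, probabilities, and wellbeing numbers below are all hypothetical.

```python
# A minimal sketch of the summatarian rule for a fixed population:
# maximize the expected sum of individual wellbeings.
# All actions, probabilities, and utilities here are hypothetical.

def expected_total_wellbeing(outcomes):
    """outcomes: list of (probability, [wellbeing of each person])."""
    return sum(p * sum(wellbeings) for p, wellbeings in outcomes)

actions = {
    "A": [(0.5, [10, 0, 0]), (0.5, [0, 0, 0])],  # expected sum = 5.0
    "B": [(1.0, [2, 2, 2])],                      # expected sum = 6.0
}

best = max(actions, key=lambda a: expected_total_wellbeing(actions[a]))
print(best)  # "B": the sure, evenly spread outcome wins in expectation
```

Note that the rule compares only expected sums, so "B" beats "A" even though no individual under "B" does as well as the luckiest person under "A".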

Individual Utilities

In 2004, Eliezer Yudkowsky offered an idea, outside the utilitarian context, that he calls "coherent extrapolated volition". This inspired what I'm about to propose as a theoretical formula for individual utility.

Imagine Alice asks you for the utility you assign to various universe branches. Like any reasonable human, you reply that your mind is neither 100% consistent nor powerful, so you can’t actually do so.

Alice responds that, to aid you, she will let you specify a sequence of steps to take to determine your utility function for you. You might respond:

Alice, I want you to make someone as similar to me as possible, but smarter, wiser, more experienced, and generally more who I wish I were, whose sole goal is to assign utilities to universes in a way that corresponds maximally with my values. Then ask him/her the same question.

If Alice does so, she may find that this meta-person makes the same request. Iterating on and on, we eventually have someone whose near-infinite knowledge and intelligence, and complete empathy for you, allow them to assign utilities to universe branches on your behalf.

I believe that these utilities are the correct ones. This is my definition of wellbeing.

Practical Implications

Now, I freely admit that this algorithm is impossible to follow in practice, which limits the (um) utility of this definition. However, I think that accepting this as the gold standard helps focus discussion - similar to how Solomonoff induction is impossible to implement in practice, but yields a notion of ideal reasoning.

However, it'd be silly to simply assert this gold standard is useful without giving examples of its usefulness.

The immediate suggestion from this gold standard is that we should default more-or-less to people's preferences when measuring their utility. After all, of all the people in the world, there is exactly one with access to the inner workings of your mind: yourself.

That's not to say that I think the proper ethical system simply reverts to preference utilitarianism in practice. There are two chief practical distinctions I want to make in that regard.

First, preference utilitarianism is actually pretty vague regarding what counts as a preference. Can I prefer a universe in which lying occurs less often even if average happiness/satisfaction is no higher? Typically, consequentialism says no. I say that's a valid preference, because your idealized self could prefer that. If I'm in a simulation that is identical to reality, can I prefer to be in reality even if I'll never know I'm in the simulation? I say that's a valid preference, because you can prefer that. If you're religious, would you want to believe in God only if He exists, or would you want to believe regardless? Either is valid with my specification. In this sense, I'm choosing a very broad definition of preference - choosing one particular interpretation of preference utilitarianism.

The second way this belief system differs from generic preference utilitarianism is more substantial. We know that human cognition has a variety of properties that most people wouldn't want their ideal selves to share - the most obvious example of such properties are cognitive biases.

For instance, we'll talk later about how people don't care enough about their future selves: we value the next 5 minutes today more than 5 minutes tomorrow. This is something I think most people would prefer not happen when their ideal self assigns utilities, so this is another important way that my beliefs differ from typical preference utilitarianism. The implications for real-world policy are pretty significant.
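To make that bias concrete, here is a small illustration using the standard hyperbolic-discounting model v/(1 + k·d). The rewards, delays, and the discount rate k are all hypothetical; the point is only that this kind of discounting produces preference reversals an idealized self presumably wouldn't exhibit.

```python
# Hyperbolic discounting: a standard model of why we overvalue the
# near future. The rewards, delays, and k here are hypothetical.

def discounted_value(value, delay_days, k=1.0):
    """Perceived value of a reward received after delay_days."""
    return value / (1 + k * delay_days)

# Choosing today: the smaller, immediate reward feels better...
assert discounted_value(10, 0) > discounted_value(15, 1)    # 10.0 vs 7.5

# ...but the same pair of options, viewed a month in advance, reverses:
assert discounted_value(10, 30) < discounted_value(15, 31)  # ~0.32 vs ~0.47
```

The reversal is the key feature: my present self and my month-ago self disagree about the same choice, and it's hard to argue that the present self is the one whose valuation the idealized assigner of utilities should honor.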
