Subjective decision theory concerns norms about three concepts: values of options, preferences between pairs of options, and choices out of sets of options. A common assumption, at least implicitly, is that norms about preferences are prior to norms on values and choices. One way to put this assumption, following Amartya Sen ([1970] 2017), is that choice functions are binary; they are grounded in binary relations of preference and indifference.
I’m going to argue against this for two reasons. First, preferences, being binary comparisons, don’t provide a rich enough base to ground all the norms. Sometimes we need to take as primitive comparisons the chooser (hereafter, Chooser) makes between larger sets of options. Second, preference is an ex post notion, in a sense to be made clearer starting in Section 15, while choiceworthiness is an ex ante notion. And ex ante norms are not grounded in ex post ones.
1 Choice and Choiceworthiness
Choice gets less attention in philosophical decision theory than one might expect. The focus is usually on either value (e.g., this has value 17 and that has value 12) or preference (e.g., this is preferable to that). Norms on choice are almost an afterthought in standard presentations. After a long discussion of values, preferences, or both, the typical theorist breezily says that the norm is to choose the most valuable or most preferred option.
There is a long tradition in economics, going back to Paul Samuelson (1938) and Herman Chernoff (1954), of taking choice to be primary. Some of this literature rested on largely behaviourist or positivist assumptions. It was better to theorise with and about choice because it was observable, unlike preferences or values. The picture was not dissimilar to this recently expressed view:
Standard economics does not address mental processes and, as a result, economic abstractions are typically not appropriate for describing them. (Gul and Pesendorfer 2008, 24)
That’s not going to be my approach here. I’m going to start not with observable choice dispositions, like the economists, or with choice frequencies, as psychologists like Luce (1959) do, but with judgments about choiceworthiness. In familiar terminology1, I’m taking a mentalist approach not a behaviourist approach. Much of the formal work on choice theory has been done by theorists from the behaviourist side, and I’ll be inevitably drawing from them. But the most important source I’ll be using is someone much more sympathetic to mentalism: Amartya Sen. In particular, I’ll draw heavily on his “Collective Choice and Social Welfare” (Sen [1970] 2017), and also on the literature that grew out of that book.
1 See, for instance, Hansson and Grüne-Yanoff (2024), where I learned about the Gul and Pesendorfer quote.
I’m not going to take a stance on the metaphysics of choiceworthiness judgments. I’ll sometimes talk as if they are beliefs, but if someone wanted to have a sharp belief-desire split, and hold that choiceworthiness, like preference, involves an interplay of the two, I wouldn’t object. The real assumption is that the mental state being ascribed in choiceworthiness ascriptions is the same kind of state as that being ascribed in preference ascriptions. The biggest difference is that choiceworthiness is a relation to an arbitrarily sized set of options, while preference is a relation to a pair of options.
2 Values First?
Both to clarify what kind of question I’m asking, and to set aside one kind of answer, I’m going to start looking not at preferences or choices, but at numerical values. At first glance, it might seem that many decision theorists take values to ground the other norms. One should prefer the more valuable. So we get theorists discussing Newcomb’s Problem largely by offering theories about the value of uncertain options (like taking both boxes) in terms of the values and probabilities of their possible outcomes.
On second glance, though, there are at least four reasons it is implausible that values are really what ground the other attitudes.
First, it’s surprising to have a numerical measure like this not have a unit. We’ll sometimes say things like this outcome has value 17 utils, but this is a placeholder, not a real unit like kilograms or volts. This is related to the second reason.
The orthodox view is that these values are only defined up to a positive affine transformation. If it’s appropriate to represent Chooser with value function v, it’s appropriate to represent them with any value function f where f(o) = av(o) + b, for positive a. Why is this transformation allowed? Because all the values are doing is reflecting Chooser’s preferences over outcomes and lotteries, and this transformation doesn’t change those preferences. In other words, the transformation is allowed, and the definedness claim is true, because values are grounded in comparatives.
Third, it’s not at all obvious why values should be anything like numbers. Indeed, the thought that they should be numbers starts causing problems when we get to various puzzles about infinite goods.2 Why should values have the topology of the reals rather than any number of other possible topologies? Why aren’t they, for example, quintuples of rational numbers ordered lexically? If one takes preferences to be primary, and generates utility functions via representation theorems, as in Ramsey (1926) or Neumann and Morgenstern (1944), there is a reason for this. But if values are primitive, it seems like an unanswered and I’d say unanswerable question.
Finally, there is something very strange about the idea of values that are not in any way comparative. How valuable something is just seems like it should be a notion that reflects the value of alternatives to it.
The argument of this section will not be, I suspect, particularly contentious. It’s a widespread view, if often implicit, that values, and norms on values, are ultimately grounded in comparatives. What is going to be contentious is the claim that preferences can’t do the job, and that judgments of, and norms about, choiceworthiness ultimately ground both values and preferences.
3 Coherence
There is one other striking thing about the picture we get in Neumann and Morgenstern (1944), a picture which is, I think, still broadly endorsed by contemporary decision theorists. The aim is to put norms on preference, and hence on values and choices. But these norms are almost always defined in terms of other preferences. For example, if one strictly prefers x to y, and y to z, one should prefer x to z. It is strange to talk about grounding the normative facts about preference when other preferences play such a crucial role. It looks like the grounding relation will be at least cyclic, and possibly intransitive.
If there’s a puzzle here, there are two (related) ways out. One is to take the view, which perhaps Hume held, that the only constraints on preference are coherence constraints. Another is to say that while there might be some non-coherence constraints on preference, e.g., it is in fact wrong to prefer the world’s destruction to one’s finger getting scratched, these belong to a separate subject from decision theory. The result is the same: decision theory is largely about what it takes for various preferences to cohere with one another.
I don’t particularly agree with this picture, but I’m going to accept half of it for the purposes of this paper. That is, I’ll assume it’s not the role of decision theory to criticise the person who prefers the destruction of the world to the scratching of their finger. That person either violates no norms, or violates a different kind of norm to the norms of decision theory. Decision theory, on the latter view, takes Chooser’s preferences over ends as given and judges Chooser on how well their instrumental preferences serve these preferences over ends.
I’ll argue that even if decision theory is about coherence, it should be about coherence between choiceworthiness judgments. So the big question is not whether preferences should satisfy transitivity or independence, but whether Chooser’s choiceworthiness judgments should satisfy the conditions described in Section 5. As we’ll see, the question of which conditions are genuine coherence constraints on choiceworthiness is tied up with the metaphysical question about the priority of preferences and choices.
4 Sen on Preference
Sen starts with a binary relation R, defined over options, glossed so that xRy means that x is at least as good as y. This is potentially misleading. It does not mean that x is either better than y, or exactly as good as y. Rather, it means that x is no worse than y. Sen introduces two more binary relations: xPy, meaning x is preferred to y, and xIy, meaning Chooser is indifferent between x and y. These can both be defined in terms of R, as in (1) and (2). (Throughout, I’m leaving off wide scope universal quantifiers over free variables.)
- (1) xPy ↔︎ (xRy ∧ ¬yRx)
- (2) xIy ↔︎ (xRy ∧ yRx)
A more common way of doing things in contemporary philosophy is to start with P and a fourth relation E, where xEy means that x and y are equally good. On this picture, both (1) and (2) are true, but the explanatory direction in (1) is right-to-left. So xRy just is ¬yPx, and then xIy is still defined via (2). On the version Sen uses, it’s a little trickier to define E, but (3) looks like a plausible conjecture.
- (3) xEy ↔︎ [(xRz ↔︎ yRz) ∧ (zRx ↔︎ zRy)]
That is, two options are equally good iff they are intersubstitutable in other preference claims. Given these definitions, we can show that the following claims are tightly connected.
- (4) xPy ∨ xEy ∨ yPx
- (5) (xPy ∧ yIz) → xPz
- (6) (xIy ∧ yIz) → xIz
(4) is what Ruth Chang (2017) calls the trichotomy thesis. (5) is what Sen calls PI-transitivity, and (6) is what he calls II-transitivity.
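Since these definitions and their interactions will matter throughout, here is a minimal Python sketch, purely my own illustration, of how P, I, and the conjectured E fall out of a finite relation R given as a set of ordered pairs (the particular relation used is just an example):

```python
# A toy weak preference relation over three options: x beats y, and y and z are alike.
U = {"x", "y", "z"}
R = {("x", "x"), ("y", "y"), ("z", "z"),
     ("x", "y"), ("x", "z"), ("y", "z"), ("z", "y")}

def P(a, b):   # (1): strict preference
    return (a, b) in R and (b, a) not in R

def I(a, b):   # (2): indifference
    return (a, b) in R and (b, a) in R

def E(a, b):   # (3): equal goodness, i.e. intersubstitutability within R
    return all(((a, c) in R) == ((b, c) in R) and
               ((c, a) in R) == ((c, b) in R) for c in U)

print(P("x", "y"), I("y", "z"), E("y", "z"))   # True True True
```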
Sen makes very few assumptions about R, but it will simplify our discussion to start introducing some assumptions here.3 We’ll assume that R is reflexive (everything is at least as good as itself), and that P is transitive. (Sen calls this quasi-transitivity.) Sen ([1970] 2017, 66) notes that if P is transitive and R is ‘complete’ in the sense of the next paragraph, then (5) and (6) are equivalent. It’s also easy to show that given (3) plus these assumptions, (4) and (6) are equivalent.4
3 He makes few assumptions because he was interested in exploring what assumptions about preference are crucial to the impossibility theorem that Arrow (1951) derives. He initially noticed that without (6), Arrow’s theorem didn’t go through. This turned out to be less significant than it seemed, because Allan Gibbard (2014) proved that a very similar theorem can be proven even without (6). See Sen (1969) for the original optimism that this might lead to an interesting way out of the Arrovian results, and Sen ([1970] 2017) for a more pessimistic assessment in light of Gibbard’s result. Sen reports that Gibbard originally proved his result in a term paper for a seminar at Harvard in 1969 that he co-taught with Arrow and Rawls. Much of what I’m saying in this paper can be connected in various ways to the literature on Arrow’s impossibility theorem, but I won’t draw out those connections here.
4 Proof: Assume (4) is false, so all three of its disjuncts are false. Then the right hand side of (3) is also false. Without much loss of generality, assume that xRz ∧ ¬yRz; the other cases all go much the same way. From ¬xPy and ¬yPx, plus definedness, we get yRx ∧ xRy, i.e., xIy. From ¬yRz and definedness we get zRy, and hence zPy. Given the transitivity of P, xPz is ruled out (it would combine with zPy to give xPy), so since xRz we have zRx, i.e., zIx. So we have a counterexample to II-transitivity, since zIx and xIy, but since ¬yRz, yIz is false. So if (4) is false, (6) is false. In the other direction, assume we have a counterexample to (6), i.e., xIy and yIz but not xIz. From xIy we immediately get that the two outer disjuncts of (4) are false. From yIz we get yRz and zRy. So if xEy, (3) implies that xRz and zRx, i.e., xIz. But we assumed that ¬xIz. So all three disjuncts of (4) are false. That is, if (6) fails, so does (4), completing the proof that they are equivalent.
What should we call the principle (4)? Most philosophers call it completeness, and its denial incompleteness. In his economic work, Sen ([1970] 2017) uses the term ‘completeness’ for a different property of preference relations, namely xRy ∨ yRx. This is a useful notion to have. If Chooser has never thought of x, there is a good sense in which xRy ∨ yRx is false, even though of course ¬yPx is true. Still, using the term for (4) is more common in philosophy. When writing primarily for philosophers, Sen (2004) uses ‘completeness’ for (4), and I’ll do the same. I’ll use ‘definedness’ for xRy ∨ yRx, and unless stated otherwise, will assume it holds.
There isn’t as much discussion of (4) as such in the economics literature, but there is a long tradition of discussing (6), going back to important works by Wallace E. Armstrong (1939, 1948, 1950). In most of those works it is assumed that P is transitive, so (4) and (6) are equivalent, so this is really discussing the same thing. Still, it makes the terminology confusing.
When it makes things clearer, I’ll use the term Chang (2017) suggests for (4). That is, I’ll say that preference relations for which this holds are trichotomous.
5 Properties of Choice Functions
In philosophy we’re familiar enough with possible properties of preference relations, e.g., that they are transitive, reflexive, acyclic, etc, that these terms don’t need to be defined. We’re mostly less familiar with properties of choice functions. So in this short section I’ll lay out six properties that will be important in what follows. The first four are discussed in some detail by Sen ([1970] 2017), and I’ll use his terminology for them. The fifth is due to Aizerman and Malishevski (1981), and is usually named after Aizerman. The sixth is discussed by Blair et al. (1976).
- Property α
- (x ∈ C(S) ∧ x ∈ T ∧ T ⊆ S) → x ∈ C(T)
That is, if x is choiceworthy in a larger set, then it is choiceworthy in any smaller set it is a member of. This is sometimes called the Chernoff condition, after Herman Chernoff (1954), and sometimes called contraction consistency.
- Property β
- (x ∈ C(T) ∧ y ∈ C(T) ∧ T ⊆ S) → (x ∈ C(S) ↔︎ y ∈ C(S))
That is, if x and y are both choiceworthy in a smaller set, then in any larger set they are either both choiceworthy or neither is. Intuitively, if x and y are ever both choiceworthy, then anything better than x is also better than y.
- Property γ
- (x ∈ C(S) ∧ x ∈ C(T)) → (x ∈ C(S ∪ T))
That is, if x is choiceworthy in two sets, it is choiceworthy in their union. This is sometimes called expansion, e.g., in Moulin (1985).
- Property δ
- (x ∈ C(T) ∧ y ∈ C(T) ∧ x ≠ y ∧ T ⊆ S) → ({y} ≠ C(S))
This is a weakening of β. It says that if two distinct options x and y are both choiceworthy in the smaller set, then after options are added, neither of them can be the unique choiceworthy option remaining. If x is no longer choiceworthy in the larger set, that can’t be because y alone is chosen in its place; either y drops out too, or something else is choiceworthy as well.
- Property Aiz
- (C(S) ⊆ T ∧ T ⊆ S) → C(T) ⊆ C(S)
That is, if the smaller set contains all of the choiceworthy members of the larger set, then no option is choiceworthy in the smaller set but not the larger set. If x is an unchoiceworthy member of S, then the only way to make it choiceworthy is by deleting choiceworthy members of S, not unchoiceworthy ones.
- Path Independence
- C(S ∪ T) = C(C(S) ∪ C(T))
The same options are choiceworthy from a union of two sets as are choiceworthy from the union of the choiceworthy members of those sets. This is a sort of independence of irrelevant alternatives principle; the availability or otherwise of unchoiceworthy members of S and T doesn’t affect what should be chosen from S ∪ T.
I’ll describe the effects of these properties in more detail in subsequent sections.
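Since it is easy to mix these conditions up, here is a small Python sketch of my own (no part of the formal literature, just an illustration) that states each property as a brute-force check over a choice function represented as a dict from frozensets of options to frozensets of choiceworthy options:

```python
from itertools import combinations

def nonempty_subsets(universe):
    items = list(universe)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def alpha(C):              # x ∈ C(S), x ∈ T ⊆ S  ⇒  x ∈ C(T)
    return all(C[S] & T <= C[T] for S in C for T in C if T <= S)

def beta(C):               # x, y ∈ C(T), T ⊆ S  ⇒  x ∈ C(S) ⟺ y ∈ C(S)
    return all(C[T] <= C[S] or not (C[T] & C[S])
               for S in C for T in C if T <= S)

def gamma(C):              # x ∈ C(S) ∩ C(T)  ⇒  x ∈ C(S ∪ T)
    return all(C[S] & C[T] <= C[S | T] for S in C for T in C)

def delta(C):              # distinct x, y ∈ C(T), T ⊆ S  ⇒  C(S) ≠ {y}
    return all(C[S] != frozenset({y})
               for S in C for T in C if T <= S and len(C[T]) > 1 for y in C[T])

def aiz(C):                # C(S) ⊆ T ⊆ S  ⇒  C(T) ⊆ C(S)
    return all(C[T] <= C[S] for S in C for T in C if C[S] <= T <= S)

def path_independent(C):   # C(S ∪ T) = C(C(S) ∪ C(T))
    return all(C[S | T] == C[C[S] | C[T]] for S in C for T in C)

# Example: maximising a numeric value satisfies all six properties.
values = {"a": 3, "b": 2, "c": 1}
C = {S: frozenset(x for x in S if values[x] == max(values[y] for y in S))
     for S in nonempty_subsets(values)}
print(alpha(C), beta(C), gamma(C), delta(C), aiz(C), path_independent(C))
```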
6 Property α
This is the most commonly used constraint on choice functions, and it does seem intuitive. If x is choiceworthy from a larger set, deleting unchosen options shouldn’t make it unchoiceworthy. Sen ([1970] 2017, 323–26) discusses two possible counterexamples.
One is where the presence of options on the menu gives Chooser relevant information. If the only two options are tea with a particular friend or staying home, Chooser will take tea. But if the option of cocaine with that friend is added, Chooser will stay home. The natural thing to say here is that when one gets new information, C changes, so there isn’t really a single C here which violates α.5
5 For a quick argument for that: if Chooser learns that the only options are tea and staying home because the friend has just run out of cocaine, they might still stay home.
The more interesting case is where the value Chooser puts on options is dependent on what options are available. So imagine Chooser prefers more cake to less, but does not want to take the last slice. If the available options are zero slices or one slice of cake, Chooser will take zero. But if two slices of cake is an option, Chooser will take one, again violating α.
This is a trickier case, and the natural thing to say is that Chooser doesn’t really have the same options in the two cases. Taking the last slice of cake isn’t the same thing as taking one slice when two are available. But this move has costs. In particular, it makes it hard to say that C should be defined for any set of options. It doesn’t clearly make sense to ask Chooser to pick between taking one slice, which is the last, and taking three slices when five are available.
Still, I’m going to set those issues aside and, like most theorists, mostly assume that α is a constraint on coherent choice functions, and that choice functions are defined over arbitrary sets of options.
7 Assumptions
I’ve said a few times I’m assuming this or that, so it’s a good time to put in one place the assumptions I’m making. These aren’t intended to stack the deck in my favour; if any of them are false, I think it makes the view that choices are not binary (a) more plausible, but (b) harder to state. Anyway, here’s what has been assumed.
1. P is transitive, i.e., xPy ∧ yPz → xPz.
2. R is ‘defined’, i.e., xRy ∨ yRx.
3. R is reflexive, i.e., xRx.
4. C is non-empty, i.e., C(S) ≠ ∅.
5. C is defined everywhere, i.e., there is a universe of options U and every non-empty subset of U is in the domain of C.
6. C satisfies α.
7. The universe U of options, of which every S is a subset and from which every option is drawn, is finite.
In Section 6 we saw one reason to reject 5, namely that we might want to individuate options in terms of what else is available. The cases we’ll discuss in Section 16 provide another, but rather than explore that, I’ll follow standard practice and assume 5 throughout this paper.
When R satisfies 1-3, I’ll follow Luce (1956) and call it a semiorder. When it also satisfies trichotomy, i.e., (4), I’ll call it a weak order.
8 Defining binariness
With these seven assumptions on board, it’s easy to state what it is for a choice function to be binary.
First, we’ll define an inversion function B (for binary) that maps preference relations to choice functions, and vice-versa. Both of these are sets of ordered pairs, and we’ll define the ordered pairs directly. I’ll assume that there is a universe U of options, and every option and set of options is drawn from it.
If the input to B is a preference relation R:
- (7) B(R) = {⟨S, x⟩: ∀y(y ∈ S → xRy)}
That is, B(R) is the choice function which, for any set S, selects the members that bear R to everything in S; given definedness, these are what Sen calls the ‘maximal’ members, those to which nothing is strictly preferred.6
6 Hansson (2009) calls this the ‘liberal maximisation’ rule. He contrasts it with five other rules, which are distinct in general but equivalent given R is a semiorder.
If the input to B is a choice function C:
- (8) B(C) = {⟨x, y⟩: x ∈ C({x, y})}
That is, B(C) is the preference relation according to which x is at least as good as y just in case x is choiceworthy from the pair set {x, y}. Sen ([1970] 2017, 319) calls this the ‘base relation’ as opposed to a more complicated ‘revealed preference relation’, and notes that the two are equivalent given α. Since we’re assuming α, we’ll use the simpler version.
A choice function C is binary iff (9) holds:7
7 Sen calls these functions ‘basic binary’, but the distinction he’s drawing attention to by adding ‘basic’ doesn’t make a difference given R is a semi-order and α.
- (9) C = B(B(C))
That is, if you convert C into a preference relation, and back into a choice function, you get the same thing back.
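For concreteness, here is a sketch (again my own, using the same finite representations as above) of the two directions of B and of the binariness test in (9):

```python
from itertools import combinations

def nonempty_subsets(universe):
    items = list(universe)
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

def B_of_relation(R, universe):
    """(7): the choice function selecting, from each S, the x with xRy for every y in S."""
    return {S: frozenset(x for x in S if all((x, y) in R for y in S))
            for S in nonempty_subsets(universe)}

def B_of_choice(C, universe):
    """(8): the base relation, where xRy iff x is choiceworthy from the pair set {x, y}."""
    return {(x, y) for x in universe for y in universe
            if x in C[frozenset({x, y})]}

def is_binary(C, universe):
    """(9): C is binary iff converting it to a relation and back returns C."""
    return B_of_relation(B_of_choice(C, universe), universe) == C

# Toy example: x strictly better than y.
R = {("x", "x"), ("y", "y"), ("x", "y")}
C = B_of_relation(R, {"x", "y"})
print(is_binary(C, {"x", "y"}))   # True for this example
```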
The core claim of this paper is that there are coherent choice functions which are not binary. A related claim is that a plausible pair of coherence constraints that you can state using B do not in fact hold. The constraints are that if C and R are an agent’s choice function and preference relation, then C = B(R), and R = B(C).
9 Property β
If we start with choice functions, the definition of E in (3) is too simple. A better definition is in (10).
- (10) xEy ↔︎ ∀S({x, y} ⊆ S → (x ∈ C(S) ↔︎ y ∈ C(S)))
That is, x and y are equal iff one is never chosen when the other is not.8 Given this notion of equality, there is an intuitive gloss on β: two options are ever both choiceworthy from the same set iff they are equal.9
8 Without α, this is too weak, since it doesn’t entail that x and y are intersubstitutable in general. But we won’t worry about that.
9 This gloss also assumes α.
To see this, think about choice functions that are defined by starting with numerical value functions, e.g., expected utility, and saying that the choiceworthy options are those with maximal value. If x and y are both choiceworthy in any set, they must have the same value. That means in any set where either is choiceworthy, i.e., either has maximal value, they both have maximal value, so both are choiceworthy.
More generally, given the assumptions we’re making, C satisfies β iff B(C) is trichotomous, which is equivalent to B(C) satisfying II-transitivity. Unsurprisingly, the two historically significant cases of intuitive counterexamples to II-transitivity also generate intuitive counterexamples to β.
The first example involves distinct but indistinguishable options.10 Assume that Chooser prefers more sugar in their coffee to less, but can only tell two options apart if they differ by 10 grains of sugar or more. Now consider these three options:
10 The idea that humans can’t distinguish similar options is important in Fechner (1860), a work which is discussed in Beiser (2024). The earliest connection I’ve found between this and indifference being intransitive is in Armstrong (1939). Armstrong’s example is rather confusing; the one I’ll use here is based on Luce (1956).
x = Coffee with 100 grains of sugar.
y = Coffee with 106 grains of sugar.
z = Coffee with 112 grains of sugar.
This is said to be a counterexample to II-transitivity because Chooser is indifferent between x and y, and between y and z, but strictly prefers z to x. It’s also a counterexample to β. Chooser would choose either from x and y, but when z is added, y is still choiceworthy but x is not.
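Just to make the structure explicit, here is the example in code (my sketch, with the 10-grain discrimination threshold from the text):

```python
def prefers(a, b):      # a is strictly preferred to b iff it is noticeably sweeter,
    return a - b >= 10  # i.e. at least 10 grains more sugar

def C(S):               # choiceworthy = nothing available is strictly preferred to it
    return {x for x in S if not any(prefers(y, x) for y in S)}

print(sorted(C({100, 106})))        # [100, 106]: both choiceworthy
print(sorted(C({106, 112})))        # [106, 112]: both choiceworthy
print(sorted(C({100, 106, 112})))   # [106, 112]: x = 100 drops out, so β fails
```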
This example was historically important, but it’s not discussed that much in the contemporary philosophical literature. It could be because philosophers were convinced by the argument in Fara (2001) that phenomenal indistinguishability is in fact transitive. But it did get widely discussed in economics, especially once it started being discussed by R. Duncan Luce (1956, 1959), who used these examples to argue that preferences form a semi-order.
The cases that were more important in the philosophical literature are what Chang (1997) calls ‘small improvement’ cases. The earliest case I know of fitting this form is from Luce and Raiffa (1957).11 (In this, P(x,y) is the probability that Chooser will select x when x and y are both available.)
Suppose that a and b are two alternatives of roughly comparable value to some person, e.g., trips from New York City to Paris and to Rome. Let c be alternative a plus $20 and d be alternative b plus $20. Clearly, in general P(a, c) = 0 and P(b, d) = 0. It also seems perfectly plausible that for some people P(b, c) > 0 and P(a, d) > 0, in which event a and b are not comparable, and so axiom 2 [i.e., (4)] is violated. (Luce and Raiffa 1957, 375)
An example with the same structure, involving a boy, a pony, a bicycle, and a bell, is discussed in Lehrer and Wagner (1985), and mistakenly attributed to Armstrong (1939).12
12 Many authors subsequently made the same attribution; if you want to see some examples, search for the word ‘pony’ among the citations of Armstrong’s paper on Google Scholar.
The usual way these cases are discussed, starting with Luce and Raiffa, is that they violate a certain kind of comparability. For example, Luce and Raiffa say there is a sense in which the two holidays are ‘not comparable’. I want to resist this reading. The core intuition in small improvement cases is that β fails. Chooser would choose either option from {a, b}, but if a+ (i.e., c in the quote, a plus $20) is added as an option, a becomes unchoiceworthy. If we add the assumption that R = B(C), then it does follow that trichotomy fails, and there is a sense in which they are incomparable. But without that assumption, it’s consistent to say that these are counterexamples to β but not to trichotomy. We’ll return to this point in Section 18.
10 Properties γ and δ
Assume R does not satisfy trichotomy, but is a semiorder, and C = B(R). Then β will fail, but γ and δ will hold. Conversely, for any C where γ and δ hold, there is a semiorder R such that C = B(R) (Sen [1970] 2017, 320). We’re not going to be very interested in δ, but we will be very interested in γ.
The reason γ holds when R is a semiorder and C = B(R) is instructive. If x is choiceworthy among S, then nothing in S is better than x. Similarly, if x is choiceworthy among T, then nothing in T is better than x. So nothing in S ∪ T is better than x. So x is choiceworthy among S ∪ T.
Conversely, if there are cases where C should not satisfy γ, then we’ll have an argument that C should not be based in some semiorder R. Showing that there are such cases will be one of the main tasks of the rest of this paper.
We had two kinds of counterexamples to β, but only one of them will be relevant here. I don’t think there are any intuitive counterexamples to γ that start with Fechner-style reflections on the intransitivity of indifference. But there are going to be variations on the bicycle and pony example that generate intuitive counterexamples to γ. We’ll come back to these in Section 19.
It is common to say that when C = B(R) for some semiorder R, that C is rationalizable, and when R is a partial order, that C is rationalizable by a partial order. I find this terminology tendentious - why should semiorders be the only things that can make C rational? And as we’ll see in Section 13, it conflicts with the notion of a choice being rationalizable in game theory. But it’s a common enough terminology that I wanted to mention it here.
11 Aizerman’s Property
The property Aiz is not particularly intuitive. Fortunately, it turns out to be equivalent, given our assumptions, to one that is: Path-Independence.13 That principle says that to find what’s choiceworthy from a union of sets, you only have to consider which options are choiceworthy in the smaller sets.
13 Unless stated otherwise, the results in this section, along with proofs, can be found in the helpful survey by Hervé Moulin (1985).
Note that this isn’t just saying that the options choiceworthy from the union are choiceworthy from one of the members. That is implied by α. What it is saying is that whether the unchosen options from S and T are or aren’t on the menu doesn’t make a difference to which options are choiceworthy from the union.
There is a very natural kind of model where α and Path-Independence hold, but β and γ do not. Let O be a set of total orderings of U. (A total ordering is a transitive relation R such that, for any distinct x and y, exactly one of xRy and yRx holds.) Then C(S) is the set of members of S that are best according to at least one ordering in O. That is, x is choiceworthy in S iff some ordering in O ranks x above every other member of S.
In fact, it turns out the converse of this is also true. If C satisfies α and Aiz, then there is some set of total orderings such that C(S) is the set of members of S that are best according to at least one of those orderings.
It might seem strange after all the talk of weak orderings and semiorders that we’re now using total orders. Given that U is finite, this turns out to be unsurprising. Any semiorder, and hence any weak order, is such that there is some set of total orders such that x is strictly preferred to y in the semiorder iff it is preferred in all the total orders. (In fact we can put a sharp cap on how many such orders there must be.) So being Pareto optimal relative to some set of semiorders (or weak orders) is equivalent to being Pareto optimal relative to a larger set of total orders.
If C is determined by a set of orders in this way, it is said to be pseudorationalizable. These choice functions are not always binary. Consider one simple example, where U is {x, y, z}, C(U) = {x, z}, and for any other S, C(S) = S. That is the choice function determined by the pair of orderings x > y > z and z > y > x. This satisfies α, δ and Aiz, but not β or γ. And it isn’t binary. B(C) is the universal relation R, since whenever S is a pair set, C(S) = S.
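Here is that example worked through in code (my own sketch of the rule just described, on which an option is choiceworthy iff it is top-ranked by at least one of the orderings):

```python
from itertools import combinations

U = ["x", "y", "z"]
orders = [["x", "y", "z"], ["z", "y", "x"]]     # each list ranks options best-first

def C(S):
    """Choiceworthy = best according to at least one of the orderings."""
    return frozenset(min(S, key=order.index) for order in orders)

def nonempty_subsets(items):
    for k in range(1, len(items) + 1):
        for combo in combinations(items, k):
            yield frozenset(combo)

for S in nonempty_subsets(U):
    print(sorted(S), "->", sorted(C(S)))
# Every menu returns all of its members, except ['x', 'y', 'z'] -> ['x', 'z'].
# In particular y is choiceworthy from {x, y} and from {y, z} but not from the
# union, so γ fails; and since every pair menu chooses both members, the base
# relation is universal, and the choice function built from it is not C.
```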
I’m going to argue over subsequent sections that this is a coherent choice function, and hence not all coherent choice functions are binary.
12 Preference and Trade
Let’s go back to why we might have wanted to focus on binary preference relations. One natural reason is that our primary aim in looking at preference is to explain trade, and it is natural to give preferences a central role in explaining trade. If Chooser trades a cow for some magic beans, it’s natural to explain that by saying Chooser preferred the magic beans.
When we are looking to explain trade in a non-monetary economy, that seems like a good reason to give preferences a central role in the story. But very little trade these days involves barter like this. Most trade involves money. Money is primarily valuable instrumentally. If Chooser buys some shoes for $100, we could say that Chooser prefers the shoes to the money. But that doesn’t seem like the end of the story, since there is a reason why Chooser values the money as they do. The deeper point is that the money Chooser has gives them a budget constraint, and Chooser really thinks that the shoes are a better use of the $100 than anything else available. That is, it seems more informative to describe Chooser as choosing the shoes from the menu of things that cost $100 than to describe them as preferring the shoes to the money, since it’s the fact about choice that explains why they have that preference. In general, choiceworthiness seems more relevant to explaining market behaviour in monetary economies than preference, unless choiceworthiness can be defined in terms of preference. So let’s turn to some reasons to think that it cannot.
13 Degenerate Games
Say a two-player game is degenerate iff the payoffs to one of the players are constant in all outcomes of the game. For convenience, say Column is the player with a constant outcome. So Table 1, Table 2 and Table 3 are degenerate games.
Table 1

| | Left | Right |
|---|---|---|
| Up | 10,0 | 0,0 |
| Middle | 1,0 | 1,0 |

Table 2

| | Left | Right |
|---|---|---|
| Middle | 1,0 | 1,0 |
| Down | 0,0 | 10,0 |

Table 3

| | Left | Right |
|---|---|---|
| Up | 10,0 | 0,0 |
| Middle | 1,0 | 1,0 |
| Down | 0,0 | 10,0 |
Start with the following two assumptions, which seem fairly plausible for strategic games like these.
- If a move is part of a Nash equilibrium, it is choiceworthy.
- A move is choiceworthy only if there is some probability distribution over the other player’s moves such that the move maximises expected utility given that distribution.14
14 The notion of a rationalizable choice, in the sense of Bernheim (1984) and Pearce (1984), slightly strengthens this. A choice is rationalizable iff it maximises expected utility relative to a probability assignment that only gives positive probability to the other player, or players, making rationalizable choices. That’s circular as stated, but one can remove the circularity at the cost of making the definition somewhat less intuitive.
In degenerate games, these necessary and sufficient conditions for choiceworthiness coincide, but in general they are rather different.15

15 Since rationalizability is between these notions, it also coincides with them for degenerate games.
In Table 1 and Table 2, both options are choiceworthy by this standard. Middle-Right is a Nash equilibrium in Table 1, and Middle-Left is a Nash equilibrium in Table 2. But in Table 3, the only choiceworthy options are Up and Down. Whatever probability Row assigns to Left/Right, Middle will not maximise expected utility. So this is a counterexample to γ. Middle is choiceworthy from {Up, Middle} and from {Middle, Down}, but not from their union.
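A quick numerical check of that last claim (my own sketch, with Row’s payoffs read off Table 3):

```python
# Row's payoffs in Table 3, as a function of Column's move.
payoffs = {"Up": {"Left": 10, "Right": 0},
           "Middle": {"Left": 1, "Right": 1},
           "Down": {"Left": 0, "Right": 10}}

def expected(move, p_left):
    return p_left * payoffs[move]["Left"] + (1 - p_left) * payoffs[move]["Right"]

# On a grid of credences in Left, Middle never maximises expected utility:
# max(10p, 10(1 - p)) is always at least 5, while Middle returns 1 for sure.
print(any(expected("Middle", p / 100) >= max(expected(m, p / 100) for m in payoffs)
          for p in range(101)))   # False
```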
It follows immediately from Lemma 3 in Pearce (1984) that in degenerate games, a choice satisfies those conditions for being choiceworthy iff it is not strictly dominated, where this includes being dominated by mixed strategies. (In Table 3, Middle is strictly dominated by the 50/50 mixture of Up and Down.) This means that the choices will satisfy Path-Independence. An option is undominated relative to S ∪ T iff it is undominated relative to the undominated options in S ∪ T, and hence iff it is undominated relative to C(S) ∪ C(T). So removing options that are not choiceworthy in S and T from the union doesn’t change what is undominated, i.e., choiceworthy.
This last paragraph is the start of a pattern in the examples that follow. Although I’ll be arguing against γ, I won’t be arguing against Aiz/Path-Independence. I’m not going to offer anything like a conclusive argument for Aiz, but the pattern suggests that it, together with α, is the right fundamental constraint on coherent choiceworthiness.
14 Choice Under Uncertainty
Luce and Raiffa (1957) discuss what they call choices under ‘uncertainty’, by which they mean choices where Chooser cannot assign probabilities to the states. Peterson (2017) calls these choices under ‘ignorance’. None of the proposed decisive rules for choice under uncertainty/ignorance are particularly compelling; all lead to very strange outcomes.
The best approach, in my opinion, is to treat these choices like degenerate games. Indeed, degenerate games are really a paradigm of choice under ignorance; Row has no reason to assign any particular probability to Column’s choice. Further, what the game theory textbooks say about degenerate games seems fairly plausible; any undominated option is choiceworthy. The same goes for choices under ignorance; any undominated option is choiceworthy.
If that’s right, then the three examples from Section 13 can be repurposed as examples of choice under uncertainty, replacing Left and Right with p and ¬p, and the same analysis will hold. Again γ will fail, because Middle is choiceworthy when there is one other option, but not when there are two. So there’s no binary comparison of Middle with the other two options that explains the facts about what is choiceworthy in the three cases.
15 Multiple Equilibria
This is a decision theory paper, so we need to introduce a demon who can reliably predict Chooser’s choices. We’ll start with a version of what Skyrms (1982) calls ‘Nice Demon’. In Table 4, Chooser selects Up or Down, and Demon either predicts Up (PU), or predicts Down (PD). Whatever Chooser does, Demon is very likely to have predicted correctly.
Table 4

| | PU | PD |
|---|---|---|
| Up | 6 | 0 |
| Down | 0 | 4 |
Jack Spencer (2023) argues against views, like that defended by Dmitri Gallow (2020), which say only Up is choiceworthy in Table 4. His argument relies on a simple principle. If Chooser plans to play Down, then Chooser knows Down will have the best return, and it’s not irrational to make the choice one knows will have the best return. Gallow (2024) argues that knowledge might not be a strong enough state to make this principle work, because of high stakes gambles. I think knowledge should play the role in practical reasoning Spencer assigns it (Weatherson 2024), so I’ll assume Spencer is right here, and both Up and Down are choiceworthy in Table 4. Indeed, in any problem with a nice demon, any option that would be best were it chosen is choiceworthy, for just the reason Spencer gives.
Now add a third option, Exit, which has a guaranteed return of 1. So the game table looks like Table 5.
Table 5

| | PU | PD |
|---|---|---|
| Up | 6 | 0 |
| Exit | 1 | 1 |
| Down | 0 | 4 |
In Table 5, Exit is not choiceworthy. Whatever credences Chooser has about what Demon has done, it is better in expectation to choose one of Up or Down. But if either Up or Down were unavailable, Exit would be choiceworthy. So again this case is a counterexample to γ.
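Here is the arithmetic behind the claim that Exit is not choiceworthy (a sketch of my own, with p for Chooser’s credence that Demon predicted Up):

```python
def expected(option, p):   # p = credence that Demon predicted Up (PU)
    return {"Up": 6 * p, "Exit": 1, "Down": 4 * (1 - p)}[option]

# At every credence, one of Up and Down beats Exit in expectation:
# 6p > 1 once p > 1/6, and 4(1 - p) > 1 whenever p < 3/4, and those ranges overlap.
print(all(max(expected("Up", p / 100), expected("Down", p / 100))
          > expected("Exit", p / 100) for p in range(101)))   # True
```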
There is an important general lesson from this case. What makes an option choiceworthy in cases like this is that it is utility maximising once it is chosen. We’ll turn next to a more dramatic illustration of this point.
16 Mixed Strategies
Demon has stopped being nice, and now wants to play Rock-Paper-Scissors with Chooser. Given Demon’s powers, this could go badly for Chooser. Happily, Chooser can choose to randomise16, and all Demon can predict is the probability that Chooser’s random process will come up with any option. So the best thing for Chooser to do is to pick one of the three options at random, each with equal probability. That alone will maximise expected utility conditional on being chosen.
16 I’m not going to argue for this here, but I think it is part of being ideally practically rational that one is able to randomise, just like it is part of being ideally practically rational that one can make calculations costlessly.
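To illustrate, on stylised assumptions of my own (win = 1, tie = 0, loss = −1, and Demon learns the probabilities and plays a best response to them), the uniform mixture is the unique mix that doesn’t lose in expectation:

```python
from fractions import Fraction

MOVES = ("rock", "paper", "scissors")
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def payoff(mine, theirs):
    if mine == theirs:
        return 0
    return 1 if BEATS[mine] == theirs else -1

def value_against_demon(mix):
    """Chooser's expected payoff if Demon sees the mix and plays a best response."""
    return min(sum(p * payoff(m, d) for m, p in zip(MOVES, mix)) for d in MOVES)

# Search every mix with probabilities in thirtieths.
best = max((value_against_demon((Fraction(i, 30), Fraction(j, 30),
                                 Fraction(30 - i - j, 30))), (i, j, 30 - i - j))
           for i in range(31) for j in range(31 - i))
print(best)   # (Fraction(0, 1), (10, 10, 10)): only the uniform mix breaks even
```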
Not coincidentally, the only Nash equilibrium of Rock-Paper-Scissors is for each player to randomise. Nash equilibria are arguably the only sensible strategies if one assumes that every other player has Demon-like abilities to detect what one is doing. But it’s a long running puzzle in game theory how it can be uniquely rational to randomise. Why can choosing randomly be better than choosing one of the things one is randomising between? To turn this rhetorical question into an argument, note that the following three principles are inconsistent.
- (11) Randomising is the only choiceworthy strategy in Rock-Paper-Scissors.
- (12) If only one choice is choiceworthy, it is rationally preferred to all other choices.
- (13) It is irrational to prefer a random mixture of some choices to every one of the choices.
Since (11) is a standard view in game theory, (12) is a standard view in choice theory, and (13) is a standard view in preference-based decision theory, it is a little disconcerting to see they are inconsistent.
The example in Section 15 shows how to steer through this trilemma. Choiceworthiness is fundamentally an ex ante notion, and preference is fundamentally an ex post notion. The reason Spencer’s view about Table 4 is right is not that the rational chooser is indifferent between Up and Down. It’s that they don’t have preferences between them until they have chosen, and once they choose, they prefer the choice.
Similarly in Rock-Paper-Scissors (especially against a Demon), what’s true is that prior to deciding, the only rational choice is to randomise. Before the choice, there is no such thing as Chooser’s preferences over the options; what I earlier called definedness fails. Once one has chosen, one should be strictly indifferent between the options; they each have the same expected utility. In particular, one should not prefer to re-randomise rather than put into effect the result of the random process.
So we should reject (12) in its most natural interpretation. Randomising in Rock-Paper-Scissors is the only choiceworthy option, but until a choice is made, Chooser simply shouldn’t have preferences over these options.
Most of the arguments in this paper against the binariness of choice turn on counterexamples to γ, but this is a distinct argument. Sometimes, as in Rock-Paper-Scissors, there are grounds for rational choice, but no grounds for rational preference. The only preferences that would ground the choice would violate (13). So rational choiceworthiness cannot be grounded in rational preference.
This is the deepest reason why C = B(R) must be wrong; C and R are fundamentally about different kinds of attitudes. C is about what is rational prior to making a choice, R is a constraint that must be satisfied once a choice is made. Outside of Newcomb-like cases, this distinction won’t often matter, but it is another reason that the equation fails.
17 Multiple Attributes and Decisiveness
Sartre (1946/2007) has a famous example of a young man, whom we’ll call Pierre, caught between two imperatives. The actual example Sartre gives is complicated in interesting ways, but we’ll work with a very simple version of the story. Pierre lives in occupied France during WWII, and feels torn between his duty to care for his ailing mother, and his duty to fight for his occupied country. What should he do?
The case is horribly underdescribed, but the following verdicts have seemed plausible to many people. First, Pierre can rationally, and morally, choose what to do here. These are both noble impulses, and it’s fine to follow either. Second, and this follows from the first and the fact that the case is underdescribed, the options are not equally good. After all, a small improvement to either would not break a tie between them.
A third intuition is more contentious, but perhaps plausible: there is something wrong about Pierre going back and forth between the two choices; he should make a choice and stick to it. On this view, there is something intrinsically good about settling on a choice and sticking to it. What makes this intuition less than fully clear is that in any practical version of the case, it will be very bad for Pierre to oscillate between the two options. He could spend the whole war travelling between England and France as he changes his mind on where he should be, and that would be bad. The intuition I have in mind is that there is something good about taking a stance and committing to it, even outside of the practical costs of changing one’s mind.17
17 As Moss (2015) points out, it is less clear this is intrinsically bad if there is more time between the reconsiderations; it makes more sense to change one’s mind than to rapidly flip-flop. Conversely, if Pierre resembles the young Thomas Schelling (as discussed by Holton (1999)), firmly committing to one plan and then another over the course of successive nights, he’s doing something wrong even if it has no practical downsides.
A very simple model for these intuitions is that Pierre’s situation is surprisingly like that of the person playing Table 4. There are two options here, and either is acceptable, but once one is chosen, it becomes the preferred one. There are two good values here, caring for family and caring for country, and Pierre’s fundamental choice is to adopt one of these as his value. As Chang (2024) puts it, he chooses to put ‘his whole self’ behind one of those values. Once that choice is made, his preferences and his actions follow naturally.
It is easy enough to reject the third intuition. Perhaps Pierre could rationally, as Chang (2024) puts it, drift into one choice. Perhaps he could adopt one path and sometime later rationally regret his choice, because the other value strikes his later self as more important. Still, if one holds all three intuitions, toy models like Table 4 capture a surprising amount of what’s going on with Pierre. And those models are incompatible with choiceworthiness being binary.
18 β and Incompleteness
The Pierre case in Section 17 is, even without the third intuition, a straightforward counterexample to β. Just to spell it out,
x = Help mother
y = Fight Nazis
z = Help mother plus one extra ration book
If the choices are x and y, either is acceptable. If the choices are x, y and z, x alone is unchoiceworthy. So β fails.
This is the Small Improvement argument, and it is often thought to be an argument against completeness, i.e., (4). The point of this section is to say why that argument might fail, even if the argument against β works.
Imagine someone was convinced by the arguments in Dorr, Nebel, and Zuehl (2023) that (4) must be true. Their arguments turn on semantic properties of comparatives; since R is a comparative, they say it must be complete since all comparatives are complete.18 Now it would be very strange if the semantics of comparatives in natural language entailed that these intuitions about β had to be false. And in fact these claims about semantics do not entail that.19
18 For what it’s worth, I think this argument fails because of the case of ‘stronger’ in logic. They address this case, but I don’t think their response works. But it would be a huge digression to follow that thread through.
The following view seems to me to be coherent, and if so it shows why Dorr et al’s view of comparatives does not support β.
- β fails in cases like Pierre’s.
- An option is choiceworthy for Pierre iff no option is determinately preferred to it.
- To make a choice, Pierre must determine which value is really his.
- Once he does that, his preferences will satisfy (4).
- Before he does that there are two possibilities. One is that his preferences aren’t even defined over these options, so asking which is preferred is like asking whether the number 7 is taller, shorter, or the same height as justice. Another is that it is vague what Pierre’s preferences are, but any resolution of the vagueness makes (4) true. This latter option fits nicely with the idea that C should satisfy Aiz, since it should be determined by a set of orderings, each of them a possible precisification, or determination, of his current state.20
20 This view of preference mirrors the view of credence defended by Carr (2020).
One objection to this view is that it seems to imply that Pierre could rationally choose one option while he prefers, but does not determinately prefer, another. But this isn’t what the view implies. Once Pierre makes the choice, he must, if he is rational, determine that his preferences match it. Preference, as I argued in Section 16, is fundamentally an ex post notion. That is, (14) is true, while (15) is false.
- (14) Once Pierre chooses an option, there must not be some other option he strictly prefers to it.
- (15) For an option to be choiceworthy for Pierre, it must already, before the choice, be at least as good (given his preferences) as any other option.
What (14) says is that if Pierre is rational, he will not, having chosen, strictly prefer an option he passed up. That does not mean, as (15) claims, that rational choices are those that are preferred prior to the choice. Choice, on this picture, is prior to preference, both analytically and, in this case, causally.
When I say this is coherent, I don’t mean that as a half-hearted way of saying it is correct. My preferred view is that Pierre could rationally drift (in Chang’s sense) into either option, and if he does, (4) would fail even ex post. All I mean to argue here is that the case against β doesn’t require rejecting trichotomy; one can coherently reject β while endorsing it.
19 Bad Compromises
My version of the Pierre example was very simple, but it allows for some interesting complications. As stated, you might think Pierre isn’t thinking through his choices well enough. He should join the local resistance, so he can stay close enough to his mother to help, while also fighting the Nazis.
But maybe that’s a terrible option. We can easily imagine that the resistance is either so useless that it does practically nothing, or so good at recruiting that it has little useful work. It’s just as easy to imagine that it creates busywork that dramatically reduces how much he can care for his mother, while not doing much to help the war effort. At risk of trivialising the issue, we can imagine that Pierre’s options look like this, where the two columns represent how much each option respects/promotes the relevant value.
Table 6

| | Care for mother | Fight Nazis |
|---|---|---|
| Stay Home | 10 | 0 |
| Join resistance | 1 | 1 |
| Join Free French | 0 | 10 |
Intuitively, if these are Pierre’s options he should not join the resistance; it’s almost the worst of both worlds. We don’t want to rest on an intuition, and I’ll argue for it in the next two sections. But if it’s right, the case is another counterexample to γ. Consider these two variants on the case.
No exit: While Pierre is deliberating, he hears that the options for getting to the Free French have been decisively cut off. (In the original, he worries this might happen.) Now his only two options are to stay home, or to join the resistance.
Promise: Pierre’s brother Jean is fighting the Nazis. Pierre has promised Jean that if Jean is killed, Pierre will take up the fight in some way. Sadly, Jean is killed, and Pierre regards this promise as binding. Now his only options are to join the resistance, or join the Free French.
In either case, the resistance seems choiceworthy. In both cases, it is the option that does best of the remaining choices on one of the criteria. Pierre could decide he endorses that criterion as his own, and act accordingly. So the resistance is choiceworthy amongst either pair of options, but not amongst their union.
20 Levi and Sen
In Hard Choices, Isaac Levi (1986) defended a view where the choiceworthy options are only those that maximise value on some resolution of the incompleteness in the agent’s values. Levi also had views about what further constraints there should be on choice, so he did not defend the view I’ve been discussing where any option that is maximal on any resolution is choiceworthy. But his views are still relevant here, because this requirement that a choice be maximal on a resolution meant that he was committed to γ failing in some cases, and to choices not being binary.
A central example which he uses, and which Sen (2004) picks up on, involves an executive looking to hire a secretary. I’ll follow Sen and call the executive Ms. Jones.21 She is looking for a secretary with good typing skills and good stenography skills. (This is the 1980s.) We’ll conceive of these skills, a bit arbitrarily, as distinct values. There are three candidates: Jack, Danny, and Luke, and their value on each measure is given in Table 7. (I’ve slightly adjusted the numbers to match the earlier examples.)
21 I’ve also changed the secretaries’ names.
Table 7

| | Typing Skill | Stenography Skill |
|---|---|---|
| Jack | 10 | 0 |
| Danny | 1 | 1 |
| Luke | 0 | 10 |
Levi argues that if the numbers are like this, so that Danny is only barely better than each of the other two on the dimension where he has the edge, he is not choiceworthy. That’s true even though he might be choiceworthy if one or the other of the candidates were unavailable.
Sen argues that the right choice theory in this case should be “inarticulate”, and say that any of the three is choiceworthy. He responds to Levi’s intuition with a dilemma.
On the first horn, we understand these numbers as representing something like an objective measure of the skills of the candidates at each of the tasks. As Sen points out, it’s easy to imagine situations where someone who is not abysmal at either half of the job is more useful than someone who is an expert on one, and abysmal on the other.
On the other horn, we understand the numbers as measuring the “importance” (Sen 2004, 53) of each skill for the task at hand. Sen argues, by analogy with the difficulty in establishing a social welfare function out of the welfare of each individual, that there will be no way to do this. Let’s turn to how we might go about it.
21 Lotteries, Choices, and Values
In this section I’ll offer a response, on Levi’s behalf, to Sen’s dilemma. The example will draw heavily on recent work by Harvey Lederman (forthcoming).22 We’ll start by imagining that Ms. Jones might have a choice not of secretaries, but of agencies, and she has a reasonable credal distribution over the skills of the people a particular agency might send. That is, each choice of agency will be a choice of a lottery, where she doesn’t choose a package of skills, but a probability distribution over some outcomes, where each outcome is a secretary with a numerical skill on each attribute.
22 See also Lederman (2023), and Tarsney, Lederman, and Spears (forthcoming). This is not to say Lederman would endorse anything like this response; as we’ll see in the next section he is sympathetic to Sen’s view. But every principle I’ll use here is discussed, one way or another, in his work, and I’ve drawn heavily on that discussion in what follows.
To set this up, I’ll need three new bits of notation. I’ll write Lxy for a lottery that has equal chance of returning outcomes x and y, where these might be secretaries or further lotteries. I’ll write x = y for what I’ve previously written as xEy. It means that x and y are equally good by Ms. Jones’s lights. It’s perhaps suboptimal to introduce new notation for an old concept, but the mix of L’s and E’s became hard to read. Finally, I’ll write ⟨x, y⟩ for a secretary with skill x at typing and skill y at stenography.
If the numbers in Table 7 measure importance, then Ms. Jones’s preferences should satisfy (a special case of) what Lederman calls Unidimensional Expectations.
- Unidimensional Expectations (UE)
- L⟨x1, y⟩⟨x2, y⟩ = ⟨(x1 + x2)/2, y⟩
- L⟨x, y1⟩⟨x, y2⟩ = ⟨x, (y1 + y2)/2⟩
That is, a lottery whose possible outcomes agree on one dimension has that shared value on that dimension, and the expected value of its outcomes on the other dimension. If the lottery does not involve value conflict, old fashioned expected value maximisation is the way to go.
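One way to get a feel for UE (a sketch of my own, which assumes that on any resolution a 50/50 lottery is worth the average of its prizes’ values on that resolution): even the multiplicative valuation v(⟨x, y⟩) = xy satisfies both clauses, a point that will matter in a moment.

```python
import random

def v(x, y):                       # a candidate resolution of Ms. Jones's values
    return x * y

def lottery_value(a, b):           # assumption: a 50/50 lottery is worth the
    return 0.5 * (v(*a) + v(*b))   # average of its prizes' values

# Spot-check both clauses of UE on random one-dimensional lotteries.
random.seed(0)
print(all(
    abs(lottery_value((x1, y), (x2, y)) - v((x1 + x2) / 2, y)) < 1e-9 and
    abs(lottery_value((x, y1), (x, y2)) - v(x, (y1 + y2) / 2)) < 1e-9
    for _ in range(1000)
    for x1, x2, y, x, y1, y2 in [[random.uniform(0, 10) for _ in range(6)]]))  # True

print(v(1, 1), v(10, 0), v(0, 10))   # 1 0 0: on this resolution Danny beats Jack and Luke
```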
This is enough to rule out the example Sen has in mind on the first horn, where a secretary with skill 1 on either dimension is more than half as valuable as a secretary with skill 10. But it’s not enough to say that Ms. Jones should not hire Danny. Unidimensional Expectations is consistent with one resolution of the indeterminacy being that the value of ⟨x, y⟩ is xy. If that’s a permissible resolution, then there will be a resolution on which Danny is maximally valuable. But there are further principles that do rule out Danny. The most intuitive argument I know uses the following.
- Substitution of Identicals (SI)
- If x = y, then Lxz = Lyz and Lzx = Lzy.
- No Trade-Off (NT)
- L⟨x, x⟩⟨y, y⟩ = ⟨(x + y)/2, (x + y)/2⟩
- Rearrangement of Outcomes (RO)
- L(Lxy)(Lvw) = L(Lxw)(Lvy)
- Weak Independence (WI)
- If x = Lxy then x = y
Substitution of Identicals follows naturally from the idea that the outcomes are truly equal, so it doesn’t matter which of them a lottery has as a possible outcome. No Trade-Off, like Unidimensional Expectations, says that when we are just considering strictly better and worse options, so there are no relevant complications about resolving indeterminacy, we’re back in the land of expected utility maximisation. Rearrangement of Outcomes follows from the fact that the compound lotteries on either side of the identity sign each give probability 1/4 to each of those four outcomes. And Weak Independence, which is probably the most contentious of the lot, says that if y is not exactly as good as x, then an even chance of x or y should not be exactly as good as x. Given these principles, we can argue as follows.
| Step | Claim | Justification |
|---|---|---|
| 1. | ⟨5, 5⟩ = L⟨10, 5⟩⟨0, 5⟩ | UE |
| 2. | ⟨10, 5⟩ = L⟨10, 10⟩⟨10, 0⟩ | UE |
| 3. | ⟨0, 5⟩ = L⟨0, 10⟩⟨0, 0⟩ | UE |
| 4. | ⟨5, 5⟩ = L(L⟨10, 10⟩⟨10, 0⟩)⟨0, 5⟩ | 1, 2 SI |
| 5. | ⟨5, 5⟩ = L(L⟨10, 10⟩⟨10, 0⟩)(L⟨0, 10⟩⟨0, 0⟩) | 3, 4 SI |
| 6. | ⟨5, 5⟩ = L(L⟨10, 10⟩⟨0, 0⟩)(L⟨0, 10⟩⟨10, 0⟩) | 5 RO |
| 7. | ⟨5, 5⟩ = L⟨10, 10⟩⟨0, 0⟩ | NT |
| 8. | ⟨5, 5⟩ = L⟨5, 5⟩(L⟨0, 10⟩⟨10, 0⟩) | 6, 7 SI |
| 9. | ⟨5, 5⟩ = L⟨0, 10⟩⟨10, 0⟩ | 8 WI |
To complete the argument we just need the plausible principles that (a) x is not choiceworthy from S when x is not choiceworthy from the set consisting of x and Lyz, for y, z in S, and (b) x is not choiceworthy from S when an option that is strictly better on every dimension is in S.
There are several assumptions here, and any one of them could be the subject of a whole paper. But they each seem very plausible, and they show how we can ground the intuition that Levi started with in a sensible theory of preferences over lotteries involving payoffs that differ in multiple dimensions.
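As a purely arithmetical sanity check on the derivation (my own sketch; it checks the book-keeping, not the principles themselves): in a toy model where every outcome or lottery is evaluated by its coordinatewise expectation, UE, SI, NT, RO and WI all hold, and each of the nine steps comes out as an identity.

```python
def ev(obj):
    """Coordinatewise expectation: outcomes are (x, y) pairs, lotteries are
    ('L', a, b) with a and b equiprobable."""
    if obj[0] == "L":
        _, a, b = obj
        va, vb = ev(a), ev(b)
        return ((va[0] + vb[0]) / 2, (va[1] + vb[1]) / 2)
    return obj

def L(a, b):
    return ("L", a, b)

steps = [
    ((5, 5), L((10, 5), (0, 5))),                            # 1: UE
    ((10, 5), L((10, 10), (10, 0))),                         # 2: UE
    ((0, 5), L((0, 10), (0, 0))),                            # 3: UE
    ((5, 5), L(L((10, 10), (10, 0)), (0, 5))),               # 4: SI on 1, 2
    ((5, 5), L(L((10, 10), (10, 0)), L((0, 10), (0, 0)))),   # 5: SI on 3, 4
    ((5, 5), L(L((10, 10), (0, 0)), L((0, 10), (10, 0)))),   # 6: RO on 5
    ((5, 5), L((10, 10), (0, 0))),                           # 7: NT
    ((5, 5), L((5, 5), L((0, 10), (10, 0)))),                # 8: SI on 6, 7
    ((5, 5), L((0, 10), (10, 0))),                           # 9: WI on 8
]
print(all(ev(lhs) == ev(rhs) for lhs, rhs in steps))   # True
```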
22 Negative Dominance
Harvey Lederman (forthcoming) notes that the picture I’ve developed, where Danny is unchoiceworthy, has the following strange consequence: it violates what he calls Negative Dominance. Lederman gives a few versions of this principle; the following is the version most relevant to the case.23
23 In all the quotes I’ll change the names and example to match the one I’m using.
- Negative Dominance (Goodness)
- If one game of chance is better for [Ms. Jones] than another, some prize in the first game is better for her than some prize in the second. (Lederman forthcoming, 13)
The main application of this is to reject the idea that L(Jack)(Luke) is better than Danny. We’ll come back to the fact that the second option here is an outcome as well as a gamble.
This seems like a very plausible principle if (and I think only if) one thinks that the role of decision theory is to come up with coherence constraints on preferences. That this is decision theory’s role follows naturally from the idea that all the notions of decision theory are ultimately grounded in preferences. As Lederman argues, preferences about lotteries have to be grounded in something about their prizes, and if preferences are fundamental, presumably they have to be grounded in preferences over their prizes. My key response is that if choiceworthiness is prior to preference, this last step doesn’t follow.
So when Lederman says,
a strict preference for one game of chance over another must be explained by a strict preference for one of the prizes of the first, by comparison to one of the prizes of the second (Lederman forthcoming, 18, emphasis in original)
we should question the uses of ‘one’. What’s true is that attitudes towards games of chance should be somehow explained by attitudes towards the prizes, but these attitudes need not be preferences. For instance, in the secretaries example, the preference for the lottery over Danny is sufficiently explained by the fact that Danny is not choiceworthy when all three secretaries are available. More generally, we should accept this principle:
- Negative Dominance for Choices
- If, when one game of chance and another option are both available, the game is choiceworthy and the option is not, then that option is not choiceworthy when it and all the prizes of that game are available.
Given α and γ, this will entail Lederman’s version of the principle, at least as restricted to comparisons between lotteries and outcomes. But without them, it does not.
That said, if we adopt a Levi-style view, we cannot generalise Negative Dominance for Choices to a principle that explains comparisons between lotteries. Here’s an example, structurally parallel to the main example in Tarsney, Lederman, and Spears (forthcoming), which shows this. Ms. Jones is now trying to hire a programmer, and she has four candidates, each of whom has the skills shown in Table 8 in the four languages she cares about.
| | Python | Java | C | Ruby |
|---|---|---|---|---|
| Jane | 6 | 6 | 0 | 0 |
| Dolly | 0 | 0 | 6 | 6 |
| Lily | 5 | 0 | 5 | 0 |
| Suzy | 0 | 5 | 0 | 5 |
When the menu consists of any set of the programmers, all the options are choiceworthy. But Ms. Jones strictly prefers L(Jane)(Dolly) to L(Lily)(Suzy), since the former lottery is better in expectation on all four dimensions. Lederman is right that this needs to be explained, that it should be explained in terms of evaluative features of the prizes (i.e., the programmers), and if the explanation uses expected value, we should explain why expected value matters. No explanation in terms of the choiceworthiness of some options will work, and hence no explanation in terms of the choiceworthiness of options from pair sets (i.e., preferences) will work.
The fact to be explained is that when L(Jane)(Dolly) and L(Lily)(Suzy) are the available options, only the former is choiceworthy. Here’s how I explain it:
1. Ms. Jones has four values, and it is indeterminate how they should be balanced. This means both that she hasn’t decided how to balance them and that it may be unnecessary, or even inadvisable, for her to balance them.
2. Given Unidimensional Expectations and No Trade-Off (as discussed in Section 21), the only permissible trade-offs are linear mixtures of the values.
3. Given the result from Pearce (1984) discussed in Section 13 (his Lemma 3), a lottery is best on no linear resolution of the indecision in point 1 iff some available lottery over the other options is better in expectation on every value.
4. A lottery is choiceworthy from a menu of other lotteries (or options) iff it is optimal on some permissible resolution of this indecision.
If L(Lily)(Suzy) were choiceworthy, by 4 it would have to be best on some permissible resolution of Ms. Jones’s values. By 1 and 2, this means it would have to be best on some linear mixture of those values. By 3, that requires that no available lottery is better than it in expectation on every value. But one is: on all four dimensions L(Jane)(Dolly) has expected value 3, while L(Lily)(Suzy) has expected value 2.5. Even though Ms. Jones has not resolved the indeterminacy in her values, the fact that any resolution would lead her to prefer the first lottery is enough reason to prefer the first lottery.
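To make the arithmetic concrete, here is a minimal sketch (my own illustration; the data layout and function names are not from the paper). It computes the dimension-wise expected skills of the two lotteries, confirms that L(Jane)(Dolly) is better in expectation on every dimension, and spot-checks that it therefore beats L(Lily)(Suzy) on any strictly positive linear weighting of the four values, which is the shape of reasoning that Pearce’s lemma makes fully general.

```python
import random

# Table 8: each candidate's skills in Python, Java, C, Ruby.
skills = {
    "Jane":  (6, 6, 0, 0),
    "Dolly": (0, 0, 6, 6),
    "Lily":  (5, 0, 5, 0),
    "Suzy":  (0, 5, 0, 5),
}

def expected(a, b):
    """Dimension-wise expected skills of a 50/50 lottery over candidates a and b."""
    return tuple((x + y) / 2 for x, y in zip(skills[a], skills[b]))

jane_dolly = expected("Jane", "Dolly")   # (3.0, 3.0, 3.0, 3.0)
lily_suzy = expected("Lily", "Suzy")     # (2.5, 2.5, 2.5, 2.5)

# L(Jane)(Dolly) is better in expectation on every dimension ...
assert all(x > y for x, y in zip(jane_dolly, lily_suzy))

# ... so on any strictly positive weighting of the four values (a linear
# resolution of the indeterminacy), its weighted expected value is higher.
for _ in range(1000):
    w = [random.uniform(0.1, 1.0) for _ in range(4)]
    assert sum(wi * x for wi, x in zip(w, jane_dolly)) > sum(
        wi * y for wi, y in zip(w, lily_suzy)
    )
```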
In short, the focus on expected values comes not from any importance attached to expectations as such, but from the thought that permissible reactions to indeterminacy in values are constrained by permissible reactions to resolutions of that indeterminacy, combined with (a) constraints on resolutions like Unidimensional Expectations and No Trade-Off, and (b) Pearce’s result linking expected value to linear mixtures of values.
23 Conclusion
This paper has been ultimately about the grounding of facts about rational choice. I’ve been mostly concerned to argue against a popular, if largely implicit, view: rational choice is grounded in rational preference. If Chooser wants a holiday, and is choosing where to go, which destinations are rationally choiceworthy is grounded in Chooser’s (rational) preferences over pairs of choices. I’ve rejected this for three reasons:
- As argued in Section 15 and Section 16, choiceworthiness is an ex ante concept, and preference is an ex post concept, and hence choiceworthiness is analytically prior to preference, so should not be grounded in preference.
- When there are no Newcomb-like considerations, and so the ex ante/ex post distinction doesn’t matter, preferences are just choiceworthiness judgments with respect to pair sets, and pair sets aren’t normatively distinctive.
- Any choice function that violates γ cannot be generated from a preference relation, and there are many reasons for endorsing choice functions which violate γ.
If rational choiceworthiness is not grounded in preferences, what is it grounded in? There are two natural options here.
A subjectivist theory says, as flagged in Section 3, that norms on choiceworthiness are just coherence norms. In particular, the key norms are α and Aiz. What makes this choiceworthiness judgment rational just is that it fits well with the other choiceworthiness judgments. There are tricky metaphysical questions about just how instances of coherence norms are grounded, and about whether answering them will mean giving up widely accepted principles, such as the principle that grounding is acyclic. But these questions aren’t different in kind from ones we’d face on the more mainstream view that decision theory is largely about preferences, and norms on preference are coherence norms.
A more objectivist view is also possible, and I’ll end by just stating it. The world is full of values, many of them. All of these values are orderings (or perhaps semi-orderings) of outcomes. An option is rationally choiceworthy iff it does best, given Chooser’s evidence, on some permissible mixture of some of these values. If all goes well, the values can be measured numerically, and the permissible mixtures are linear mixtures, but as noted in Section 2, we need some good arguments for why they should be numerical. If they are numerical, the argument in Section 21 should generalise to show that the permissible mixtures are linear mixtures. Whether those last two sentences work or not, the picture is that the rationality of a choice is grounded in something external to the agent, i.e., values in the world.
I’m going to leave aside arguments about which of these views is correct, and about whether some position between them might be preferable. All I hoped to do in this section was to sketch ways in which the core thesis of the paper, that the rationality of choices is prior to the rationality of preferences, could be true.