Chapter 5 Against Rational Possibility

5.1 Four Puzzle Cases

5.1.1 Outguess the Demon

Table 5.1 is a simple case that can be used to show that causal ratificationism implies the existence of dilemmas, cases where there is no action that is ideally rational.

Table 5.1: A simple example of a dilemma.

       PU   PD
  U     0    2
  D     1    0
Assume first, for simplicity, that Chooser can’t play any mixed strategy. Chooser simply has to play U or D, and is arbitrarily confident that Demon will guess their choice correctly. Then Chooser will regret their choice whatever they do. But ideally rational choices are not regretted as soon as they are chosen. So no choice is ideally rational.

The case I’ve used here is a somewhat asymmetric version of the Death in Damascus case from Gibbard and Harper (1978). A slightly more common asymmetric version, as in Richter (1984), would have something like the following table, where there is a benefit to U over D whether Demon is correct or incorrect.

Table 5.2: A version of asymmetric Death in Damascus.
As far as I can see, the two cases raise the same philosophical issues, and table 5.1 is a little simpler, so I’ll focus on it.

There is, I gather, a widespread intuition amongst philosophers that decision theory should treat U and D asymmetrically in this case. I don’t really share that intuition; they both look like unhappy choices to me. And it isn’t shared by orthodox game theory. When table 5.1 is turned into a game, it becomes table 5.3.

Table 5.3: The game version of the simple dilemma.

       PU         PD
  U    \(0, 1\)   \(2, 0\)
  D    \(1, 0\)   \(0, 1\)

That game has a unique Nash equilibrium. In equilibrium, Row, or Chooser, plays a 50/50 mix of U and D. The asymmetry in the payouts is reflected in an asymmetry in Column, or Demon’s, equilibrium strategy. Their equilibrium strategy is to play \(PU\) with probability \(\frac{2}{3}\), and \(PD\) with probability \(\frac{1}{3}\). And that seems intuitively right to me; Chooser should treat the options here symmetrically, and if anyone should treat them asymmetrically, it is Demon.
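The equilibrium claim is easy to verify numerically. Here is a minimal Python sketch (the layout and function names are mine, purely for illustration) checking that when Chooser plays U with probability 1/2 and Demon plays \(PU\) with probability \(\frac{2}{3}\), each player is indifferent between their pure options, which is what a mixed equilibrium requires:

```python
# Payoffs from table 5.3, as (Chooser, Demon) pairs.
payoffs = {
    ("U", "PU"): (0, 1), ("U", "PD"): (2, 0),
    ("D", "PU"): (1, 0), ("D", "PD"): (0, 1),
}

def chooser_ev(row, q_pu):
    """Chooser's expected payoff from a pure row if Demon plays PU with probability q_pu."""
    return q_pu * payoffs[(row, "PU")][0] + (1 - q_pu) * payoffs[(row, "PD")][0]

def demon_ev(col, p_u):
    """Demon's expected payoff from a pure column if Chooser plays U with probability p_u."""
    return p_u * payoffs[("U", col)][1] + (1 - p_u) * payoffs[("D", col)][1]

# At the claimed equilibrium, each player is indifferent between their pure options.
assert abs(chooser_ev("U", 2/3) - chooser_ev("D", 2/3)) < 1e-9   # both 2/3
assert abs(demon_ev("PU", 1/2) - demon_ev("PD", 1/2)) < 1e-9     # both 1/2
```

Since each side is indifferent given the other's mixture, neither can profit by deviating, so the two mixtures are mutual best replies.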

Thinking about the equilibrium shows us how to drop the assumption that Chooser can’t play mixed strategies, while keeping the conclusion that there are dilemmas. In table 5.1 the standard setup is that Demon is an arbitrarily good predictor who is an expected utility maximiser. That settles what Demon will do in all but one case: when Chooser plays a 50/50 mix of U and D. Unfortunately, that’s exactly what Chooser will do in equilibrium. So it’s an important oversight. To fill it in, let’s stipulate that Demon will also play a 50/50 mix of \(PU\) and \(PD\) if the two predictions have the same expected utility. That’s enough to reinstate the dilemma.

Every strategy for Chooser is a mixture7 where U is played with probability \(p\), and D is played with probability \(1-p\). If \(p < 0.5\), then Chooser thinks Demon will play \(PD\), so they should definitely play U, so they regret any strategy other than U for sure. And that means they regret playing this mixture where \(p < 0.5\). If \(p > 0.5\), then Chooser thinks Demon will play \(PU\), so they should definitely play D, so they regret any strategy other than D for sure. And that means they regret playing this mixture where \(p > 0.5\). And if \(p = 0.5\), then Chooser thinks Demon will play a 50/50 mixture of \(PU\) and \(PD\), in which case they think the right strategy is to play U for sure. (This is where the asymmetry in the payoffs matters.) And so they regret the mixed strategy where \(p = 0.5\). Whichever strategy, mixed or pure, that Chooser adopts, they regret it. So no strategy, mixed or pure, is ideally rational.
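The case analysis in this paragraph can be checked mechanically. In this illustrative sketch (the function names are mine), `demon_reply` encodes the stipulations about Demon, and the loop confirms that Chooser’s best reply never matches the mixture they actually play:

```python
def demon_reply(p):
    """Probability that Demon plays PU, given Chooser plays U with probability p.
    Demon maximises expected accuracy, and mixes 50/50 in the stipulated tie case."""
    if p > 0.5:
        return 1.0
    if p < 0.5:
        return 0.0
    return 0.5

def best_reply_to_demon(q_pu):
    """Chooser's best pure reply (as a probability of U) if Demon plays PU
    with probability q_pu. From table 5.1: U pays 0 vs PU and 2 vs PD;
    D pays 1 vs PU and 0 vs PD."""
    ev_u = (1 - q_pu) * 2
    ev_d = q_pu * 1
    return 1.0 if ev_u >= ev_d else 0.0

# For every mixture p, the best reply to Demon's induced play differs from p,
# so Chooser regrets whatever strategy, mixed or pure, they adopt.
for p in [i / 100 for i in range(101)]:
    assert best_reply_to_demon(demon_reply(p)) != p
```

Note that at \(p = 0.5\) the best reply is U, not the 50/50 mixture; that is where the asymmetry in the payoffs does its work.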

It’s easy to think that if mixed strategies are allowed, then the fact that these games have equilibria means they must allow strategies that won’t be regretted. The problem with that reasoning is that it assumes more about Demon’s dispositions than is known. It isn’t stipulated that Demon is perfectly rational, or that they are an equilibrium seeker. It is just stipulated that they are an excellent predictor and a utility maximiser. And that isn’t enough to rule out dilemmas, as this example shows.

I’ll have much more to say about mixed strategies as this chapter progresses. But I’ll leave them here, and turn to dilemmas that are not unique to causal ratificationism.

5.1.2 Open Ended Good

Here is a very familiar example of a case where every theory says there is no ideally rational choice.

Open Ended Good
Chooser must pick a real in \([0, 1)\). The higher the number they pick, the greater their reward will be.

For any \(x \in [0, 1)\), if Chooser picks \(x\), they could have instead picked \(\frac{1+x}{2}\), which would have been better. So any pick they make is bad, either after they make it (as matters for causal ratificationism), or before they make it (as matters for most other theories). So it is a dilemma.
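A tiny sketch, using the improvement rule from the text, makes the open-endedness vivid: the improving sequence never terminates and never leaves \([0, 1)\).

```python
def improve(x):
    """Given any pick x in [0, 1), return the strictly better pick (1 + x) / 2."""
    assert 0 <= x < 1
    return (1 + x) / 2

x = 0.0
for _ in range(20):        # the improvement never bottoms out
    better = improve(x)
    assert x < better < 1  # strictly better, and still an available pick
    x = better
```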

It is a dilemma that involves there being infinitely many choices. But it’s hard to see why that should make a philosophical difference. I’ll return to this point in section 5.3, but for now simply note that any argument that dilemmas are impossible on theoretical grounds must say that Open Ended Good is somehow impossible. And it’s hard to see why it would be.

5.1.3 Heaven

One reason someone may worry about Open Ended Good is that it involves discontinuous payouts. The limit payout, as \(x\) tends to 1, of choosing \(x\) is 1. But the payout of choosing 1 is undefined. If we allow unlimited payouts, this discontinuity goes away. Unlimited payouts are sometimes thought to be of dubious coherence due to paradoxes like St. Petersburg, and I’m somewhat sympathetic to this line of thought. But they can be motivated, even in paradoxical situations.

Chooser has died, and gone to face the last judgment. God looks over Chooser’s varied and active life, and is struggling to make a final call. He eventually says, “Look there is so much bad here I can’t just let you into Heaven. But there is so much good that I can’t just send you away either. So here’s what I’ll do. You’ll stay a while here, then off to the other place. I’ve been struggling with thinking about how long you should stay here.” And at this point, Chooser is hoping God picks a very large number. Chooser has had a bit of a look around while God was deciding. This is strictly speaking against the rules, but as we mentioned, Chooser had a varied and active life. Chooser has realised that (a) days in Heaven are good, and the goodness of them does not diminish over time, (b) the other place is considerably less good, and (c) the afterlife is infinite in duration. After a pause, God speaks again. “You know what, you decide. Pick a number, any number, any natural number that is, and you’ll spend that many days here, then off to the other place.” What should Chooser do?

Again, Chooser faces a dilemma. If Chooser picks \(n\), then it would have been considerably better to pick \(2n\), or \(n!\), or \((n!)!\). Whatever choice Chooser makes will be regretted instantly.

It is worth thinking through this case a bit, to get a feel for what dilemmas are like from the inside. I think there is a common view that decision theory shouldn’t allow dilemmas because it should be practical, that it should offer advice. And saying in cases like table 5.1 that whatever one does, one will regret it, isn’t exactly advice. But in Heaven, that’s obviously the right thing to say. Or, at least, it’s obviously part of the right thing to say. And both halves of that insight, that saying everything is regrettable is both correct and incomplete, will be important in what follows.

5.1.4 The Salesman

The last example I’ll discuss in this section is Salesman, the problem from section 1.2.1. The aim is to find as short a path as possible through these cities, assuming that it is possible to travel between any two cities in a straight line.


Figure 5.1: The 257 cities that must be visited in the Salesman problem (reprise).

This isn’t a strict dilemma; there is a shortest path. But it is going to pattern with the dilemmas in a lot of philosophically important respects. And, like with Heaven, it helps to get clear on how dilemmas work by thinking through how to solve it.

Just about the least thoughtful path possible orders the cities alphabetically. The resulting tour is shown in figure 5.2.


Figure 5.2: A solution to the salesman problem that orders the cities alphabetically.

It’s only slightly more thoughtful to order the cities from west-to-east, as in figure 5.3, but it does reduce the length by nearly two-thirds.


Figure 5.3: A solution to the salesman problem that orders the cities by longitude.

Let’s try something slightly more thoughtful. Start in an arbitrary city, I’ll use New York, and then at each step go to the nearest city that isn’t yet on the path. The resulting tour is shown in figure 5.4.


Figure 5.4: A solution to the salesman problem that chooses the nearest remaining city.

It’s better, but it still looks not so great in a few respects. For one thing, it crosses over itself too many times. For another, those long lines seem like mistakes. And they are somewhat inevitable consequences of this approach. Always choosing the nearest city will mean sometimes the path leaves a region without clearing it, and has to come back. So in this map there is a single step from Wyoming to New Jersey, as the path goes back to clean up the cities near the start that were missed.
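For concreteness, here is a small Python sketch of the nearest-city heuristic just described. The city names and coordinates are invented for illustration; this is not the code used to generate the maps.

```python
import math

def nearest_neighbour_tour(cities, start):
    """Greedy tour: from the current city, always go to the nearest unvisited city.
    `cities` maps a name to (x, y) coordinates; `start` is the first city."""
    unvisited = set(cities) - {start}
    tour = [start]
    while unvisited:
        here = cities[tour[-1]]
        nxt = min(unvisited, key=lambda c: math.dist(here, cities[c]))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(cities, tour):
    """Total length of the closed tour visiting `tour` in order."""
    return sum(math.dist(cities[tour[i]], cities[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))

# A made-up four-city example.
cities = {"A": (0, 0), "B": (0, 1), "C": (2, 1), "D": (2, 0)}
tour = nearest_neighbour_tour(cities, "A")   # ["A", "B", "C", "D"]
```

The greedy choice is what produces the long jump-back edges: once the tour wanders away from an unvisited pocket, returning to it costs one long step.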

Here’s a much better idea, in two stages. For both stages I’ve used the implementation in the TSP package by Michael Hahsler and Kurt Hornik, and what follows draws on their description (2007).

The first stage uses the farthest-insertion method. The idea is to build the path in stages. At each stage, the algorithm identifies the city that is farthest from the existing path. It then inserts that city into the path at the spot where it will make the least addition to the path length. These kinds of insertion algorithms are common. What makes the farthest-insertion algorithm work well is that it forces the path to start with something like a loop around the edges of the map, and this is a generally good approach.
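Here is an illustrative sketch of that idea (my own minimal implementation, not the TSP package’s): at each step, take the unvisited city farthest from the tour, and splice it in where it adds the least length.

```python
import math

def farthest_insertion_tour(cities, start):
    """Farthest-insertion heuristic: repeatedly take the city farthest from the
    current tour and insert it where it adds least to the tour length.
    `cities` maps names to (x, y) coordinates; `start` seeds the tour."""
    def d(a, b):
        return math.dist(cities[a], cities[b])

    tour = [start]
    remaining = set(cities) - {start}
    while remaining:
        # The city whose distance to its nearest tour city is greatest.
        city = max(remaining, key=lambda c: min(d(c, t) for t in tour))
        # The cheapest insertion position, treating the tour as a cycle.
        best_i = min(range(len(tour)),
                     key=lambda i: d(tour[i], city)
                                   + d(city, tour[(i + 1) % len(tour)])
                                   - d(tour[i], tour[(i + 1) % len(tour)]))
        tour.insert(best_i + 1, city)
        remaining.remove(city)
    return tour
```

On a unit square of four cities this recovers the perimeter, the shortest possible cycle, which illustrates the point in the text: the method starts from an outer loop and fills inward.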

The second stage uses the two-optimization method. This takes a completed path as input, and tries to improve it by seeing if the path would be shortened by flipping adjacent points. It does this repeatedly until it finds a local minimum. One way to turn this into an algorithm is to start with a random path. But what I did here was start with the path that a particular run of the farthest-insertion method generated. The farthest-insertion algorithm needs a start city, or it will choose one randomly. So for reproducibility purposes, I used New York again. And I set the seed for the random number generator in R to 1. And the result was the elegant path shown in figure 5.5.
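A minimal sketch of the second stage (again my own illustration, not the package code; I use segment reversal as the improving move, a standard way of implementing two-opt):

```python
import math

def two_opt(cities, tour):
    """2-opt local search: repeatedly reverse a segment of the tour whenever
    doing so shortens it, until no such improvement exists (a local minimum).
    `cities` maps names to (x, y) coordinates; `tour` is a starting path."""
    def length(t):
        return sum(math.dist(cities[t[i]], cities[t[(i + 1) % len(t)]])
                   for i in range(len(t)))

    best = tour[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j][::-1] + best[j:]
                if length(candidate) < length(best) - 1e-12:
                    best = candidate
                    improved = True
    return best
```

Because each accepted move strictly shortens the tour, the loop must terminate, and it terminates exactly at a local minimum: a tour no single reversal can improve. That is why the output quality depends on the starting path, which is the role the farthest-insertion stage plays.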


Figure 5.5: A solution to the salesman problem that uses two algorithms.

That map has a lot of virtues. There aren’t any obvious missteps on the map, where it’s easy to see just by looking that it could be shortened. The theory behind it makes sense. It was generated very quickly. It took my computer more time to draw the map than to find the path. And it is very short compared to the other maps we’ve seen.

But it isn’t optimal. I’m not sure what is optimal, but the shortest one I found after trying a few methods is shown in figure 5.6.


Figure 5.6: The best solution I found to the salesman problem.

This used the same idea, but with a bit more brute force. The algorithm was the same two-step approach as the last map. But I had the computer cycle through all the different possible starting cities, and varied the seed for the randomisation. (The algorithm isn’t quite deterministic, since it uses random numbers to break ties when it is asked to find the farthest city, or the shortest insertion.) And setting the start city to Dubuque, Iowa, and the seed to 33, generated that map. Finding these settings took some time, and I’m not sure it is even optimal. It would actually be helpful for the argument I’m about to make if it isn’t optimal, and the shortest path is shorter again. But let’s assume that it, or something like it, is. Because what I want us to focus on is the philosophical status of the tour in figure 5.5. Getting clear on that will be the beginning of a solution to the puzzles about dilemmas.

5.1.5 Characteristics of the Puzzles

  • Multiple standards
  • Stake Sensitivity
  • Might be no available ideal play
  • Especially if there are cognitive limits

5.1.6 Multiple Standards

  • Tempting to have a binary split into ideal/non-ideal
  • The big point of this chapter is that this is a bad way to do things
  • Talk about Salesman at length, including the maps
  • Key point: Not just quantitative differences, but qualitative differences

5.1.7 Stake Sensitivity

  • Whether something is ideal is not stake-sensitive (unless something weird happens - e.g., me on evidence)
  • But whether something is fine is stake-sensitive
  • Example from open ended good
  • Example from Salesman

5.1.8 No Ideals

  • Foot on two kinds of dilemmas
  • The best option might be fine.
  • Weirich 1985 AJP says that it’s absurd to have dilemmas, then says that ideal decision is impossible in these cases! That’s what I mean by a dilemma :)

5.1.9 Cognitive Limits

  • If there are limits, salesman could be a dilemma
  • I think, though I’m not relying on it, same is true for assuming can’t randomise
  • Why say that someone can’t?
  • Stipulation: But then could stipulate that they can’t solve salesman
  • Realism (quote Lewis): But we can’t solve salesman. In fact, we’re better at randomising than salesman
  • Punishment: Then we’ve given up ideals. Compare my NBA example
  • No good argument really
  • But it doesn’t matter, because in Outguess, if Demon does 50/50 if Chooser does 50/50, still a dilemma.

5.2 The Possibility Assumption in Philosophical Arguments

A bad argument form, and why it matters

5.2.1 An Argument against EDT

  • Open Ended Good
  • Of course, this is a bad argument

5.2.2 A Recipe for Counterexamples

Once you see the idea behind the argument in subsection 5.2.1, it’s easy to see how to construct ‘counterexamples’ to any decision theory that allows for dilemmas. For instance, here is a general purpose recipe for constructing counterexamples to most forms of Causal Decision Theory, causal defensivism included.

Step One
Find some game where one player has demon-like payouts, and there is no pure strategy equilibrium for the other player. By a ‘demon-like payout’, I mean that player’s payouts are all 1s and 0s, and you can sensibly interpret the 1s as situations where they correctly predicted the other player’s choice.
Step Two
Translate that game into a demonic decision problem, taking care to stipulate that the human player is incapable of playing mixed strategies, or that something very bad will happen to them if they do.8
Step Three
Look at the pure strategies the player has, and identify the one that it would be least plausible to have one’s theory recommend.
Step Four
Argue, correctly, that all strategies other than the one you’ve identified are irrational according to CDT.
Step Five
Infer that the remaining strategy is the one that CDT recommends.
Step Six
Point out that it is really unintuitive that CDT would recommend that strategy, declare that this is a clear counterexample to CDT, declare victory, etc.

Hopefully the discussion in subsection 5.2.1 will have made it clear that we can run this recipe against just about any theory, so it overgenerates. And not just that, we can identify the misstep - it’s step five.

5.2.3 An Instance of the Recipe

  • One of the Frustrator examples

5.2.4 Betting Against the Demon

There is a slight variation on the recipe in an interesting example due to Arif Ahmed (2014, sect 7.4.3). Here the plan at step 6 is not to argue that CDT gets the wrong result, in fact Ahmed endorses the conclusion he thinks CDT ends up with, but to argue that CDT undermines its own principles. I’m going to lean a bit on the discussion of the example in the review of Ahmed’s book by James Joyce (2016), though I end up drawing a slightly different conclusion to Joyce about the argument. And I’m going to start with a simplified version of the example, where I think it’s clearer where things go wrong, and build up to the version Ahmed gives. (I’ve also relabeled the moves, because I find the labels here more intuitive.)

The simple version of Ahmed’s example, which we’ll call Betting Against the Demon, has three stages.

Stage One
Demon predicts whether Chooser will choose 1 box or 2 boxes at stage two.
Stage Two
Chooser chooses 1 box or 2 boxes. They receive $100 if Demon predicted they will choose 1 box, and nothing if Demon predicted they will choose 2 boxes, whatever they choose. This payout is not revealed to them until the end.
Stage Three
Chooser selects one of two bets. Bet R (for right) wins $25 if Demon predicted correctly, and loses $75 if Demon predicted incorrectly. Bet W (for wrong) wins $75 if Demon predicted incorrectly, and loses $25 if Demon predicted correctly.

After this the moves are revealed, and Chooser gets their rewards. The strategic form of the game is shown in table 5.4. I’m assuming Demon wants to make correct predictions, and ignoring strategies that differ only in what Chooser does in worlds that are ruled out by their Stage Two choice. (I’ll leave it as an exercise for the reader to confirm that they aren’t relevant to the analysis.)

Table 5.4: The strategic form of Betting Against the Demon

        P1           P2
  1R    \(125, 1\)   \(-75, 0\)
  1W    \(75, 1\)    \(75, 0\)
  2R    \(25, 0\)    \(25, 1\)
  2W    \(175, 0\)   \(-25, 1\)
For Chooser, each row sets out what they do at stage two, and what they do at stage three - the number is whether they pick 1 box or 2, and the letter is whether they take bet R or W. For Demon, P1 is that they predict 1 box, and P2 is that they predict 2 boxes.

It should be clear enough that there are no pure strategy Nash equilibria for this game. The best response to P1 is 2W, but the pair of 2W and P1 is not a Nash equilibrium. And the best response to P2 is 1W, but the pair of 1W and P2 is not a Nash equilibrium.
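These claims can be confirmed by brute force. In the sketch below (the data layout is mine), each payoff pair is computed from the stage payouts described above: Chooser gets $100 iff Demon played P1, plus the stage-three bet’s payout; Demon gets 1 iff the prediction was correct.

```python
# Payoffs (Chooser, Demon) for the strategic form of Betting Against the Demon.
# Rows: stage-two box choice plus stage-three bet; columns: Demon's prediction.
payoffs = {
    ("1R", "P1"): (125, 1), ("1R", "P2"): (-75, 0),
    ("1W", "P1"): (75, 1),  ("1W", "P2"): (75, 0),
    ("2R", "P1"): (25, 0),  ("2R", "P2"): (25, 1),
    ("2W", "P1"): (175, 0), ("2W", "P2"): (-25, 1),
}
rows = ["1R", "1W", "2R", "2W"]
cols = ["P1", "P2"]

def is_pure_nash(r, c):
    """(r, c) is a pure strategy Nash equilibrium iff neither player can gain
    by unilaterally deviating to another pure strategy."""
    row_ok = all(payoffs[(r, c)][0] >= payoffs[(r2, c)][0] for r2 in rows)
    col_ok = all(payoffs[(r, c)][1] >= payoffs[(r, c2)][1] for c2 in cols)
    return row_ok and col_ok

# No cell survives both deviation checks, so there is no pure strategy equilibrium.
assert not any(is_pure_nash(r, c) for r in rows for c in cols)
```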

But of course this game is not a strategic form game, it’s an extensive form game. Figure 5.7 is the game tree for it.


Figure 5.7: The game tree for the betting against the demon example

Demon moves first, but Chooser is not alerted to Demon’s move until the end of the game. So at every stage Chooser has two nodes in their information set - one for each of Demon’s possible moves. But Chooser does know their own prior move, so at stage three the information sets do not include both the top and bottom of the tree. This looks a lot like a signaling game, but there are some notable differences. There is no move from Nature here; rather Demon moves once and Chooser moves twice. And there is an information set connecting the nodes in the middle-left and middle-right of the tree.

Since it is a game where Chooser moves twice, the relevant game theoretic concept to use is perfect Bayesian equilibrium. A strategy is defensible for Chooser iff it is part of a perfect Bayesian equilibrium. Since all perfect Bayesian equilibria are Nash equilibria, and the game has no Nash equilibria, there are no defensible moves. But the fact that it is a dynamic game is important to Ahmed’s argument.9

Assuming that Chooser plays a pure strategy, in equilibrium Demon knows that strategy, and Chooser knows this. So at stage three, Chooser will believe that choosing R will win $25 more or less for sure, and choosing W will lose $25 more or less for sure. So Ahmed infers, correctly, that CDT says that Chooser should not choose W. From this he infers, incorrectly, that CDT says that Chooser should choose R.10 This doesn’t follow without an extra, incorrect, assumption that CDT says that something should be chosen in this case. But in fact, since the game has no pure strategy equilibria, that isn’t true. Anyway, from that false assumption, Ahmed infers, correctly, that CDT recommends playing some strictly dominated strategy or other. And that’s incoherent, since CDT is motivated by the thought that one should never play strictly dominated strategies.

I’ve simplified matters a little here, but not in a way that matters. Ahmed doesn’t say that Demon is arbitrarily accurate, but only that Demon is 80% accurate. But it’s easy enough to modify the game to accommodate this. In fact there are two ways to do so, following two tricks I introduced earlier. The Smullyan approach is to have a stage 0, where the Demon is assigned to one of two types: Knight or Knave. The former has an 80% chance of being the assignment. The Demon is told the assignment but Chooser is not. And if Demon is assigned Knave, their payouts are flipped - they get 1 if they play P1 and Chooser selects 2, or they play P2 and Chooser selects 1. The Selten approach is to have a stage 1A, between Demon and Chooser’s moves. Demon chooses P1 or P2, then at stage 1A, there is a 20% chance that Nature will reverse Demon’s choice, and the payouts will be a function not of what Demon did, but of what happened after Nature modified Demon’s choice. The trees for either case are impossibly messy, and I won’t draw them. But it doesn’t matter - the equilibrium analysis of either game is exactly the same as the equilibrium analysis of the game where Demon is arbitrarily accurate, and I’ll stick with that game from now on.

So does CDT recommend playing a dominated strategy? Of course not. As Joyce says, CDT is “built to guarantee” that it doesn’t recommend dominated strategies (Joyce 2016, 231). What it says, or at least what causal defensivism says, is two-fold.

If Chooser can randomise, they should randomise. They should play the one and only equilibrium strategy in this game, which is a 50/50 mixture of 1W and 2W. In equilibrium, Demon will play a 50/50 mixture of P1 and P2. So at stage 3, Chooser will think it is 50% likely that Demon has ‘predicted’ correctly. (The scare quotes are because calling the output of Demon’s mixed strategy a prediction stretches language just about to breaking point.) So Chooser will be happy sticking with their strategy at stage 3 - it will have an expected return of $25, and the alternative has an expected return of -$25.
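The arithmetic behind those two stage-three expectations, for completeness:

```python
# Stage-three bets when Demon's 'prediction' is correct with probability 1/2.
p_correct = 0.5
ev_bet_r = p_correct * 25 + (1 - p_correct) * (-75)   # R: +$25 if correct, -$75 if not
ev_bet_w = p_correct * (-25) + (1 - p_correct) * 75   # W: -$25 if correct, +$75 if not
# ev_bet_r comes to -25.0 and ev_bet_w to 25.0, matching the text.
```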

If Chooser can’t randomise, then it’s a dilemma. What we say here should be similar to what EDT, or any other decision theory, says about the example in section 5.2.2. But what is that? Let’s come back to that question in a bit - first I’ll address some objections to the very idea that there could be dilemmas in decision theory.

5.3 Why Allow Dilemmas

5.3.1 Arguments Against Dilemmas

  • Decision theory assigns values
    • This doesn’t rule out all dilemmas
    • And we reject the assumption
  • Decision theory should be guiding
    • Reject the assumption
  • Shouldn’t treat every option the same
    • This is better argument, but we can reject it. Some are fine

5.3.2 What Dilemmas are Like

Be like a smart rationally constrained person

Use heuristics that work much of the time, on average

This might end up looking a lot like EDT

Maybe this is why Newcomb’s Problem is so hard

  7. I’m assuming here that the pure strategies U and D are improper mixtures, where Chooser plays one of them with probability 1, and the other with probability 0. This assumption isn’t at all needed for the proof; but it simplifies the presentation.↩︎

  8. I’m a bit perplexed as to why this stipulation is so widely accepted in the decision theory literature. But it usually seems to be accepted with minimal fuss. We’ll return to this point in section ??.↩︎

  9. I’m only going to discuss what Ahmed says about CDT combined with so-called ‘sophisticated’ choice, since it’s the right way to make sequences of decisions.↩︎

  10. Strictly speaking, what he shows is that this follows given that we are using the version of CDT described by Lewis (1981a). But he says earlier, in section 2.8 of his book, that he thinks all versions of CDT will agree with Lewis’s version about everything that is covered in the book. As he puts it, “As far as this discussion is concerned, any one of them could stand for their disjunction.” This is false, and importantly so at just this point. The version of CDT developed by Skyrms (1990) for instance would not agree with Lewis here, and would not say that any pure strategy is rational. Ahmed does discuss some of Skyrms’s earlier work, but not the version of CDT that is developed from 1990 onwards using equilibrium concepts.↩︎
  4. Strictly speaking, what he shows is that this follows given that we are using the version of CDT described by Lewis (1981a). But he says earlier, in section 2.8 of his book, that he thinks all versions of CDT will agree with Lewis’s version about everything that is covered in the book. As he puts it, “As far as this discussion is concerned, any one of them could stand for their disjunction.” This is false, and importantly so at just this point. The version of CDT developed by Skyrms (1990) for instance would not agree with Lewis here, and would not say that any pure strategy is rational. Ahmed does discuss some of Skyrms’s earlier work, but not the version of CDT that is developed from 1990 onwards using equilibrium concepts.↩︎