11  Screening and Regresses

Normative externalism in epistemology is false if agents should respond not just to their evidence, but to what they believe, or should believe, about what their evidence supports. Call that latter claim the higher order hypothesis. Over the last three chapters I’ve responded to arguments for the higher order hypothesis. I argued that the cases that apparently support the higher order hypothesis do not do so, when viewed in the context of a wider sweep of cases. I’ve argued against attempts to derive the higher order hypothesis from anti-circularity principles. And I’ve argued against attempts to derive it from enkratic principles. In this chapter I move on to giving reasons to disbelieve the higher order hypothesis. I argue that the higher order hypothesis is tied to a principle about screening, a principle I call Judgments Screen Evidence. And I argue that this principle, whose name I’ll shorten to JSE, leads to intolerable regresses. The return to regress based arguments provides a stronger link than we’ve seen so far between this part of the book and the earlier part; the arguments of this chapter might seem very familiar to someone who has read chapter 4.

11.1 Screening

The idea of screening that’s going to be central to this chapter comes into philosophy via Hans Reichenbach (1956). He was working on a quite different problem, namely when we should infer that two events have a common cause. He says that C screens off the positive correlation between B and A iff the following two conditions are met.

  1. A and B are positively correlated, i.e., Pr(A | B) > Pr(A).
  2. Given C, A and B are probabilistically independent, i.e., Pr(A | B ∧ C) = Pr(A | C).
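For concreteness, here is a small illustrative check of these two conditions. It is only a sketch, in Python, with made-up probabilities on which C is a common cause of A and B; nothing in the argument depends on it.

```python
# A toy joint distribution in which C is a common cause of A and B, so that
# C screens off the correlation between them in Reichenbach's sense.
from itertools import product

def joint(a, b, c):
    # Made-up numbers: Pr(C) = 0.5; given C, A and B independently have
    # probability 0.8; given not-C, they independently have probability 0.2.
    return 0.5 * (0.8 if a == c else 0.2) * (0.8 if b == c else 0.2)

def pr(pred):
    return sum(joint(a, b, c)
               for a, b, c in product([True, False], repeat=3) if pred(a, b, c))

def cond(target, given):
    return pr(lambda a, b, c: target(a, b, c) and given(a, b, c)) / pr(given)

# Condition 1: A and B are positively correlated.
print(round(cond(lambda a, b, c: a, lambda a, b, c: b), 2),
      round(pr(lambda a, b, c: a), 2))                        # 0.68 > 0.5
# Condition 2: given C, B makes no further difference to A.
print(round(cond(lambda a, b, c: a, lambda a, b, c: b and c), 2),
      round(cond(lambda a, b, c: a, lambda a, b, c: c), 2))   # 0.8 = 0.8
```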

I’m interested in an evidential version of screening. If we understand evidential support probabilistically, then we could just copy over Reichenbach’s definitions, with a little reinterpretation of the formalism. So rather than thinking of Pr in terms of objective processes, as Reichenbach was, think of it as an evidential probability function. Then these two clauses will say that as things stand, B is evidence for A, but given C, B is no evidence for A. We can say all that without assuming any particular connection between probability and evidence, as follows.

C screens off the evidential support that B provides to A iff:

  1. B is evidence for A; and
  2. B ∧ C is exactly as good evidence for A as C is.

Both these clauses, as well as the statement that C screens off B from A, are made relative to an evidential background. I’ll leave that as tacit in what follows. Here are a couple of examples, the second loosely based on facts, that illustrate the usefulness of this idea.

A woman is standing at a suburban train station waiting for a train into the city, and wondering whether she will be on time for her meeting. She knows that there is only one train line, with no usable sidings, between where she is and the city, so there isn’t any chance of trains passing. She knows how long trains take to get to the city if everything is working, though she doesn’t know if everything is indeed working. But she doesn’t know how frequent the trains are. She gets a call from a friend saying that a train to the city is headed her way, and is about five miles away. That train would, she thinks, get her to the city in time if everything goes right. Just then she sees a train coming into the station. Let A be that she gets to the city on time, B that there is a train five miles away, and C that there is a train pulling into the station. Relative to her initial background, B is evidence for A. But given C, it is no evidence at all. That’s because given C, what matters is whether this particular train makes it in on time, without breaking down or being held up for some reason. The later train can’t pass her, so its presence isn’t relevant to whether she makes it to the city on time.

Later, she is trying to work out whether a particular person X voted for the Democratic candidate or the Republican candidate at the last Presidential election. She knows that X is either from Alabama or Massachusetts, and voted, and she knows the distribution of voters in those two states is as follows. (The numbers in the boxes are percentages of voters, and GOP is shorthand for the Republican Party.)

|               | Pro-Choice Dem | Pro-Life Dem | Pro-Choice GOP | Pro-Life GOP |
|:--------------|:--------------:|:------------:|:--------------:|:------------:|
| Alabama       | 28             | 7            | 7              | 58           |
| Massachusetts | 52             | 13           | 13             | 22           |

Learning which state X is from is strong evidence about how they voted, since 65% of Massachusetts voters voted Democratic, while only 35% of Alabama voters did. But if she had previously learned that X was pro-choice, then learning which state X is from would be of no evidential significance. That’s because 80% of pro-choice voters in each state voted Democratic. So learning that X is a pro-choice resident of Massachusetts is of no more evidential significance than simply learning X is pro-choice.
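For readers who want the arithmetic spelled out, here is a small sketch (using only the percentages in the table above) that checks both halves of this claim.

```python
# A quick check of the screening claim, using the percentages from the table
# (each row sums to 100, so the numbers double as a probability distribution).
voters = {
    "Alabama":       {("pro-choice", "Dem"): 28, ("pro-life", "Dem"): 7,
                      ("pro-choice", "GOP"): 7,  ("pro-life", "GOP"): 58},
    "Massachusetts": {("pro-choice", "Dem"): 52, ("pro-life", "Dem"): 13,
                      ("pro-choice", "GOP"): 13, ("pro-life", "GOP"): 22},
}

def pr_dem(state, stance=None):
    cells = {k: v for k, v in voters[state].items() if stance is None or k[0] == stance}
    dem = sum(v for (s, party), v in cells.items() if party == "Dem")
    return dem / sum(cells.values())

# Learning the state is strong evidence about the vote ...
print(pr_dem("Alabama"), pr_dem("Massachusetts"))                              # 0.35 vs 0.65
# ... but given that X is pro-choice, the state makes no difference at all.
print(pr_dem("Alabama", "pro-choice"), pr_dem("Massachusetts", "pro-choice"))  # 0.8 vs 0.8
```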

There is something very interesting about this theoretical possibility. We can concede that something is usually evidentially significant even while denying it is significant on a particular occasion. This possibility is useful for solving a puzzle about judgment.

11.2 The Counting Problem

Suppose a rational agent has some evidence E that bears on a proposition p, and on that basis judges that p. Call the fact that the agent has made this judgment J, and assume the agent is self-aware enough to know that J is true, and that she is rational. Assume also that p is a rational thing to judge on the basis of E, though the agent does not necessarily know this. The fact that a rational person judges that p seems to support p. After all, if we found out that she is rational and judged that p, that would ceteris paribus be evidence for p. Now consider this slightly informal question: How many pieces of evidence does the agent have that bear on p? Three options present themselves.

  1. Two - both J and E.
  2. One - E subsumes whatever evidential force J has.
  3. One - J subsumes whatever evidential force E has.

This suggests a trilemma. First, it seems J could be evidence for p. We could get reason to be more confident in p just by learning J. Second, it seems like double counting for the agent to take both E and J to be evidence. After all, she only formed the judgment because of E. Yet third, it seems wrong for her to simply ignore E, since by stipulation it is evidence, and it certainly seems to bear on whether p is true.

One way out of this is to adopt the thesis I’ll call JSE, for Judgments Screen Evidence. This is the thesis that propositions about rational judgments by rational agents screen off the evidential significance of the underlying evidence behind those judgments. The simplest argument for JSE is that it lets us answer the question above while accommodating the idea behind all three sources of ‘pressure’. The agent can treat J just like everyone else does, i.e., as some evidence for p, without double counting or ignoring E. She can do that because she treats E as screened off. And screened off evidence isn’t double counted or ignored. That’s a rather nice feature of JSE.

To be sure, it is a feature that JSE shares with a view we might call ESJ, or evidence screens judgments. That view says that the agent shouldn’t take J to be extra evidence for p, since its evidential force is screened off by E. This view also allows for the agent to acknowledge that J has the same evidential force for her as it has for others, while also avoiding double counting. So we need some reason to prefer JSE to ESJ.

One reason comes from thinking generally about reasoning that proceeds in steps. Assume E is evidence for p solely because it makes q more likely, and q in turn makes p more likely. So if we are investigating a crime that took place in an inland village in Cornwall, learning that a suspect had some sand in his clothes that is only found on Cornish beaches may be some evidence that he’s guilty. That’s because it establishes that the suspect was at least in the area, unlike some other suspects. But if we knew independently that the suspect had been in Cornwall, say because he owns a beach house there and is often seen by his neighbours, the presence of the sand is of no evidential significance. Perhaps the general lesson here is that later steps screen off earlier steps. If that’s right, we would expect J to screen E, and not vice versa.

Another reason for preferring JSE to ESJ is that it alone supports a number of positions that epistemologists have found independently plausible. Indeed, it is arguable that JSE is something of a tacit premise in a number of arguments. In the next section we will look at three such arguments.

11.3 JSE in Epistemology

11.3.1 Egan and Elga on Self-Confidence

We’ll start with some conclusions that Andy Egan and Adam Elga draw about self-confidence in their paper “I Can’t Believe I’m Stupid”. I suspect many of the conclusions they draw in that paper rely on JSE, but I’ll focus just on the most prominent use of JSE in the paper.

One of the authors of this paper has horrible navigational instincts. When this author—call him “AE”—has to make a close judgment call as to which of two roads to take, he tends to take the wrong road. If it were just AE’s first instincts that were mistaken, this would be no handicap. Approaching an intersection, AE would simply check which way he is initially inclined to go, and then go the opposite way. Unfortunately, it is not merely AE’s first instincts that go wrong: it is his all things considered judgments. As a result, his worse-than-chance navigational performance persists, despite his full awareness of it. For example, he tends to take the wrong road, even when he second-guesses himself by choosing against his initial inclinations.

Now: AE faces an unfamiliar intersection. What should he believe about which turn is correct, given the anti-reliability of his all-things-considered judgments? Answer: AE should suspend judgment. For that is the only stable state of belief available to him, since any other state undermines itself. For example, if AE were at all confident that he should turn left, that confidence would itself be evidence that he should not turn left. In other words, AE should realize that, were he to form strong navigational opinions, those opinions would tend to be mistaken. Realizing this, he should refrain from forming strong navigational opinions (and should outsource his navigational decision-making to someone else whenever possible).  (Egan and Elga 2005, 82–83)

I will argue that this reasoning goes through iff JSE is assumed. I’ll argue for this by first showing how the reasoning could fail without JSE, and then showing how JSE could fix the argument.

Start with a slightly different case. Katell is trying to find out whether p, where this is something she knows little about. She asks ten people whether p is true, each of them being someone she has good reason to believe is an expert. The experts have a chance to consult before talking to her, so each of them knows what the others will advise. Nine of them confidently assure her that p is true. The tenth is somewhat equivocal, but says that he suspects p is not true, although he cannot offer any reasons for this suspicion that the other nine have not considered. It seems plausible in such a case that she should, or at least may, accept the supermajority’s verdict, and believe p.

Now vary the case. The first nine are experts, but the tenth is an anti-expert. He is wrong considerably more often than not. Again, the first nine confidently assert that p, but now the tenth says the same thing, i.e., p. This doesn’t change Katell’s epistemic situation. She has a lot of evidence for p, and a little evidence against it. The evidence against has changed; it is now the confident verdict of an anti-expert, rather than the equivocal anti-verdict of an expert, but this doesn’t matter. So she still should, or at least may, believe p.

Now make one final variation. Katell is the tenth person consulted. She asks the first nine people, who of course all know each other’s work, and they all say p. She knows that she has a tendency to make a wrong judgment in this type of situation – even when she has had a chance to consult with experts. Perhaps p is the proposition that the correct road is to the left, and she is AE, for example. It does require some amount of hubris to continue to be an anti-expert even once you know you are one, and the contra-indicating judgments are made in the presence of expert advice. But I don’t think positing delusionally narcissistic agents makes the case unrealistic. After listening to the experts, she judges that p. This is some evidence that ¬p, since she is an anti-expert. But, as in the last two paragraphs, it doesn’t seem that it must override all the other evidence she has. So, even if she knows that in general she is fairly anti-reliable on questions like p, she need not suspend judgment. Even if her judgment is some evidence that ¬p, it might not be strong enough to defeat her earlier evidence for p. On those (presumably rare) occasions where her judgment tracks the evidence, the evidence may be strong enough for her to keep it, even once she acknowledges she has made the judgment.

The previous paragraph assumed that JSE did not hold. It assumed that Katell could still rely on the nine experts, even once she had incorporated their testimony into a judgment. That’s what JSE denies. According to JSE, the arguments of the previous paragraph rely on illicitly basing belief on screened-off evidence. That’s bad. If JSE holds, then once Katell makes a judgment, it’s all the evidence she has. Now assume JSE is true, and that Katell knows herself to be something of an anti-expert. Then any judgment she makes is fatally self-undermining, just like Egan and Elga say. When she makes a judgment, she not only has evidence it is false, she has undefeated evidence it is false. So if Katell knows she is an anti-expert, she must suspend judgment. That’s the conclusion Egan and Elga draw, and it seems to be the right conclusion iff JSE is true. So the argument here relies on JSE.

11.3.2 White on Permissiveness

Roger White (2005) argues that there cannot be a case where it could be epistemically rational, on evidence E, to believe p, and also rational, on the same evidence, to believe ¬p. One of the central arguments in that paper is an analogy between two cases.

Random Belief
S is given a pill which will lead to her forming a belief about p. There is a ½ chance it will lead to the true belief, and a ½ chance it will lead to the false belief. S takes the pill, forms the belief, a belief that p as it turns out, and then, on reflecting on how she formed the belief, maintains that belief.

Competing Rationalities
S is told, before she looks at E, that some rational people form the belief that p on the basis of E, and others form the belief that ¬p on the basis of E. S then looks at E and, on that basis, forms the belief that p.

White claims that S is no better off in the second case than in the first. As he says,

Supposing this is so, is there any advantage, from the point of view of pursuing the truth, in carefully weighing the evidence to draw a conclusion, rather than just taking a belief-inducing pill? Surely I have no better chance of forming a true belief either way.  (White 2005, 448)

There are two ways to read the phrase “from the point of view of pursuing the truth”. One of them leads to an implausible view about the role of rational reflection in inquiry. The other makes the argument rely on JSE. Take these in order.

First, assume White’s narrator is only concerned about having a truthful opinion right now, and only having a truthful opinion on this very question. Given that, it will be true that the belief-inducing pill will do just as well as careful weighing of the evidence. But that’s a very unusual set of interests to have, and it’s not clear why we should take such a person to show us much of interest about the point of reflection. One generally good reason for weighing the evidence carefully is that it puts us in a better position to be able to process new evidence as it comes in. It isn’t clear how White’s narrator, who takes the belief-inducing pill, will be able to adjust to new evidence, since by hypothesis he doesn’t have any sense of how well entrenched this belief should be, and how sensitive it should be to countervailing evidence. This point is closely related to the explanation Socrates gives for the superiority of knowledge to mere true belief in Meno 97d-98a.

Another good reason for weighing evidence carefully is that we learn about other propositions through this process. Assume we’re trying to figure out whether p, and there is some other proposition q, such that (a) we care about whether q is true, and (b) p is sometimes, but not always, good evidence for q. It is very common that at least some such proposition exists. Then figuring out why p is true, or at least why we should think it is true, will be relevant for q. So an agent who only cares about having at this very moment a true belief about this very proposition might be no better off engaging in rational reflection than taking White’s belief-inducing pill, but such agents are far removed from the usual situation we find ourselves in, and not good guides to epistemological generalisation.

But note that with JSE we don’t need to restrict attention to such narrowly-defined agents. Assume that JSE is true. Then after S evaluates E, she forms a judgment, and J is the proposition that she formed that judgment. Now it might be true that E itself is good evidence for p. (The target of White’s critique says that E is also good evidence for ¬p, but that’s not yet relevant.) But given JSE, that fact isn’t relevant to S’s current state. For her evidence is, in its entirety, J. And she knows that, as a rational agent, she could just as easily have formed some other judgment, in which case J would have been false. Indeed, she could have formed the opposite judgment. So J is no evidence at all, and she is just like the person who forms a random belief, contradicting the assumption that believing p could, in this case, be rational, and that believing ¬p could be rational.

Without JSE, White’s analogy breaks down. Forming a belief via a pill, and forming a belief on the basis of the evidence, are very different. That’s true even if you know that other rational agents take the evidence to support a different conclusion. The random belief is incapable of being properly updated, or of supporting the correct strands elsewhere in the web of belief.

If we care about getting at the truth in general, and not just about p, then White’s analogy needs JSE to go through. And we should, and do, care about truth in general. So this argument against permissiveness needs JSE. There may be other arguments against permissiveness, so this isn’t to say that White’s conclusion requires JSE. But his argument does.

11.3.3 Disagreement and Priority

Here is Adam Elga’s version of the Equal Weight View of peer disagreement, a theory we will discuss much more in chapter 12.

Upon finding out that an advisor disagrees, your probability that you are right should equal your prior conditional probability that you would be right. Prior to what? Prior to your thinking through the disputed issue, and finding out what the advisor thinks of it. Conditional on what? On whatever you have learned about the circumstances of the disagreement.  (Elga 2007, 490)

It is easy to see how JSE could help defend this view. First, focus on the role JSE can play in the clause about priority. Here is one kind of situation that Elga wants to rule out. S has some evidence E that she takes to be good evidence for p. She thinks T is an epistemic peer. She then learns that T, whose evidence is also E, has concluded ¬p . She decides, simply on that basis, that T must not be an epistemic peer, because T has got this case wrong. This decision violates the Equal Weight View, because it uses S’s probability that T is a peer after thinking through the disputed issue, not prior to it, in deciding who is more likely to be right.

Now at first it might seem that S isn’t doing anything wrong here. If she knows how to apply E properly, and can see that T is misapplying it, then she has good reason to think that T isn’t really an epistemic peer after all. She may have thought previously that T was a peer, indeed she may have had good reason to think that. But she now has excellent evidence, gained from thinking through this very case, to think that T is not a peer and so not worthy of deference.

Since Elga thinks that there is something wrong with this line of reasoning, there must be some way to block it. The best option for blocking it comes from ruling that E is not available evidence for S once she is using J as a judgment. That is, the best block available comes from JSE. Once we have JSE in place, we can say what S does wrong. She is like the detective who says that we have lots of evidence that the suspect could have committed the crime–not only does he live in Cornwall, but he has Cornish sand in his clothes. To make the cases more analogous, we might imagine that there are detectives with competing theories about who could be guilty in this case. If we don’t know who was even in Cornwall, then the evidence about the sand may favour one detective’s theory over the other. If we do know that both suspects live in Cornwall, then the evidence about the sand isn’t much help to either.

So JSE supports Elga’s strong version of the Equal Weight View, which bars agents from using the dispute at issue as evidence concerning the peerhood of another. And if JSE is not true, then there is a kind of simple and natural reasoning which undermines Elga’s Equal Weight View. So Elga’s version of the Equal Weight View requires JSE.

11.4 JSE and Higher Order Evidence

As noted above, JSE can also support the higher order hypothesis. The idea is reasonably simple. Assume that an agent gets evidence that is in fact good evidence for p, concludes p on that basis, but also has reason to think they are in a sub-optimal epistemic environment. The believer in higher-order evidence thinks the agent should then lower their confidence in p. But why is that, when they already have excellent evidence for p, and the evidence about the environment doesn’t seem to defeat that?

Let’s make that last rhetorical question a little clearer. Danail tells Milica that p. Milica has a long relationship with Danail, and he has been a very reliable testifier over that time. And Milica has no reason to doubt that p. But then Milica learns she has taken a drug that makes most people very unreliable when it comes to processing evidence by testimony. Should this last evidence reduce her confidence in p, by somehow defeating the support that Danail’s testimony provides? The evidence about the drug isn’t a rebutting defeater; it provides no reason to think p is false. But nor is it the most natural kind of undercutting defeater. It provides no reason to think that Danail is an unreliable testifier. What it does is undercut any support that Milica’s own judgment gives to p. But that only matters to what Milica should believe if that judgment is playing an important role in sustaining her belief. And that’s where JSE comes in. Unless JSE is true, Milica has a completely sound reason to believe p, namely Danail’s testimony. And that reason isn’t defeated by the drug. If a third party believed p because they knew that Milica believed it on testimonial grounds, then the drug would be an undercutting defeater to the third party’s belief. But to make it a defeater to Milica’s belief, we need to assume that Milica, like the third party, in some way bases her sustained belief on her judgment. If JSE is right, then in a good sense she does do that; her own judgment that p is her only unscreened evidence, and if the force of it is defeated, then she has no good reason to believe p. If JSE is wrong, it is harder to see the parallel between Milica and the third party.

I’ve sketched an argument that the higher order hypothesis not just could be supported by JSE, but would be undermined if JSE were false. And JSE is indeed false, as we’ll now show. We’ll return at the end of the chapter to whether this fact can be turned into an argument against the higher order hypothesis.

11.5 The Regress Objection

Ariella is trying to make a forecast for how well her hometown team, the Detroit Tigers, will do in the upcoming baseball season. Baseball teams play 162 games1, and the Tigers look like being a relatively mediocre team. She knows that it is irrational to form any belief about precisely how many games the Tigers will win. But she thinks, correctly as it turns out, that it is reasonable to form a credal distribution over propositions of the form The Tigers will win n games, and have that distribution be roughly a normal distribution with a standard deviation of 5 games. The question is to work out what the most likely win total is, which will be both the mode and the mean of the distribution. For simplicity, we’ll say that for her to predict that The Tigers will win n games is to set n to be this centre. (I don’t mean to suggest this is the ordinary use of the English word ‘predict’. The definition I’m using here is stipulative.)

  • 1 In reality they sometimes play 1 or 2 more or less. It will simplify the exposition to assume it is known in advance in Ariella’s world that they play 162 exactly, and that’s what I will assume.
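For concreteness, here is a small sketch of the kind of credal state being attributed to Ariella; the standard deviation of 5 and the 162-game season are stipulated in the example, and the rest is just illustration of the idea that predicting n is centring the distribution on n.

```python
# A discretised, renormalised normal distribution over possible win totals,
# centred on Ariella's prediction, with the stipulated standard deviation of 5.
import math

def credal_distribution(centre, sd=5, games=162):
    weights = [math.exp(-0.5 * ((n - centre) / sd) ** 2) for n in range(games + 1)]
    total = sum(weights)
    return [w / total for w in weights]

cr = credal_distribution(76)
print(max(range(len(cr)), key=lambda n: cr[n]))  # 76: the mode of the distribution
print(round(sum(cr), 6))                         # 1.0: a genuine probability distribution
```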

Ariella works through the known facts about the Tigers and their opponents, and predicts that they will win 76 games. This is, as it turns out, exactly the right prediction to make. That isn’t to say the Tigers will actually win 76 games - remember the point here is not to form outright beliefs. Rather, the appropriate credal distribution over propositions about the Tigers’ win total, given Ariella’s evidence, is centred on 76.

But Ariella knows something about herself. She knows that in general, when she settles on a prediction, it is 1 game too low when it comes to the Tigers. If someone else knew nothing other than that Ariella had predicted the Tigers would win 76 games, and Ariella’s track record, the rational thing for them to do would be to predict the Tigers will win 77 games. So Ariella has higher-order evidence that one might think will move her to change her prediction from 76 to 77.

Note carefully though what Ariella knows about herself. She knows that it is when she settles on a prediction that it is on average 1 game too low. If she decides that 76 wasn’t a settled prediction, but 77 is, then she has exactly the same reason to raise her prediction to 78. And if she settles on that, she has a reason to raise her prediction to 79, and so on. Higher-order evidence is an issue because someone can have evidence that they make systematic mistakes in forming beliefs on the basis of evidence. But those systematic mistakes could also concern how they form beliefs on the basis of higher-order evidence. Indeed, they could be the same systematic mistakes in both cases. What should be done?

Let’s start with three very bad ideas for Ariella. She should not simply follow the higher-order evidence where it leads, first raising her prediction to 77, then 78, then 79 and so on. After 87 steps, she will predict that the Tigers will win 163 games. Given that it is a 162-game season, this is not a good idea. Nor should she follow through as many steps of higher-order reasoning as she has the cognitive capacity to do. Assuming she has the ability to add 1 repeatedly, that will lead to the same flaw as above. And nor should she simply get out of the business of making predictions about baseball. (Compare Egan and Elga’s comment that AE should simply stop making judgments about where to turn; a comment that was about one particular case of course, and not a general piece of advice.) Given what I’ve said so far about Ariella, she’s really good at these kinds of predictions. Having a small systematic error like this is not that much of a flaw, given how good she otherwise is.
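To see how quickly the first of these bad ideas goes wrong, here is a small sketch of the iteration, using only the numbers stipulated in the example.

```python
# Repeatedly applying the "each settled prediction is one win too low"
# correction, starting from the original prediction of 76 wins.
prediction, steps = 76, 0
while prediction <= 162:          # 162 games is the whole season
    prediction += 1               # revise upward by one win and settle again
    steps += 1
print(steps, prediction)          # 87 steps later the 'prediction' is 163 wins
```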

There are three other strategies for dealing with higher-order evidence that are at least plausible. The first is the one I will defend. It is that Ariella should simply stick with her original prediction because it is the best prediction to make given her evidence. The second is that she should find some equilibrium point, where the higher-order evidence does not recommend a change of view. As stated, this view won’t say anything about what Ariella should do, because there is no equilibrium position. But perhaps the view could be extended to say that she should follow the first-order evidence if there is no equilibrium, so it will also say that she should stick with her original prediction. The third option is that Ariella should follow one step of higher-order evidence, then stop with the prediction that the Tigers will win 77 games. I’ll argue for the first option by arguing against the other two.

Start with the idea that Ariella should, if possible, settle on an equilibrium. The idea is that we avoid the regress by saying that when possible, rational agents should be such that when they add the fact that they made that judgment to their evidence, the rational judgment to make given the new evidence has the same content as the original judgment. So if one is rational, and predicts that p, the rational prediction given that one has made the prediction that p is still p.

Note that this isn’t as strong a requirement as it may first seem. The requirement is not that any time an agent makes a judgment (or prediction), rationality requires that they say on reflection that it is the correct judgment. Rather, the requirement is that when possible, rational agents make those judgments that, on reflection, they would reflectively endorse. We can think of this as a kind of ratifiability constraint on judgment, like the ratifiability constraint on decision making that Richard Jeffrey (1983) uses to handle Newcomb cases.

A judgment is ratifiable for agent S just in case the rational judgment for S to make conditional on her having made that judgment has the same strength and content as the original judgment. The regress is blocked by saying rational agents make ratifiable judgments when possible. If the agent does do that, there isn’t much of a problem with the regress; once she gets to the first level, she has a stable view, even once she reflects on it.

This assumption, that only ratifiable judgments are rational, drives much of the argumentation in Egan and Elga’s paper on self-confidence; it is a serious option. As the comparison to Jeffrey suggests, it has some historical pedigree. And though this would take much longer to show, it is probably the best way to make sense of the emphasis on equilibrium concepts in game theory. Nevertheless it is false. I’ll first note one puzzling feature of the view, then one clearly false implication of the view.

The puzzling feature is that, as we have already seen, there need not be any ratifiable judgment to make. So the view will be somewhat incomplete. But maybe that isn’t such a bad thing. We imagine the ratifiability theorist saying the following two things. (This isn’t the only way to extend the ratifiability view, but I won’t be objecting to this extension.)

  1. It is important to make ratifiable judgments. Any judgment that is not ratifiable is not rational.
  2. It is better, other things being equal, to have judgments that track the evidence.

This view will say that Ariella faces an epistemic dilemma. Anything she does will be to some extent irrational, since it will not be ratifiable. But the least bad option will be to predict that the Tigers will win 76 games, as she does. If you think epistemic dilemmas are impossible, you won’t like this way of thinking about Ariella. But I don’t think the arguments against epistemic dilemmas are particularly strong. If this were the worst thing to say about the ratifiability view, then it would look like a reasonable view.

But it isn’t the worst thing to say about the ratifiability view. The problem arises in cases where there is a ratifiable judgment. Change the case a little so Ariella doesn’t tend to overpredict Tigers losses by 1 game; she tends to overpredict them by 1%. So if she predicts the Tigers will lose 86 games, an outsider going off that prediction and her track record wouldn’t predict the Tigers will lose 85 games, they will predict the Tigers will lose 85.14 games. (Remember given the stipulated meaning of ‘predict’ we’re using here, it can be perfectly sensible to predict that teams will win a fractional number of games. Indeed, there is no particular reason to think that the centre of the credal distribution over Tiger wins will fall on an integer. Remember also that there are 162 games in a season, so predicting 76 wins just is predicting 86 losses.)

Changing the expected error from a game to a percent doesn’t seem like a big change at first blush. But now there is a ratifiable prediction for Ariella. It is that the Tigers will win 162 games, and lose 0. So if we think Ariella should make ratifiable predictions where possible, we should conclude that whatever her evidence about the Tigers’ hitting, pitching and fielding, she should predict they will win all 162 games in the season. This can’t be right.
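That the only ratifiable prediction is 162 wins can be checked directly; here is a sketch using only the stipulated numbers (an initial prediction of 86 losses, and a correction that shaves 1% off any settled loss prediction).

```python
# In the percentage version the correction takes a predicted loss total L to
# 0.99 * L, so the only fixed point is L = 0, i.e. 162 wins and 0 losses.
losses = 86.0                       # the evidentially best initial prediction
for _ in range(2000):
    losses *= 0.99                  # each settled prediction overstates losses by 1%
print(round(losses, 6))             # ~0.0: iterated correction is driven toward 0 losses
print(0.99 * 0 == 0)                # True: 0 losses (162 wins) is the unique ratifiable prediction
```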

This kind of case proves that it isn’t always rational to have ratifiable credences. It would take us too far afield to discuss this in detail, but it is interesting to think about the comparison between the kind of case I just discussed, and the objections to backwards induction reasoning in decision problems that have been made by Pettit and Sugden (1989), and by Stalnaker (1998). The backwards induction reasoning they criticise is a development of the idea that judgments should be ratifiable. And the clearest examples of when that idea fails are cases where there is a unique ratifiable judgment, and it is a judgment that first order considerations tell strongly against. The example of Ariella has, quite intentionally, a similar structure.

The other option for blocking the regress is to say that there is something special about the first revision. So if Ariella predicts that the Tigers will win 76 games, that screens her evidence about the Tigers’ hitting, pitching and fielding. But if she changes her mind and predicts that they will win 77 games, on the basis of the higher order evidence, that doesn’t screen her original prediction that they will win 76. So the regress doesn’t even get started. This is structurally similar to a move that Adam Elga (2010) makes about disagreement. He argues that we should adjust our views about first-order matters in (partial) deference to our peers, but we shouldn’t adjust our views about the right response to disagreement in the same way.

It’s hard to see what could motivate such a position, either about disagreement or about screening. It’s true that we need some kind of stopping point to avoid these regresses. But the most natural stopping point is before the first revision. Consider a toy example. It’s common knowledge that there are two apples and two oranges in the basket, and no other fruit. (And that no apple is an orange.) Two people disagree about how many pieces of fruit there are in the basket. A thinks that there are four, B thinks that there are five, and both of them are equally confident. Two other people, C and D, disagree about what A and B should do in the face of this disagreement. All four people regard each other as peers. Let’s say C’s position is the correct one (whatever that is) and D’s position is incorrect. Elga’s position is that A should partially defer to B, but C should not defer to D. This is, intuitively, just back to front. A has evidence that immediately and obviously entails the correctness of her position. C is making a complicated judgment about a philosophical question where there are plausible and intricate arguments on each side. The position C is in is much more like the kind of case where experience suggests a measure of modesty and deference can lead us away from foolish errors. If anyone should be sticking to their guns here, it is A, not C.

The same thing happens when it comes to screening. Remove B from the example and instead assume that A has some evidence that (a) she has made some mistakes on simple sums in the past, but (b) tends to massively over-estimate the likelihood that she’s made a mistake on any given puzzle. What should she do? One option, in my view the correct one, is that she should believe that there are four pieces of fruit in the basket, because that’s what the evidence obviously entails. Another option is that she should be not very confident there are four pieces of fruit in the basket, because she makes mistakes on these kinds of sums. Yet another option is that she should be pretty confident (if not completely certain) that there are four pieces of fruit in the basket, because if she were not very confident about this, this would just be a manifestation of her over-estimation of her tendency to err. The ‘solution’ to the regress we’re considering here says that the second of these three reactions is the uniquely rational reaction. The idea behind the solution is that we should respond to the evidence provided by first-order judgments, and correct that judgment for our known biases, but that we shouldn’t in turn correct for the flaws in our self-correcting routine. I don’t see what could motivate such a position. Either we just rationally respond to the evidence, and in this case just believe there are four pieces of fruit in the basket, or we keep correcting for errors we make in any judgment and start a regress.

11.6 Laundering

In the definition of JSE, I said it was restricted to rational judgments. This was to avoid a simple counterexample to the view. (I’m indebted here to Vann McGee for pointing out the need for this.) Vieno is usually a pretty reliable judge, and he’s not currently drunk or otherwise incapacitated. But he makes a mistake, as we all do sometimes, and forms the belief that p on the basis of massively insufficient evidence. This is rather irrational. Again, that’s not to say that Vieno himself is irrational, but he does have a particular irrational view.

Now assume that JSE were true in an unrestricted form. Vieno is a generally reliable judge. That he believes p is, on its own, pretty good evidence for p. If the underlying evidence E is screened off, then arguably the overall evidence does suggest that p, so Vieno’s belief does track his evidence after all. More generally, if unrestricted JSE is right, then it is impossible for someone who knows themselves to be generally reliable to have an irrational belief. So unrestricted JSE must be wrong.

But even if we restrict JSE to rational judgments, some problems remain. For one thing, we need some explanation of why such a restricted thesis should be true. That is, we need an explanation of why JSE should be extensionally adequate in just the cases where it agrees with ESJ. The normative externalist, who believes in ESJ, has a simple explanation for that. JSE is extensionally adequate when and only when it agrees with ESJ because ESJ is generally true. It isn’t clear what could be a similarly good explanation of why a restricted version of JSE holds.

Thinking through cases like Vieno’s can help motivate ESJ, and normative externalism more generally. There is something very strange about his case. On the one hand, the fact that a reliable person like Vieno believes that p should be some evidence for p. On the other hand, if Vieno still knows why he believes that p, still knows that E is the relevant evidence on which the belief was based, then believing that p still seems irrational. And that’s despite his knowing one important piece of evidence in favour of p, namely that he himself believes it.

It’s important to distinguish the claims I’ve made in the last paragraph from what Gilbert Harman (1986) says about a slightly different case. Imagine that a month later, Vieno has forgotten the evidence that led to the belief that p, but nevertheless believes p. There are two interesting variants of this example. In one, p has been stored in preservative memory over that time. In the second, Vieno bases a new belief that p on the memory of believing p a month ago, plus his general reliability. If Vieno was under no obligation to retain the evidence for p, then it is plausible in the second case that the new belief that p is rational. And if the belief is rational in that case, maybe it is rational in the case where p was stored in preservative memory too.

We’ve already discussed memory in some detail. Here I want to distinguish the following two kinds of cases. In one, Vieno has an apparent memory that p. In the other, he has a clear memory that E, and irrationally infers p from that. In the second, Vieno’s belief is irrational. But it is a mystery why this should be so, since he has this excellent evidence for p, from his own track record of success. ESJ explains this nicely, since that evidence is screened off. So the case of Vieno is both a problem for JSE, and a boon for ESJ. The case shows that JSE needs to be restricted, but it is hard to motivate any particular restriction. And ESJ offers a nice explanation of a puzzling fact, namely why Vieno’s track record is not in this case evidence for p.

Now ESJ is a strongly externalist thesis. It says that facts about one’s own judgment are not evidentially relevant to what judgment one makes, provided one has access to the evidence behind that judgment. And that suggests that the judgment should really just be judged on how it tracks the evidence, which is what the externalist says.

This point about laundering also offers a nice reply to a worry that I shouldn’t have drawn a commitment to JSE from the passages I quoted above from Egan, Elga, White and Christensen. Perhaps they are only committed to a weaker thesis, something like that JSE is true when mistakes have been made, or when the agent has good reason to believe mistakes have been made. I didn’t attribute such qualified theses to these epistemologists because the qualifications seem to make the theories worse. The qualified theories are still vulnerable to the regress arguments that we drew out of the examples involving Ariella. And the point about laundering shows that JSE is most plausible when it is restricted to cases where mistakes have not been made.

Ariella’s example doesn’t just show that JSE is wrong. It gives us an extra reason to doubt the higher order hypothesis. If that hypothesis is true, then whatever prediction Ariella makes, she should raise her prediction as soon as she realises that she has made it. But that isn’t plausible, since it leads from a reasonable starting point, a prediction of 76 wins, to an incoherent conclusion. So the higher order hypothesis is false, and the challenge it poses to normative externalism does not succeed.

11.7 Agents, States and Actions

With this discussion of regresses completed, we are in a position to evaluate an interesting alternative to my account of cases like Riika’s. The alternative I’ll discuss here says that if Riika does nothing in response to learning the higher order evidence, her resultant belief is perfectly acceptable, but this shows something bad about her. I’m going to first motivate such an alternative view, then suggest that the regresses we’ve discussed in this chapter pose a problem for it.

My account of Riika’s example is somewhat conciliatory. I say it could be right for Riika to change her credences, depending on just how the case is filled in. But there is much to be said for the less conciliatory view that the only rational belief for Riika to have is the one she started with. After all, that’s what her evidence supported, and she didn’t get any counter-evidence. So how do we explain the intuition that it would be bad to not change her mind? By postulating a break between the evaluations of Riika’s beliefs, on the one hand, and the evaluation of her actions, or of her, on the other.

It will help to have some slightly stipulative language available to discuss the cases. When agent S forms the belief that p, we can evaluate that belief, and the formation of it, in a number of distinct ways. First, we can ask whether the belief is well supported by her evidence. Let’s say that the belief is evident if so, and not evident if not. Second, we can ask whether the belief is supported by the totality of her reasons to believe. Let’s say that the belief is rational if so, and irrational if not. Third, we can ask whether an epistemically virtuous agent would have formed that belief. Let’s say that the agent is wise if she is so virtuous at the time the belief is formed, and unwise if she is not.2 Fourth, and last for now, we can ask whether the practice the agent follows when forming the belief is one that she ought, all things considered, to be following. Let’s say her practice is advisable if so, and inadvisable if not.

  • 2 In chapter 6 I noted that I’m using ‘wise’ for this kind of evaluation of agents, mostly following Maria Lasonen-Aarnio (2010, 2014a), though changing the terminology slightly.

What’s crucial to evidentialism, as I conceive of it, is that the evident and the rational coincide. It does not commit itself on whether following the evidence is what wise agents do, or whether following the evidence is always advisable.

Just as we can make this four-way distinction among beliefs, we can make a similar four-way distinction among actions. An agent looks at the evidence in favour of different decisions, and then takes a decision. We’ll assume, to simplify matters, that the agent has decent values in this process, so what’s at issue is how the agent’s doxastic system interacts with decisions to act. So we can describe actions as evident, rational, wise and advisable, with these terms having the same meanings as above.

With all these distinctions in mind, we can take another look at the cases that motivate higher-order theories. Consider, for instance, Adam Elga’s example of the pilot who has evidence that it is possible they are suffering from hypoxia  (Elga 2008). Is it obvious that it is irrational for them to believe that they have enough fuel for the trip, as their evidence supports?

Well, it does seem inadvisable for them to act as if they had enough fuel. But to get from premises about the inadvisability of action to conclusions about the irrationality of belief requires a lot of steps. We could imagine reasoning as follows.

  1. It is inadvisable to act as if one had enough fuel.
  2. So, it is inadvisable to believe one has enough fuel.
  3. So, it is unwise to believe one has enough fuel.
  4. So, it is irrational to believe one has enough fuel.

Put this bluntly, every step seems questionable. There could be distinct norms of action and belief. There could be distinct norms of advice and evaluation. And there could be distinct norms that apply at the level of agents from those that apply at the level of individual beliefs. Let’s look at these in order.

Once we see that there are a lot of distinct ways we can think the pilot goes wrong, it is wrong to insist that it is simply intuitive that the pilot has an irrational belief. The intuition is that something has gone wrong with the pilot; what in particular has gone wrong is a matter for theory. And perhaps what is being intuited is not anything at all about belief, but something about action. Perhaps it would be bad in some way to act on one’s evidence, even if it would be rational to believe based on that evidence.

Allan Coates (2012) has developed a form of this response to the examples that motivate internalist accounts of higher-order evidence. It isn’t just critics like Coates who have reacted in this way. Here is David Christensen making an argument that higher-order evidence matters to the rationality of belief.

If you doubt that my confidence should be reduced, ask yourself whether I’d be reasonable in betting heavily on the correctness of my answer. Or consider the variant where my conclusion concerns the drug dosage for a critical patient, and ask yourself if it would be morally acceptable for me to write the prescription without getting someone else to corroborate my judgment. Insofar as I’m morally obliged to corroborate, it’s because the information about my being drugged should lower my confidence in my conclusion.  (Christensen 2010, 195)

The thought for now is that the last line of this quote is simply false. There are all sorts of reasons it might be morally obligatory to corroborate even if the information about being drugged should not lower one’s confidence. It’s true that some forms of consequentialism about decision making will say that if confidence is not lowered, decisions should not change. But it is not at all compulsory to take a consequentialist attitude towards medical ethics. (Compare what Maria Lasonen-Aarnio (2014b, 430) says about rules governing the police.) And even if one is broadly consequentialist, Christensen’s conclusion still does not straightforwardly follow.

We should take seriously the possibility that this is a case where agents should not change their credences, but should change how they act. Now that will be incoherent if you think that one should always maximise expected utility. But let’s consider the possibility that this is a case where maximising expected utility is not the thing to do. It’s a striking fact that the standard arguments for the propriety of maximising expected utility are almost always question-begging against the most interesting opponents  (Maher 1997; Weatherson 1999). Imagine a theorist who says that the right thing to do is to maximise expected expected utility, and run your favourite argument for the propriety of maximising expected utility against them. In most cases you’ll find at some stage you’re just begging the question against them. Consider, for instance, arguments based on representation theorems. These typically include as a premise that if the agent is choosing between two bets, and they have the same cost and same payoff, she should choose the bet that is the more probable winner. But this is just to assume that, in a special range of cases, she should maximise expected utility rather than expected expected utility, or anything else, and that assumption is, in this context, question-begging.

I don’t mean this to be a serious argument against the view that we should maximise expected utility. Sometimes the best arguments for true positions are question-begging  (Lewis 1982). And a whole chapter of this book defends the claim that we can learn from circular arguments. Indeed I believe for independent reasons that we should maximise expected utility. But I do think it is worth thinking about the fact that the relevant intuitions about higher-order evidence seem in the first place to be intuitions about actions, and require some substantive assumptions to generate conclusions about beliefs.

After all, if Riika should maximise expected expected utility, then she should order more tests, or get someone to confirm her diagnosis, before she acts. And that is true even if she actually has good evidence that the patient has dengue fever, as long as she lacks good evidence that she has good evidence. And perhaps that is what we are intuiting when we intuit that she should not act. The intuitions about the case, then, are intuitions about action, but they don’t imply anything about belief without a substantive theory of the action-belief connection (i.e., that one should maximise expected utility), and that theory lacks independent support.

This is a way of debunking the intuitions Christensen endorses about Riika’s case. (And as noted many times, many other theorists have similar intuitions to Christensen’s about similar cases.) As it stands, I don’t accept this debunking story, because I accept the ‘substantive theory of the action-belief connection’, but this is a commitment that goes beyond normative externalism, and the rejection of level-crossing principles.

Let’s assume that that bridge has been crossed though, and we have reason (either intuition or argument) to believe it would be inadvisable for Riika to believe her patient has dengue fever. What follows? Nothing much, unless we assume a very tight connection between assessments of agents and assessments of states, or between assessments of strategies and assessments of states. And there are very good reasons for separating these assessments. Maria Lasonen-Aarnio (2010, 2014a) has argued for separating agent assessment from state assessment, and argued that the standard intuition here involves conflating the two. And John Hawthorne and Amia Srinivasan have argued for separating assessment of states from assessment of strategies for coming to those states  (Hawthorne and Srinivasan 2013).

Hawthorne and Srinivasan’s argument is that these assessments come apart in general, so we should not be surprised if they come apart here. In general, it makes sense to distinguish between what someone should do in a particular circumstance, and what the person would do if they had instilled the habits that would be most effective in the long run. They give an example from sports. Their example involves tennis, but the idea generalises. Given the range of possible installable habits, it might be that the best habit to instil is one that will lead to expending valuable energy on occasionally chasing after lost causes. They are particularly interested in an epistemic limit on possible habits; the fact that we don’t always know what we know means that we can’t always react perfectly to our knowledge. But there are many possible limitations on possible habits due to our physical and cognitive limitations. And any one of these limitations will produce a gap between the optimal thing to do in a situation given one’s knowledge or evidence, and what would be done if one had installed the optimal habits.

Now it may well be that the best habits we could have, given our cognitive nature, would involve second-guessing ourselves in cases like Riika’s. Certainly if we think that our instincts involve some kind of ‘optimism bias’  (Sharot 2012), then it will be advisable to instil habits to counteract that bias. And it is very plausible that the fact that someone did something because they were acting on the best habit they could have is largely excusing. (I would say it is completely excusing, but I’m a little sceptical that there are complete excuses.) It seems plausible that our norms of advice are tied more closely to the idea of what the best habits are to instil, rather than to what is best to do in each situation, so the thing to advise someone to do just is what they would do if they had the best possible habits.

But all these facts should not obscure the fact that these are all second-best situations. Our cognitive and physical limitations mean that we sometimes cannot do what we should. That’s why they are called limitations. So there are cases where the best thing to believe is what the evidence supports, but it is understandable and excusable to regard the matter as unsettled. And the grounds for the excuse are that the agent has the optimal habit for situations like this. But as theorists we should not ignore the fact that optimising habits is a second-best solution. Best would be to believe and confidently act. And it would be best to believe and act not because this would be a lucky guess, but because one has sufficient reason to act.

So why didn’t I just say all this in Chapters 7 through 9 rather than going through long detours about evidence and circularity? One reason is that we still need to explain the distinction between Riika’s case and Raina’s, and I’m not sure going via thoughts about advisability, wisdom or action will help with that. (This isn’t a coy way of saying I think it won’t; it’s just that I haven’t yet worked out a way to make it help.)

But a bigger reason is that we need to avoid the regresses. And the regresses suggest that policies like Adjust one’s credences to the higher-order evidence are actually not optimal habits. That would be a bad habit for Ariella to adopt. And it would be bad to advise Ariella to adjust her credences to her higher-order evidence. There is no sensible way for her to comply with that advice, and it is bad to give advice that cannot be sensibly complied with. And it would be bad to let higher-order evidence guide Ariella’s actions, since that would lead to betting on extreme results.

So I think that a broadly evidentialist approach is the best way to explain the cases. But it is worthwhile to note that there are good reasons to reject level-crossing principles about act or state evaluation, while accepting them about agent evaluation. And such approaches might end up saying more radical things about particular cases than I say in thoroughly rejecting level-crossing principles.