The Bayesian and the Dogmatist


University of Michigan


August 1, 2007


It has been argued recently that dogmatism in epistemology is incompatible with Bayesianism. That is, it has been argued that dogmatism cannot be modelled using traditional techniques for Bayesian modelling. I argue that our response to this should not be to throw out dogmatism, but to develop better modelling techniques. I sketch a model for formal learning in which an agent can discover a posteriori fundamental epistemic connections. In this model, there is no formal objection to dogmatism.

There is a lot of philosophically interesting work being done in the borderlands between traditional and formal epistemology. It is easy to think that this would all be one-way traffic. When we try to formalise a traditional theory, we see that its hidden assumptions are inconsistent or otherwise untenable. Or we see that the proponents of the theory had been conflating two concepts that careful formal work lets us distinguish. Either way, the formalist teaches the traditionalist a lesson about what the live epistemological options are. I want to argue, more or less by example, that the traffic here should be two-way. By thinking carefully about considerations that move traditional epistemologists, we can find grounds for questioning some presuppositions that many formal epistemologists make.

To make this more concrete, I’m going to be looking at a Bayesian objection to a certain kind of dogmatism about justification. Several writers have urged that the incompatibility of dogmatism with a kind of Bayesianism is a reason to reject dogmatism. I rather think that it is reason to question the Bayesianism. To put the point slightly more carefully, there is a simple proof that dogmatism (of the kind I envisage) can’t be modelled using standard Bayesian modelling tools. Rather than conclude that dogmatism is therefore flawed, I conclude that we need better modelling tools. I’ll spend a fair bit of this paper on outlining a kind of model that (a) allows us to model dogmatic reasoning, (b) is motivated by the epistemological considerations that motivate dogmatism, and (c) helps with a familiar problem besetting the Bayesian.

I’m going to work up to that problem somewhat indirectly. I’ll start with looking at the kind of sceptical argument that motivates dogmatism. I’ll then briefly rehearse the argument that shows dogmatism and Bayesianism are incompatible. Then in the bulk of the paper I’ll suggest a way of making Bayesian models more flexible so they are no longer incompatible with dogmatism. I’ll call these new models dynamic Keynesian models of uncertainty. I’ll end with a brief guide to the virtues of my new kind of model.

1 Sceptical Arguments

Let H be some relatively speculative piece of knowledge that we have, say that G. E. Moore had hands, or that it will snow in Alaska sometime next year. And let E be all of our evidence about the external world. I’m not going to make many assumptions about what E contains, but for now E will stay fairly schematic. Now a fairly standard sceptical argument goes something like this. Consider a situation S in which our evidence is unchanged, but in which H is false, such as a brain-in-vat scenario, or a zombie scenario, or a scenario where the future does not resemble the past. Now a fairly standard sceptical argument goes something like this.

  1. To know H we have to be in a position to know we aren’t in S
  2. We aren’t in a position to know that we aren’t in S
  3. So, we don’t know H

There are a few immediate responses one could make, but which I’m going to dismiss without argument here. These include claiming the setup is incoherent (as in, e.g., Williamson (2000)), rejecting the closure principle behind premise 1 (as in, e.g., Dretske (1971)), accepting the conclusion (the sceptical response), or saying that in different sceptical arguments, one or other of these positions is correct. Instead I want to look at responses that question premise 2. In particular, I want to look at responses that offer us reasons to accept premise 2, since it seems here that the sceptic is at her strongest. (If the sceptic merely insists that premise 2 is reasonable, we can reply either that it isn’t, as I’m inclined to think, or that here is a case where intuition should be revised.)

Many epistemologists will write papers responding to ‘the sceptic’. I think this is a mistake, since there are so many different possible sceptics, each with different arguments for premise 2. (And, of course, some sceptics do not argue from sceptical scenarios like this one.) Here are, for instance, three arguments that sceptics might give for premise 2.

  1. Someone in S can’t discriminate her situation from yours.
  2. Indiscriminability is symmetric.
  3. If you can’t discriminate our situation from S, you can’t know you’re not in S.
  4. So you can’t know you’re not in S.
  1. Someone in S has the same evidence as you do.
  2. What you can know supervenes on what your evidence is.
  3. So, you can’t know you are not in S.
  1. There is no non-circular argument to the conclusion that you aren’t in S.
  2. If you were able to know you’re not in S, you would be able to produce a non-circular argument that concluded that you aren’t in S.
  3. So you can’t know that you aren’t in S.

I won’t say much about these arguments, save that I think in each case the second premise is very implausible. I suspect that most non-philosophers who are moved by sceptical arguments are tacitly relying on one or other of these arguments, but confirming that would require a more careful psychological study than I could do. But set those aside, because there’s a fourth argument that is more troubling. This argument takes its inspiration from what we might call Hume’s exhaustive argument for inductive scepticism. Hume said that we can’t justify induction inductively, and we can’t justify it deductively, and that exhausts the justifications, so we can’t justify induction. A similar kind of argument helps out the general sceptic.

  1. If you know you aren’t in S, you know this a priori, or a posteriori
  2. You can’t know you aren’t in S a posteriori
  3. You can’t know you aren’t in S a priori
  4. So, you can’t know you aren’t in S

This seems to be a really interesting argument to me. To make things simpler, I’ll stipulate that by a posteriori knowledge, I just mean knowledge that isn’t a priori. That makes the first premise pretty secure, as long as we’re assuming classical logic.1 Lots of philosophers take its third premise for granted. They assume that since it is metaphysically possible that you could be in S, this can’t be something you can rule out a priori. That strikes me as a rather odd capitulation to infallibilism. But I won’t push that here. Instead I’ll look at denials of the second premise.

  • 1 Perhaps not a wise assumption around here, but one that I’ll make throughout in what follows.

  • 2 Dogmatism and a Bayesian Objection

    Someone who denies the second premise says that your empirical evidence can provide the basis for knowing that you aren’t in S, even though you didn’t know this a priori. I’m going to call such a person a dogmatist, for reasons that will become clear shortly. The dogmatist is not a sceptic, so the dogmatist believes that you can know H. The dogmatist also believes a closure principle, so the dogmatist also believes you can know E ⊃ H. If the dogmatist thought you could know E ⊃ H a priori, they’d think that you could know a priori that you weren’t in S. (This follows by another application of closure.) But they think that isn’t possible, so knowing E ⊃ H a priori isn’t possible. Hence you know E ⊃ H a posteriori.

    If we reflect on the fact that E is your total evidence, then we can draw two conclusions. The first is that the dogmatist thinks that you can come to know H on the basis of E even though you didn’t know in advance that if E is true, then H is true. You don’t, that is, need antecedent knowledge of the conditional in order to be able to learn H from E. That’s why I’m calling them a dogmatist. The second point is that the dogmatist is now running head on into a piece of Bayesian orthodoxy.

    To see the problem, note that we can easily prove (A), for arbitrary E, H and K.2

  • 2 Again, the proof uses distinctively classical principles, in particular the equivalence of A with (A ∧ B) ∨ (A ∧ ¬B.) But I will take classical logic for granted throughout. David Jehle pointed out to me that the proof fails without this classical assumption.

  • (A)

    Pr(E ⊃ H | E ∧ K) ⩽ Pr(E ⊃ H | K), with equality iff Pr(E ⊃ H | E ∧ K) = 1


    \[ \begin{aligned} 1. & \Pr(E \supset H | K) = \Pr(E \supset H | E \wedge K)\Pr(E | K) + \Pr(E \supset H | \neg E \wedge K)\Pr(\neg E | K) && \text{Theorem} \\ 2. & \Pr(E \supset H | \neg E \wedge K) = 1 && \text{Logic} \\ 3. & \Pr(E \supset H | E \wedge K) \leq 1 && \text{Theorem} \\ 4. & \Pr(E \supset H | K) \geq \Pr(E \supset H | E \wedge K)\Pr(E | K) + \Pr(E \supset H | E \wedge K)\Pr(\neg E | K) && \text{1, 2, 3} \\ 5. & \Pr(E | K) + \Pr(\neg E | K) = 1 && \text{Theorem} \\ 6. & \Pr(E \supset H | K) \geq \Pr(E \supset H | E \wedge K) && \text{4, 5} \end{aligned} \]

    It is clear enough from the proof that line 6 is an equality iff line 3 is an equality, so we have proven (A). Now some authors have inferred from this something like (B) from (A).3

  • 3 Roger White (2006) and Stewart Cohen (2005) endorse probabilistic arguments against people who are, in my sense, dogmatists. John Hawthorne (2002) also makes a similar argument when arguing that certain conditionals, much like E ⊃ H, are a priori.

  • (B)
    It is impossible to go from not being in a position to know E ⊃ H to being in a position to know it just by receiving evidence E.

    The transition here should raise an eyebrow or two. (A) is a principle of probability statics. (B) is a principle of epistemological kinematics. To get from (A) to (B) we need a principle linking probability and epistemology, and a principle linking statics and kinematics. Fortunately, orthodox Bayesian confirmation theory offers us suggestions for both principles. We’ll write Cr(A) for the agent’s credence in A, and CrE(A) for the agent’s credence in A when updated by receiving evidence E.

    If CrE(A) ⩽ Cr(A), then it is impossible to go from not being in a position to know A to being in a position to know it just by receiving evidence E.
    CrE(A) = Cr(A | E). That is, learning goes by conditionalisation.

    A quick browse at any of the literature on Bayesian confirmation theory will show that these principles are both widely accepted by Bayesians. Philosophers, even Bayesians, make false assumptions, so neither of these principles is obviously true. Nevertheless, I’m going to accept Learning at least for the sake of argument. I’m going to argue instead that the inference from (A) to (B) fails because Bayes fails. That is, I’m going to accept that if we could prove a principle I’ll call Lower is true, then dogmatism in the sense I’m defending it fails.

    CrE(E ⊃ H) is less than or equal to Cr(E ⊃ H).

    Now there is a bad argument around here that the dogmatist might make. It might be argued that since the Bayesian approach (including Bayes) involves so much idealisation it could not be applicable to real agents. That’s a bad argument because the Bayesian approach might provide us with a good model for real agents, and models can be useful without being scale models. As long as the Bayesian model is the most appropriate model in the circumstances, then we can draw conclusions for the real world from facts about the model. The problem arises if there are alternative models which seem to fit just as well, but in which principles like Lower are not true. If there are alternative models that seem better suited (or at least just as well suited) to modelling the situation of initial evidence acquisition, and those models do not make Lower true, then we might think the derivation of Lower in the Bayesian model is a mere consequence of the arbitrary choice of model. In the next section I will develop just such a model. I won’t argue that it is the best model, let alone the only alternative to the Bayesian model. But I will argue that it is as good for these purposes as the Bayesian model, and it does not imply Lower.

    3 Bayes and Keynes

    The traditional Bayesian model of a rational agent starts with the following two principles.

    • At any moment, the agent’s credal states are represented by a probability function.
    • From moment to moment, the agent’s credal states are updated by conditionalisation on the evidence received.

    Over recent decades many philosophers have been interested in models that relax those assumptions. One particular model that has got a lot of attention (from e.g. Isaac Levi (1974, 1980), Richard Jeffrey (1983), Bas Fraassen (1990), Alan Hajek (2000, 2003) and many others) is what I’ll call the static Keynesian model. This model has the following features.

    • At any moment, the agent’s credal states are represented by a set of probability functions, called their representor.
    • The agent holds that p is more probable than q iff the probability of p is greater than the probability of q according to all probability functions in their representor. The agent holds that p and q are equally probable iff the probability of p is equal to the probability of q according to all probability functions in their representor.
    • From moment to moment, the agent’s credal states are updated by conditionalising each of the functions in the representor on the evidence received.

    The second point is the big attraction. It allows that the agent need not hold that p is more probable than q, or q more probable than p, or that p and q are equally probable, for arbitrary p and q. And that’s good because it isn’t a rationality requirement that agents make pairwise probability judgments about all pairs of propositions. Largely because of this feature, I argued in an earlier paper that this model could be use to formalise the key philosophical ideas in Keynes’s Treatise on Probability. That’s the reason I call this a ‘Keynesian’ model.

    The modifier ‘static’ might seem a little strange, because the agent’s representor does change when she receives new evidence. But the change is always of a certain kind. Her ‘hypothetical priors’ do not change. If at t1 her evidence is E1 and her representor R1, and at t2 her evidence is E2 and her representor R2, then there is a ‘prior’ representor R0 such that the following two claims are true for all probability functions Pr.

    • PrR1 ↔︎ [∃Pr0R0: ∀p (Pr(p) = Pr0(p | E1))]
    • PrR2 ↔︎ [∃Pr0R0: ∀p (Pr(p) = Pr0(p | E2))]

    That is, there is a set of probability functions such that the agent’s representor at any time is the result of conditionalising each of those functions on her evidence. I’ll call any model with this property a static model, so the model described above is the static Keynesian model.

    Now there is a lot to like about the static Keynesian model, and I have made extensive use of it previous work. It is a particularly useful model to use when we need to distinguish between risk and uncertainty in the sense that these terms are used in Keynes’s 1937 article “The General Theory of Employment”.4 The traditional Bayesian model assumes that all propositions are risky, but in real life some propositions are uncertain as well, and in positions of radical doubt, where we have little or no empirical evidence, presumably most propositions are extremely uncertain. And using the static Keynesian model does not mean we have to abandon the great work done in Bayesian epistemology and philosophy of science. Since a Bayesian model is a (degenerate) static Keynesian model, we can say that in many circumstances (namely circumstances where uncertainty can be properly ignored) the Bayesian model will be appropriate. Indeed, these days it is something like a consensus among probabilists or Bayesians that the static Keynesian model is a useful generalisation of the Bayesian model. For example in Christensen (2005) it is noted, almost as an afterthought, that the static Keynesian model will be more realistic, and hence potentially more useful, than the traditional Bayesian model. Christensen doesn’t appear to take this as any kind of objection to Bayesianism, and I think this is just the right attitude.

  • 4 The clearest statement of the distinction that I know is from that paper.

    By ‘uncertain’ knowledge, let me explain, I do not mean merely to distinguish what is known for certain from what is only probable. The game of roulette is not subject, in this sense, to uncertainty; nor is the prospect of a Victory bond being drawn. Or, again, the expectation of life is only slightly uncertain. Even the weather is only moderately uncertain. The sense in which I am using the term is that in which the prospect of a European war is uncertain, or the price of copper and the rate of interest twenty years hence, or the obsolescence of a new invention, or the position of private wealth owners in the social system in 1970. About these matters there is no scientific basis on which to form any calculable probability whatever. We simply do not know. Nevertheless, the necessity for action and decision compels us as practical men to do our best to overlook this awkward fact and to behave exactly as we should if we had behind us a good Benthamite calculation of a series of prospective advantages and disadvantages, each multiplied by its appropriate probability, waiting to be summed. (Keynes 1937, 114–15)

  • But just as the static Keynesian is more general than the Bayesian model, there are bound to be interesting models that are more general than the static Keynesian model. One such model is what I call the dynamic Keynesian model. This model has been used by Seth Yalcin to explicate some interesting semantic theories, but to the best of my knowledge it has not been used for epistemological purposes before. That should change. The model is like the static Keynesian model in its use of representors, but it changes the way updating is modelled. When an agent with representor R receives evidence E, she should update her representor by a two step process.

    • Replace R with U(R, E)
    • Conditionalise U(R, E), i.e. replace it with {Pr(E): Pr is in U(R, E)}

    In this story, U is a function that takes two inputs: a representor and a piece of evidence, and returns a representor that is a subset of the original representor. Intuitively, this models the effect of learning, via getting evidence E, what evidential relationships obtain. In the static Keynesian model, it is assumed that before the agent receives evidence E, she could already say which propositions would receive probabilistic support from E. All of the relations of evidential support were encoded in her conditional probabilities. There is no place in the model for learning about fundamental evidential relationships. In the dynamic Keynesian model, this is possible. When the agent receives evidence E, she might learn that certain functions that were previously in her representor misrepresented the relationship between evidence and hypotheses, particularly between evidence E and other hypotheses. In those cases, U(R, E) will be her old representor R, minus the functions that E teaches her misrepresent these evidential relationships.

    The dynamic Keynesian model seems well suited to the dogmatist, indeed to any epistemological theory that allows for fundamental evidential relationships to be only knowable a posteriori. As we’ll see below, this is a reason to stop here in the presentation of the model and not try and say something systematic about the behaviour of U. Instead of developing the model by saying more about U, we should assess it, which is what I’ll do next.

    4 In Defence of Dynamism

    In this section I want go over three benefits of the dynamic Keynesian model, and then say a little about how it relates to the discussion of scepticism with which we opened. I’m not going to say much about possible objections to the use of the model. That’s partially for space reasons, partially because what I have to say about the objections I know of is fairly predictable, and partially because the model is new enough that I don’t really know what the strongest objections might be. So here we’ll stick to arguments for the view.

    4.1 The Dogmatist and the Keynesian

    The first advantage of the dynamic Keynesian model is that because it does not verify Lower, it is consistent with dogmatism. Now if you think that dogmatism is obviously false, you won’t think this is much of an advantage. But I tend to think that dogmatism is one of the small number of not absurd solutions to a very hard epistemological problem with no obvious solution, so we should not rule it out pre-emptively. Hence I think our formal models should be consistent with it. What is tricky is proving that the dynamic Keynesian model is indeed consistent with it.

    To see whether this is true on the dynamic Keynesian model, we need to say what it is to lower the credence of some proposition. Since representors map propositions onto intervals rather than numbers, we can’t simply talk about one ‘probability’ being a smaller number than another.5 On the static Keynesian model, the most natural move is to say that conditionalisation on E lowers the credence of p iff for all Pr in the representor, Pr(p) > Pr(p | E). This implies that if every function in the representor says that E is negatively relevant to p, then conditionalising on E makes p less probable. Importantly, it allows this even if the values that Pr(p) takes across the representor before and after conditionalisation overlap. So what should we say on the dynamic Keynesian model? The weakest approach that seems viable, and not coincidentally the most plausible approach, is to say that updating on E lowers the credence of p iff the following conditions are met:

  • 5 Strictly speaking, the story I’ve told so far does not guarantee that for any proposition p, the values that Pr(p) takes (for Pr in the representor) form an interval. But it is usual in more detailed presentations of the model to put constraints on the representor to guarantee that happens, and I’ll assume we’ve done that.

    • For all Pr in U(R, E), Pr(p | E) < Pr(p)
    • For all Pr in R but not in U(R, E), there is a Pr′ in U(R, E) such that Pr′(p | E) < Pr(p)

    It isn’t too hard to show that for some models, updating on E does not lower the credence of E ⊃ H, if lowering is understood this way. The following is an extreme example, but it suffices to make the logical point. Let R be the minimal representor, the set of all probability functions that assign probability 1 to a priori certainties. And let U(R, E) be the singleton of the following probability function, defined only over Boolean combinations of E and H: Pr(E ∧ H) = Pr(E ∧¬H) = PrE ∧ H) = PrE ∧¬H) = ¼. Then the probability of E ⊃ H after updating is ¾. (More precisely, according to all Pr in U(R, E), Pr(E ⊃ H) = ¾.) Since before updating there were Pr in R such that Pr(E ⊃ H) < ¾, in fact there were Pr in R such that Pr(E ⊃ H) = 0, updating on E did not lower the credence of E ⊃ H. So the dynamic Keynesian model does not, in general, have as a consequence that updating on E lowers the credence of E ⊃ H. This suggests that Lower in general is not true.

    It might be objected that if evidence E supports our knowledge that E ⊃ H, then updating on E should raise the credence of E ⊃ H. And if we define credence raising the same way we just defined credence lowering, updating on E never raises the credence of E ⊃ H. From a Keynesian perspective, we should simply deny that evidence has to raise the credence of the propositions known on the basis of that evidence. It might be sufficient that getting this evidence removes the uncertainty associated with those propositions. Even on the static Keynesian model, it is possible for evidence to remove uncertainty related to propositions without raising the probability of that proposition. A little informally, we might note that whether an agent with representor R is sufficiently confident in p to know that p depends on the lowest value that Pr(p) takes for PrR, and updating can raise the value of this ‘lower bound’ without raising the value of Pr(p) according to all functions in R, and hence without strictly speaking raising the credence of p.

    The above illustration is obviously unrealistic, in part because U could not behave that way. It’s tempting at this stage to ask just how U does behave so we can work out if there are more realistic examples. Indeed, it’s tempting to try to attempt to provide a formal description of U. This temptation should be resisted. The whole point of the model is that we can only learn which hypotheses are supported by certain evidence by actually getting that evidence. If we could say just what U is, we would be able to know what was supported by any kind of evidence without getting that evidence. The best we can do with respect to U is to discover some of its contours with respect to evidence much like our own. And the way to make those discoveries will be to do scientific and epistemological research. It isn’t obvious that, say, looking for nice formal properties of U will help at all.

    4.2 The Problem of the Priors

    One really nice consequence of the dynamic Keynesian approach is that it lets us say what the representor of an agent with no empirical information should be. Say a proposition is a priori certain iff it is a priori that all rational agents assign credence 1 to that proposition. Then the representor of the agent with no empirical evidence is {Pr: ∀p: If p is a priori certain, then Pr(p) = 1}. This is the minimal representor I mentioned above. Apart from assigning probability 1 to the a priori certainties, the representor is silent. Hence it treats all propositions that are not a priori certain in exactly the same way. This kind of symmetric treatment of propositions is not possible on the traditional Bayesian conception for logical reasons. (The reasons are set out in the various discussions of the paradoxes of indifference, going back to Bertrand (1888).) Such a prior representor is consistent with the static Keynesian approach, but it yields implausible results, since conditionalising on E has no effect on the distribution of values of Pr(p) among functions in the representor for any p not made a priori certain by E. (We’ll say p is made a priori certain by E iff Ep is a priori certain.) So if this is our starting representor, we can’t even get probabilistic evidence for things that are not made certain by our evidence.6 So on the static Keynesian model, this attractively symmetric prior representor is not available.

  • 6 The argument in the text goes by a little quickly, because I’ve defined representors in terms on unconditional probabilities and this leads to complications to do with conditionalising on propositions of zero probability. A better thing to do, as suggested by Hájek (2003), is to take conditional probability as primitive. If we do this we’ll define representors as sets of conditional probability functions, and the a priori representor will be {Pr: If pq is a priori certain, then Pr(q p) = 1}. Then the claim in the text will follow.

  • I think one of the motivations of anti-dogmatist thinking is the thought that we should be able to tell a priori what is evidence for what. If it looking like there is a cow in front of us is a reason to think there is a cow in front of us, that should be knowable a priori. I think the motivation for this kind of position shrinks a little when we realise that an a priori prior that represented all the connections between evidence and hypotheses would have to give us a lot of guidance as to what to do (epistemically speaking) in worlds quite unlike our own. Moreover, there is no reason we should have lots that information. So consider, for a minute, a soul in a world with no spatial dimensions and three temporal dimensions, where the primary source of evidence for souls is empathic connection with other souls from which they get a (fallible) guide to those souls’ mental states. When such a soul conditionalises on the evidence “A soul seems to love me” (that’s the kind of evidence they get) what should their posterior probability be that there is indeed a soul that loves them? What if the souls have a very alien mental life, so they instantiate mental concepts very unlike our own, and souls get fallible evidence of these alien concepts being instantiated through empathy? I think it’s pretty clear we don’t know the answers to these questions. (Note that to answer this question we’d have to know which of these concepts were grue-like, and which were projectable, and there is no reason to believe we are in a position to know that.) Now those souls are presumably just as ignorant about the epistemologically appropriate reaction to the kinds of evidence we get, like seeing a cow or hearing a doorbell, as we are about their evidence. The dynamic Keynesian model can allow for this, especially if we use the very weak prior representor described above. When we get the kind of evidence we actually get, the effect of U is to shrink our representors to sets of probability functions which are broadly speaking epistemically appropriate for the kind of world we are in. Before we got that evidence, we didn’t know how we should respond to it, just like the spaceless empathic souls don’t know how to respond to it, just like we don’t know how to respond to their evidence.

    It is a commonplace observation that (a) prior probabilities are really crucial in Bayesian epistemology, but (b) we have next to no idea what they look like. I call this the problem of the priors, and note with some satisfaction that the dynamic Keynesian model avoids it. Now a cynic might note that all I’ve done is replace a hand-wavy story about priors with a hand-wavy story about updating. That’s true, but nevertheless I think this is progress. The things I’m being deliberately unclear about, such as what U should look like for E such as “Some other non-spatial tri-temporal soul seems to love me” are things that (a) my theory says are not a priori knowable, and (b) I don’t have any evidence concerning. So it isn’t surprising that I don’t have much to say about them. It isn’t clear that the traditional Bayesian can offer any story, even by their own lights, as to why they are less clear about the structure of the prior probability conditional on such an E.

    4.3 The Problem of Old Evidence

    When we get evidence E, the dynamic Keynesian model says that we should do two things. First, we should throw out some probability functions in our representor. Second, we should conditionalise those that remain. But this is a normative condition, not a description of what actually happens. Sometimes, when we get evidence E, we may not realise that it is evidence that supports some theory T. That is, we won’t sufficiently cull the representor of those probability functions where the probability of T given E is not high. Housecleaning like this is hard, and sometimes we only do it when it becomes essential. In this case, that means we only do it when we start paying serious attention to T. In that case we may find that evidence E, evidence we’ve already incorporated, in the sense of having used in conditionalisation, gives us reason to be more confident than we were in T. In such a case we’ll simply cull those functions where probability of T given E is not high, and we will be more confident in T. That’s how old evidence can be relevant on the dynamic Keynesian model. Since we have a story about how old evidence can be relevant, there is no problem of old evidence for the dynamic Keynesian.

    Famously, there is a problem of old evidence for traditional Bayesians. Now I’m not going to rehearse all the arguments concerning this problem to convince you that this problem hasn’t been solved. That’s in part because it would take too long and in part because I’m not sure myself that it hasn’t been solved. But I will note that if you think the problem of old evidence is a live problem for traditional Bayesians, then you have a strong reason for taking the dynamic Keynesian model seriously.

    4.4 Why Should We Care?

    The sceptic’s opening move was to appeal to our intuition that propositions like E ⊃ H are unknowable. We then asked what reasons we could be given for accepting this claim, because the sceptic seems to want to derive quite a lot from a raw intuition. The sceptic can respond with a wide range of arguments, four of which are mentioned above. Here we focussed on the sceptic’s argument from exhaustion. E ⊃ H isn’t knowable a priori, because it could be false, and it isn’t knowable a posteriori, because, on standard models of learning, our evidence lowers its credibility. My response is to say that this is an artefact of the model the sceptic (along with everyone else) is using. There’s nothing wrong with using simplified models, in fact it is usually the only way to make progress, but we must be always wary that our conclusions transfer from the model to the real world. One way to argue that a conclusion is a mere artefact of the model is to come up with a model that is sensitive to more features of reality in which the conclusion does not hold. That’s what I’ve done here. The dynamic Keynesian model is sensitive to the facts that (a) there is a distinction between risk and uncertainty and (b) we can learn about fundamental evidential connections. In the dynamic Keynesian model, it isn’t true that our evidence lowers the probability of E ⊃ H. So the anti-sceptic who says that E ⊃ H is knowable a posteriori, the person I’ve called the dogmatist, has a defence against this Bayesian argument. If the response is successful, then there may well be other applications of the dynamic Keynesian model, but for now I’m content to show how the model can be used to defend the dogmatic response to scepticism.[^7]


    Bertrand, Joseph Louis François. 1888. Calcul Des Probabilités. Paris: Gauthier-Villars et fils.
    Christensen, David. 2005. Putting Logic in Its Place. Oxford: Oxford University Press.
    Cohen, Stewart. 2005. “Why Basic Knowledge Is Easy Knowledge.” Philosophy and Phenomenological Research 70 (2): 417–30. doi: 10.1111/j.1933-1592.2005.tb00536.x.
    Dretske, Fred. 1971. “Conclusive Reasons.” Australasian Journal of Philosophy 49 (1): 1–22. doi: 10.1080/00048407112341001.
    Fraassen, Bas van. 1990. “Figures in a Probability Landscape.” In Truth or Consequences, edited by J. M. Dunn and A. Gupta, 345–56. Amsterdam: Kluwer.
    Hájek, Alan. 2000. “Objecting Vaguely to Pascal’s Wager.” Philosophical Studies 98: 1–16. doi: 10.1023/A:1018329005240.
    ———. 2003. “What Conditional Probability Could Not Be.” Synthese 137 (3): 273–323. doi: 10.1023/B:SYNT.0000004904.91112.16.
    Hawthorne, John. 2002. “Deeply Contingent a Priori Knowledge.” Philosophy and Phenomenological Research 65 (2): 247–69. doi: 10.1111/j.1933-1592.2002.tb00201.x.
    Jeffrey, Richard. 1983. “Bayesianism with a Human Face.” In Testing Scientific Theories, edited by J. Earman (ed.). Minneapolis: University of Minnesota Press.
    Keynes, John Maynard. 1937. “The General Theory of Employment.” Quarterly Journal of Economics 51 (2): 209–23. doi: 10.2307/1882087. Reprinted in , references to reprint.
    Levi, Isaac. 1974. “On Indeterminate Probabilities.” Journal of Philosophy 71 (13): 391–418. doi: 10.2307/2025161.
    ———. 1980. The Enterprise of Knowledge. Cambridge, MA.: MIT Press.
    White, Roger. 2006. “Problems for Dogmatism.” Philosophical Studies 131 (3): 525–57. doi: 10.1007/s11098-004-7487-9.
    Williamson, Timothy. 2000. Knowledge and its Limits. Oxford University Press.


    BibTeX citation:
      author = {Weatherson, Brian},
      title = {The {Bayesian} and the {Dogmatist}},
      journal = {Proceedings of the Aristotelian Society},
      volume = {107},
      number = {2},
      pages = {169-185},
      date = {2007-08},
      url = {},
      doi = {10.1111/j.1467-9264.2007.00217.x},
      langid = {en},
      abstract = {It has been argued recently that dogmatism in epistemology
        is incompatible with Bayesianism. That is, it has been argued that
        dogmatism cannot be modelled using traditional techniques for
        Bayesian modelling. I argue that our response to this should not be
        to throw out dogmatism, but to develop better modelling techniques.
        I sketch a model for formal learning in which an agent can discover
        a posteriori fundamental epistemic connections. In this model, there
        is no formal objection to dogmatism.}