From Classical to Intuitionistic Probability

logic games and decisions

We generalize the Kolmogorov axioms for probability calculus to obtain conditions defining, for any given logic, a class of probability functions relative to that logic, coinciding with the standard probability functions in the special case of classical logic but allowing consideration of other classes of “essentially Kolmogorovian” probability functions relative to other logics. We take a broad view of the Bayesian approach as dictating inter alia that from the perspective of a given logic, rational degrees of belief are those representable by probability functions from the class appropriate to that logic. Classical Bayesianism, which fixes the logic as classical logic, is only one version of this general approach. Another, which we call Intuitionistic Bayesianism, selects intuitionistic logic as the preferred logic and the associated class of probability functions as the right class of candidate representions of epistemic states (rational allocations of degrees of belief). Various objections to classical Bayesianism are, we argue, best met by passing to intuitionistic Bayesianism – in which the probability functions are taken relative to intuitionistic logic – rather than by adopting a radically non-Kolmogorovian, e.g. non-additive, conception of (or substitute for) probability functions, in spite of the popularity of the latter response amongst those who have raised these objections. The interest of intuitionistic Bayesianism is further enhanced by the availability of a Dutch Book argument justifying the selection of intuitionistic probability functions as guides to rational betting behaviour when due consideration is paid to the fact that bets are settled only when/if the outcome betted on becomes known.

Brian Weatherson http://brian.weatherson.org (University of Michigan)https://umich.edu
September 1 2001

Introduction

It is a standard claim of modern Bayesian epistemology that reasonable epistemic states should be representable by probability functions. There have been a number of authors who have opposed this claim. For example, it has been claimed that epistemic states should be representable by Zadeh’s fuzzy sets, Dempster and Shafer’s evidence functions, Shackle’s potential surprise functions, Cohen’s inductive probabilities or Schmeidler’s non-additive probabilities.1 A major motivation of these theorists has been that in cases where we have little or no evidence for or against \(p\), it should be reasonable to have low degrees of belief in each of \(p\) and \({\lnot}\)\(p\), something apparently incompatible with the Bayesian approach. There are two broad types of response to this situation, the second of which shows the incompatibility just mentioned is more apparent than real. The first of these – much in evidence in the work of the writers just cited – is to replace or radically reconstrue the notion of probability taken by that approach to represent degrees of belief. The second – to be defended here – seeks to maintain the core of standard probability theory but to generalize the notion of a probability function to accommodate variation in the background logic of the account; this allows us to respond to such issues as the low degree of belief in a proposition and its negation by simply weakening the background logic from classical to intuitionistic logic. Thus if Bayesianism is construed as in our opening sentence, one way to respond to the objections of the heterodox writers listed above is to trade in classical Bayesianism for intuitionistic Bayesianism. Since for many theorists at least the motivation for their opposition to Bayesianism is grounded in either verificationism or anti-realism, a move to a intuitionistic theory of probability seems appropriate. Indeed, as Harman (1983) notes, the standard analysis of degrees of belief as dispositions to bet leads naturally to a intuitionistic theory of probability. We give a Dutch Book argument in defence of constructive Bayesianism in Section 4 below.

The appropriate generalization of the notion of a probability function makes explicit allowance for a sensitivity to the background logic. The latter we identify with a consequence relation, such as, in particular, the consequence relation \(\vdash_{CL}\) associated with classical logic or the consequence relation \(\vdash_{IL}\) associated with intuitionistic logic. To keep things general, we assume only that the languages under discussion have two binary connectives: \({\vee}\) and \({\wedge}\). No assumptions are made about how a consequence relation on such a language treats compounds formed using these connectives, though of course in the cases in which we are especially interested, \(\vdash_{CL}\) and \(\vdash_{IL}\), such compounds have the expected logical properties. We take the language of these two consequences relations to be the same, assuming in particular that negation (\({\lnot}\)) is present for both. Finally, if \(A\) belongs to the language of a consequence relation \(\vdash\), then we say that \(A\) is a \(\vdash\)-thesis of \(\vdash\) \(A\) and that \(A\) is a \(\vdash\)-antithesis if for all \(B\) in that language \(A\) \(\vdash\) \(B\). (Thus the \(\vdash\)-theses and antitheses represent the logical truths and logical falsehoods as seen from the perspective of \(\vdash\).) We are now in a position to give the key definition.

If \(\vdash\) is a consequence relation, then a function Pr mapping the language of \(\vdash\) to the real interval [0,1] is a \(\vdash\)-probability function if and only if the following conditions are satisfied:

(P0)

Pr(\(A\)) = 0 if \(A\) is a \(\vdash\)-antithesis.

(P1)

Pr(\(A\)) = 1 if \(A\) is a \(\vdash\)-thesis

(P2)

If \(A\) \(\vdash\) \(B\) then Pr(\(A\)\({\leq}\) Pr(\(B\))

(P3)

Pr(\(A\)) + Pr(\(B\)) = Pr(\(A\) \({\vee}\) \(B\)) + Pr(\(A\) \({\wedge}\) \(B\))

If \(\vdash\) is \(\vdash_{CL}\), then we call a \(\vdash\)-probability function a classical probability function; if \(\vdash\) is \(\vdash_{IL}\) we call a \(\vdash\)-probability function an intuitionistic probability function. The position described above as constructive Bayesianism would replace classical probability functions by intuitionistic probability functions as candidate representations of reasonable epistemic states. Note that classical probability functions in this sense are exactly those obeying the standard probability calculus axioms. In paricular, the familiar negation axiom dictating that Pr(\({\lnot}\)\(A\)) = 1 – Pr(\(A\)) emerges as a by-product of the interaction between the general (i.e., logic-independent) condition (P3) and, via (P0) and (P1), the logic-specific facts that \(A\) \({\wedge}\) \({\lnot}\)\(A\) is a \(\vdash_{CL}\)-antithesis and \(A\) \({\vee}\) \({\lnot}\)\(A\) is a \(\vdash_{CL}\)-thesis for any \(A\).

Although it is these two kinds – intuitionistic and classical – of probability functions we shall be dealing with specifically in what follows, we emphasize the generality of the above definition of a \(\vdash\)-probability function, and invite the reader to consider what effect further varying the choice of \(\vdash\) has on the behaviour of such functions. Our attention will be on the comparative merits of \(\vdash_{CL}\) and \(\vdash_{IL}\) in this regard. (It may have occurred to the reader in connection with (P3) above that we might naturally have considered a generalized version of (P3) for ‘countable additivity.’ Whether such a condition ought be adopted will turn on some rather difficult questions concerning the use of infinities in constructive reasoning; let us leave it as a question for further research. We have stated (P3) in its finitary form so as not to require that intuitionistic probability functions satisfy the more contentious general condition.)

In the following section we shall review some of the motivations for intuitionistic Bayesianism. The arguments are rather piecemeal; they are designed to show that given the philosophical commitments various writers in the field have expressed they would be better off taking this route, i.e., focussing on the class of intuitionistic probability functions, than – as many of them have suggested –abandoning Bayesianism in our broad sense. In particular, we shall urge that moves in the latter direction which involve abandoning (what we shall call) the Principle of Addition are seriously undermotivated.

One aspect of the Bayesian perspective which we have not considered concerns the dynamics rather than the statics of epistemic states: in particular the idea that changes in such states are governed for rational agents by the principle of conditionalizing on new information. This requires that we have a dyadic functor available for expressing conditional probabilities. Accordingly, where Pr is for some consequence relation \(\vdash\) a \(\vdash\)-probability function, we favour the standard account and take the associated conditional \(\vdash\)-probability function Pr( , ) to be given by Pr(\(A\),\(B\)) = Pr(\(A\) \({\wedge}\) \(B\))/Pr(\(B\)) when Pr(\(B\)) \({\neq}\) 0, with Pr(\(A\),\(B\)) undefined when Pr(\(B\)) = 0. The intention, of course, is that Pr(\(A\),\(B\)) represents the conditional probability of \(A\) given \(B\). We defer further consideration of conditional probability until the Appendix.

Motivating Intuitionistic Bayesianism

There are four main reasons for grounding preferring intuitionistic over classical probability functions as representing the range of reasonable epistemic states. These are: (1) a commitment to verificationism, (2) a commitment to anti-realism, (3) preservation of the principle of Addition, and (4) avoidance of direct arguments for the orthodox approach. Now some of these will be viewed by some people as bad reasons for adopting the given position, a reaction with which it is not hard to sympathise. In particular, the verificationist and anti-realist elements of the theory might well be viewed as negatives. These arguments are principally directed at showing that by their own lights, various opponents of classical Bayesianism would do better to adopt the intuitionistic Bayesian position than some still more heterodox non-Bayesian account.

2.1 A standard objection to classical Bayesianism is that it has no way of representing complete uncertainty. Because of the failures of Laplace’s principle of indifference, it can’t be said that uncertainty about \(p\) is best represented by assigning credence 1/2 to \(p\). Heterodox approaches usually allow the assignment of credence 0 to both \(p\) and \({\lnot}\)\(p\) when an agent has no evidence at all as to whether or not \(p\) is true. Because these approaches generally require an agent to assign credence 1 to classical tautologies, including \(p\) \({\vee}\) \({\lnot}\)\(p\), these theories must give up the following Principle of Addition.

Addition

For incompatible \(A\), \(B\): Bel(\(A\) \({\vee}\) \(B\)) = Bel(\(A\)) + Bel(\(B\)).

Bel(\(A\))” is here used to mean the degree of belief the agent has in \(A\), and “incompatible” to apply to \(A\) and \(B\) in which for some favoured consequence relation \(\vdash\), the conjunction of \(A\) with \(B\) is a \(\vdash\)-antithesis. Such conditions as Addition are of course taken not as descriptive theories about all agents, since irrational agents would serve as counterexamples. Rather, they are proposed coherence constraints on all rational agents.

The Principle of Addition is stated in terms of degrees of belief, or credences. Where no ambiguity results we also use the same term to refer to the corresponding principle applied to \(\vdash\)-probability functions, with incompatibility understood in terms of \(\vdash\) (as just explained). Now in some writings (particularly Shafer’s) the reason suggested for giving up Addition is openly verificationist. Shafer says that when an agent has no evidence for \(p\), they should assign degree of belief 0 to \(p\). Degrees of belief, under this approach, must be proportional to evidence.2 In recent philosophical literature, this kind of verificationism is often accompanied by an insistence that validity of arguments be judged by the lights of \(\vdash_{IL}\) rather than \(\vdash_{CL}\).

A similar line of thought is to be found in Harman (1983). He notes that when we don’t distinguish between the truth conditions for a sentence and its assertibility conditions, the appropriate logic is intuitionistic. And when we’re considering gambles, something like this is correct. When betting on \(p\) we don’t, in general, care if \(p\) is true as opposed to whether it will be discovered that \(p\) is true. A \(p\)-bet, where \(p\) asserts the occurrence of some event for instance, becomes a winning bet, not when that event occurs, but when \(p\) becomes assertible. So perhaps not just verificationists like Shafer, but all those who analyse degrees of belief as propensity to bet should adopt constructivist approaches to probability.

To see the point Harman is making, consider this example. We are invited to quote for \(p\)-bets and \({\lnot}\)\(p\)-bets, where \(p\) is O. J. Simpson murdered his wife. If we are to take the Californian legal system literally, the probability of that given the evidence is strictly between one-half and one. To avoid one objection, these bets don’t just pay $1 if the bettor guesses correctly. Rather they pay $1 invested at market rates of interest at the time the bet is placed. The idea is that if we pay x cents for the bet now, when it is discovered that we have bet correctly we will receive a sum of money that is worth exactly as much as $1 now. Still, we claim, it might be worthwhile to quote less than 50 cents for each of the bets. Even if we will receive $1 worth of reward if we wager correctly, there is every possibility that we’ll never find out. So it might be that placing a bet would be a losing play either way. To allow for this, the sum of our quotes for the \(p\)-bet and the \({\lnot}\)\(p\)-bet may be less than $1. As Harman points out, to reply by wielding a Dutch Book argument purporting to show that this betting practice is incoherent would be blatantly question-begging. That argument simply assumes that \(p\) \({\vee}\) \({\lnot}\)\(p\) is a logical truth, which is presumably part of what’s at issue. (In our terminology, this disjunction has the status of a \(\vdash_{CL}\)-thesis which is not a \(\vdash_{IL}\)-thesis.)

Harman’s point is not to argue for a intuitionistic approach to probability. Rather, he is arguing against using probabilistic semantics for propositional logic. Such an approach he claims would be bound to lead to intuitionistic logic for the reasons given above. He thinks that, since this would be an error, the move to probabilistic semantics is simply misguided. Whatever we think of this conclusion, we can press into service his arguments for intuitionistic Bayesianism.

2.2 The second argument for this approach turns on the anti-realism of some heterodox theorists. So George Shackle, for example, argues that if we are anti-realists about the future, we will assign positive probability to no future-directed proposition. The following summary is from a sympathetic interpreter of Shackle’s writing.

[T]here is every reason to refuse additivity: [it] implies that the certainty that would be assigned to the set of possibilities should be ‘distributed’ between different events. Now this set of events is undetermined as the future – that exists only in imagination – is. (Ponsonnet 1996, 171)

Shackle’s anti-realism is motivated by what most theorists would regard as a philosophical howler: he regards realism about the future as incompatible with human freedom, and holds that human beings are free. The second premise here seems harmless enough, but the first is notoriously difficult to motivate. Nevertheless, there are some better arguments than this for anti-realism about the future. If we adopt these, it isn’t clear why we should ‘assign certainty’ to the set of possibilities.

Shackle is here assuming that for any proposition \(p\), even a proposition about the future, \(p\) \({\vee}\) \({\lnot}\)\(p\) is now true, although neither disjunct is true. Given his interests it seems better to follow Dummett here and say that if we are anti-realists about a subject then for propositions \(p\) about that subject, \(p\) \({\vee}\) \({\lnot}\)\(p\) fails to be true. Hence we have no need to ‘assign certainty to the set of possibilities.’ Or perhaps more accurately, assigning certainty to the set of possibilities does not mean assigning probability 1 to \(p\) \({\vee}\) \({\lnot}\)\(p\); in particular, condition (P1) on \(\vdash\)-probability functions does not require this when we choose \(\vdash\) as \(\vdash_{IL}\).

2.3 The third motivation for adopting an intuitionistic approach to probability is that it allows us to retain the Kolmogorov axioms for probability, in particular the Principle of Addition. This principle has, to my mind at least, some intuitive motivation. And the counterexamples levelled against it by heterodox theorists seem rather weak from the intuitionistic Bayesian perspective. For they all are cases where we might feel it appropriate to assign a low probability to a proposition and its negation3. Hence if we are committed to saying Pr(\(A\) \({\vee}\) \({\lnot}\)\(A\)) = 1 for all \(A\), we must give up the Principle of Addition. But the intuitionistic Bayesian simply denies that in these cases Pr(\(A\) \({\vee}\) \({\lnot}\)\(A\)) = 1, so no counterexample to Addition arises. This denial is compatible with condition (P1) on Pr’s being a \(\vdash_{IL}\)-probability function since, as already noted, \(A\) \({\vee}\) \({\lnot}\)\(A\) is not in general a \(\vdash_{IL}\)-thesis.

2.4 The final argument for taking an intuitionistic approach is that it provides a justification for rejecting the positive arguments for classical Bayesianism. These provide a justification for requiring coherent degrees of belief to be representable by the classical probability calculus. There are a dizzying variety of such arguments which link probabilistic epistemology to decision theory, including: the traditional Dutch Book arguments found in Ramsey (1926), Teller (1973) and Lewis (1999); de-pragmatized Dutch Book arguments which rely on consistency of valuations, rather than avoiding actual losses, as in Howson and Urbach (1989), Christensen (1996) and Hellman (1997); and arguments from the plausibility of decision theoretic constraints to constraints on partial beliefs, as in Savage (1954), Maher (1993) and Kaplan (1996). As well as these, there are arguments for classical Bayesianism which do not rely on decision theory in any way, but which flow either directly from the definitions of degrees of belief, or from broader epistemological considerations. A summary of traditional arguments of this kind is in Paris (1994). Joyce (1998) provides an interesting modern variation on this theme.

All such arguments assume classical – rather than, say, intuitionistic – reasoning is appropriate. The intuitionist has a simple and principled reason for rejecting those arguments. The theorist who endorses \(\vdash_{CL}\) when considering questions of inference, presumably lacks any such simple reason. And they need one, unless they think it appropriate to endorse one position knowing there is an unrefuted argument for an incompatible viewpoint.

We are not insisting that non-Bayesians will be unable to refute these arguments while holding on to \(\vdash_{CL}\). We are merely suggesting that the task will be Herculean. A start on this project is made by Shafer (1981), which suggests some reasons for breaking the link between probabilistic epistemology and decision theory. Even if these responses are successful, such a response is completely ineffective against arguments which do not exploit such a link. As we think these are the strongest arguments for classical Bayesianism, non-Baeyesians have much work left to do. And it is possible that this task cannot be completed. That is, it is possible that the only questionable step in some of these arguments for classical Bayesianism is their use of non-constructive reasoning. If this is so only theorists who give up \(\vdash_{CL}\) can respond to such arguments.

In sum, non-Bayesians need to be able to respond to the wide variety of arguments for Bayesianism. Non-Bayesians who hold on to \(\vdash_{CL}\) must do so without questioning the implicit logical assumptions of such arguments. Given this restriction, producing these responses will be a slow, time-consuming task, the responses will in all likelihood be piecemeal, providing little sense of the underlying flaw of the arguments, and for some arguments it is possible that no effective response can be made. Intuitionistic Bayesians have a quick, systematic and, we think, effective response to all these arguments.

More on Intuitionistic Probability Functions

Having explained the motivation for intuitionistic Bayesianism, let us turn our attention in greater detail to its main source of novelty: the intuitionistic probability functions. We concentrate on logical matters here, in the following section justifying the singling out of this class of probability functions by showing that an epistemic state represented by Bel is invulnerable to a kind of Dutch Book if and only if Bel is an intuitionistic probability function.

For the case of specifically classical probability functions, the conditions (P0)–(P4) of Section 1 involve substantial redundancy. For example, we could replace (P2) and (P3) by – what would in isolation be weaker conditions – (P2\(^\prime\)) and (P3\(^\prime\)).

(P2\(^\prime\))

If \(A\) \(\dashv\) \(\vdash\) \(B\) then Pr(\(A\)) = Pr(\(B\))

(P3\(^\prime\))

If \(\vdash\) \({\lnot}\)(A \({\wedge}\) B) then Pr(\(A\) \({\vee}\) \(B\)) = Pr(\(A\)) + Pr(\(B\))

However, in the general case of arbitrary \(\vdash\)-probability functions (or rather: those for which \({\lnot}\) is amongst the connectives of the language of \(\vdash\)), such a replacement would result in a genuine weakening, as we may see from a consideration of the class of \(\vdash_{IL}\)-probability functions. While both (P2\(^\prime\)) and (P3\(^\prime\)) are satisfied for \(\vdash\) as \(\vdash_{IL}\), the class of functions Pr satisfying (P0), (P1), (P2\(^\prime\)) and (P3\(^\prime\)) is broader (for this choice of \(\vdash\)) than the class of intuitionistic probability functions. To see this, first note that the function P, defined immediately below, satisfies (P0), (P1), (P2) and (P3\(^\prime\)), but not (P3).

\[P(A) = \begin{cases} 1 \text{ if } p \vee q~ \vdash_{IL}~ A \\ 0 \text{ otherwise} \end{cases}\]

(Here \(p\) and q are a pair of atomic sentences.) To see that (P3\(^\prime\)) is satisfied, assume P(\(A\) \({\vee}\) \(B\)) = 1 and \(\vdash_{IL}\) \({\lnot}\)(\({\wedge}\) \(B\)). Then \(p\) \({\vee}\) q \(\vdash_{IL}\) \(A\) \({\vee}\) \(B\), and \(B\) \(\vdash_{IL}\) \({\lnot}\)\(A\). Hence \(p\) \({\vee}\) q \(\vdash_{IL}\) \(A\) \({\vee}\) \({\lnot}\)\(A\), but this only holds if either (1) \(p\) \({\vee}\) q \(\vdash_{IL}\) \(A\) or (2) \(p\) \({\vee}\) q \(\vdash_{IL}\) \({\lnot}\)\(A\). (For if \(p\) \({\vee}\) q \(\vdash_{IL}\) \(A\) \({\vee}\) \({\lnot}\)\(A\), then \(p\) \(\vdash_{IL}\) \(A\) \({\vee}\) \({\lnot}\)\(A\) and q \(\vdash_{IL}\) \(A\) \({\vee}\) \({\lnot}\)\(A\), whence by a generalization, due to Harrop, of the Disjunction Property for intuitionistic logic, either \(p\) \(\vdash_{IL}\) \(A\) or \(p\) \(\vdash_{IL}\) \({\lnot}\)\(A\) and similarly either q \(\vdash_{IL}\) \(A\) or q \(\vdash_{IL}\) \({\lnot}\)\(A\). Thus one of the following four combinations obtains: (a) \(p\) \(\vdash_{IL}\) A and q \(\vdash_{IL}\) \(A\), (b) \(p\) \(\vdash_{IL}\) \(A\) and q \(\vdash_{IL}\) \({\lnot}\)\(A\), (c) \(p\) \(\vdash_{IL}\) \({\lnot}\)\(A\) and q \(\vdash_{IL}\) \(A\), (d) \(p\) \(\vdash_{IL}\) \({\lnot}\)\(A\) and q \(\vdash_{IL}\) \({\lnot}\)\(A\). But cases (b) and (c) can be ruled out since they would make \(p\) and q \(\vdash_{IL}\)-incompatible, contradicting their status as atomic sentences, and from (a) and (d), (1) and (2) follow respectively.) If (1) first holds then P(\(A\)) = 1, as required. If (2) holds then \(p\) \({\vee}\) q \(\vdash_{IL}\) (\(A\) \({\vee}\) \(B\)\({\wedge}\) \({\lnot}\)\(A\) and (\(A\) \({\vee}\) \(B\)\({\wedge}\) \({\lnot}\)\(A\) \(\vdash_{IL}\) \(B\), so P(\(B\)) = 1. The other cases are trivial to verify and are left to the reader.

To see (P2) is needed (for the current choice of \(\vdash\)), as opposed to just (P2\(^\prime\)), consider the following Kripke tree.

We introduce a “weighting” function w by setting w(1) = 0.2, w(2) = 0.3, w(3) = -0.1 and w(4) = 0.6. For any \(A\), let P(\(A\)) = \({\Sigma}\)w(i), where the summation is across all points i that force \(A\). So P(\(p\)) = 0.6 and P(\({\lnot}{\lnot}\)\(p\)) = 0.5, contradicting (P2). But (P0), (P1), (P2\(^\prime\)) and (P3) are all satisfied, showing that (P2) is in the general case not derivable from these three conditions.

Bets and Intuitionistic Probability Functions

Say that an \(A\)-bet is a bet that pays $1 if \(A\) and nothing otherwise. These will sometimes be called bets on \(A\). In this theory, as in real life, it is possible that neither \(A\)-bets nor \({\lnot}\)\(A\)-bets will ever be collected, so holding an \(A\)-bet and a \({\lnot}\)\(A\)-bet is not necessarily as good as holding $1. An \(A\)-bet becomes a winning bet, i.e. worth $1, just when it becomes known that \(A\). We will assume that bookmakers and punters are both logically proficient and honest, so that when a \(B\)-bet becomes a winning bet and \(B\) \(\vdash_{IL}\) \(A\), then an \(A\)-bet is a winning bet. The picture underlying this story is the Kripke tree semantics for intuitionistic logic. Bettors are thought of as being at some node of a Kripke tree, an \(A\)-bet wins at that stage iff \(A\) is forced by that node. Bettors do not know that any future nodes will be reached, so they cannot be confident that all bets on classical tautologies (\(\vdash_{CL}\)-theses) will be winning. And more importantly, we take it that an (\(A\) \({\vee}\) \(B\))-bet wins if and only if an \(A\)-bet wins or a \(B\)-bet wins. Again this mirrors the fact that \(A\) \({\vee}\) \(B\) is forced at a node iff \(A\) is forced or \(B\) is forced.

Finally, to get the Dutch Book style argument going, assume that for any sequence of bets on \(A\)1, \(A\)2, ..., \(A\)k, the bettor values the sequence at $(Bel(\(A\)1) + Bel(\(A\)2) + ... + Bel(\(A\)k)). This is obviously unrealistic and economically suspect4, but is perhaps a useful analogy. Then Bel leads to coherent valuations in all circumstances iff Bel is a intuitionistic probability function. That is, if Bel is not an intuitionistic probability function (henceforth: IPF) then there will be two finite sequences of bets S1 and S2 such that S1 is guaranteed to pay at least as much as S2 in all circumstances, but S2 is given higher value by the agent. For simplicity Bel will be called incoherent if this happens, and coherent otherwise. If Bel is an IPF there are no two such sequences, so it is coherent.

If Bel is not an IPF then we just need to look at which axiom is breached in order to construct the sequences. For example, if (P3) is breached then let the sequences be \(\langle\)\(A\), \(B\)\(\rangle\) and \(\langle\)\(A\) \({\vee}\) \(B\)\({\wedge}\) \(B\)\(\rangle\). The same number of propositions from each sequence are forced at every node of every Kripke tree, so the coherence requirement is that the two sequences receive the same value. But ex hypothesi they do not, so Bel is incoherent. Similar proofs suffice for the remaining axioms (the remaining conditions on \(\vdash\)-probability functions, that is, as they apply in the special case of \(\vdash\) = \(\vdash_{IL}\)).

To show that if Bel is an IPF it is coherent, we need some more notation. Let \(\langle\)\(A\)1, ..., \(A\)k\(\rangle\) be a sequence of propositions. Then say cn, k is the proposition true iff at least n of these are true. So c2,3 is the proposition (\(A\)1 \({\wedge}\) \(A\)2) \({\vee}\) (\(A\)1 \({\wedge}\) \(A\)3\({\vee}\) (\(A\)2 \({\wedge}\) \(A\)3). Assuming Bel is a IPF, we prove the following lemma holds for all k:

The proof is by induction on k. For k=1 and k=2, the proof is given by the axioms. So it remains only to complete the inductive step. For ease of reading in the proof we write \(A\) for Bel(\(A\)) where no ambiguity would result.

By the inductive hypothesis we have:

Since \(c_{i,k} \vee A_{k+1}\) \(\dashv\) \(\vdash\) \(c_{i,k+1} \vee A_{k+1}\) and \(c_{i,k} \wedge A_{k+1}\) \(\dashv\) \(\vdash\) \(c_{i+1,k+1} \wedge A_{k+1}\) we have:

Now, \(c_{1,k+1} \vee~ A_{k+1}\) \(\dashv\) \(\vdash\) \(c_{i,k+1}\) and \(c_{k+1,k+1} ~\wedge~ A_{k+1}\) \(\dashv\) \(\vdash\) \(c_{k+1,k+1}\) from the definitions of \(c\). So substituting in these equivalences and slightly renumbering, we get:

Regrouping the last two summations and applying (P3),

And cancelling out the second term on each side gives us the result we want. From this it follows immediately that Bel is coherent. Let S1 and S2 be any two sequences such that S1 is guaranteed to pay as much as S2. That is, that S2 pays $n entails S1 pays at least $n for all n. Now the lemma shows that for each sequence of bets, their value equals the sum of the probability that they’ll pay at least n for all values of n, up to the length of the sequence. So by as many appeals to (P2) as there are bets in S1, we have that the value of S2 is less than or equal to the value of S1, as required.

Given the well-known problems with Dutch Book arguments5, it might be wondered if we can give a different justification for the axioms. Indeed it may be considered helpful to have a semantics for the logic which does not refer to betting practices. One possibility is to say that IPFs are normalised measures on Kripke trees. The idea is that the probability of a proposition is the measure of the set of points at which the proposition is forced. It is straightforward to give a non-constructive proof that the axioms are sound with respect to these semantics, but making this proof constructive and providing any proof that the axioms are complete is a harder task. So for now this Dutch Book justification for the axioms is the best available.

Appendix: The Morgan–Leblanc–Mares Calculus

In a series of papers (Morgan and LeBlanc (1983a, 1983b), Morgan and Mares (1995)) an approach to probability grounded in intuitionistic logic has been developed. The motivation is as follows. A machine contains an unknown set of propositions S, which need not be consistent. Pr(\(A\), \(B\)) is the maximal price we’d pay for a bet that S and \(B\) intuitionistically entail A (SA \(\vdash_{IL}\) B, that is). By standard Dutch Book arguments, we obtain axioms for a probability calculus which has some claim to being constructivist. The point of this section is to register the shortcomings of this approach as a theory of uncertain reasoning from evidence – to point out, that is, the implausibility of interpreting the axioms they derive as normative constraints on degrees of belief. (It should be noted from the start that this was not the advertised purpose of their theory, and at least one of the authors (Mares) has said (p.c.) that the primary purpose of constructing these theories was to generalise of the triviality results proved in Lewis (1976). So the purpose of this appendix may be to argue for something that isn’t in dispute: that these theories can’t be pushed into double duty as theories of reasoning under uncertainty.)

The axiomatisations given in the Morgan and Leblanc papers differs a little from that given in the Morgan and Mares paper, but the criticisms levelled here apply to their common elements. In particular, the following four axioms are in both sets.

(C1)

0 \({\leq}\) Pr(\(A\), \(B\)\({\leq}\) 1

(C2)

Pr(\(A\), \(A\) \({\wedge}\) \(B\)) = 1

(C3)

Pr(\(A\), \(B\) \({\wedge}\) CPr(\(B\), C) = Pr(\(B\), \(A\) \({\wedge}\) CPr(\(A\), C)

(C4)

Pr(\(A\) \({\supset}\) \(B\), C) = Pr(\(B\), \(A\) \({\wedge}\) C)

These four are enough to get both the unwanted consequences. In particular, from these we get the ‘no negative evidence’ rule: Pr(\(A\), \(B\) \({\wedge}\) C\({\geq}\) Pr(\(A\), \(B\)). The proof is in Morgan and Mares (1995) Now given the semantic interpretation they have adopted, this is perhaps not so bad. After all, if we can prove \(A\) from \(B\) and S, we can certainly prove it from \(B\) \({\wedge}\) C and S, but the converse does not hold. However from our perspective this feature seems a little implausible. In particular, if C is \({\lnot}\)\(A\), it seems we should have Pr(\(A\), \(B\) \({\wedge}\) \({\lnot}\)\(A\)) = 0 unless \(B\) \(\vdash_{IL}\) \(A\), in which case Pr(\(A\)\(B\) \({\wedge}\) \({\lnot}\)\(A\)) is undefined.

It shouldn’t be that surprising that we get odd results given (C4). Lewis (1976) shows that adopting it for a (primitive or defined) connective ‘\({\rightarrow}\)’ within the classical probability calculus leads to triviality. And neither the arguments he uses there nor the arguments for some stronger conclusions in Lewis (1999) rely heavily on classical principles. The papers by Morgan and Leblanc don’t discuss this threat, but it is taken discussed in detail in Morgan and Mares (1995). Morgan and Mares note that it’s possible to build a theory based on (C1) to (C4) that isn’t trivial in the sense Lewis described. But these theories still have enough surprising features that they aren’t suitable for use as a theory of reasoning under uncertainty.

In intuitionistic logic we often take the falsum \({\perp}\) as a primitive connective, functioning as a \(\vdash_{IL}\)-antithesis. Hence a set S is intuitionistically consistent iff we do not have S \(\vdash_{IL}\) \({\perp}\). Now the following seems a plausible condition:

(C\({\perp}\))

For consistent \(B\), Pr(\({\perp}\), \(B\)) = 0.

Given consistent evidence, we have no evidence at all that the falsum is true. Hence we should set the probability of the falsum to 0 (as required by our condition (P0) from Section 1). Given Morgan and Leblanc’s original semantic interpretation there is less motivation for adopting (C\({\perp}\)), since S might be inconsistent. The restriction to consistent \(B\) in (C\({\perp}\)) is imposed because we take Pr(\(A\), \(B\)) to be undefined for inconsistent \(B\), as explained at the end of Section 1. (In more detail: if \(B\) is a \(\vdash_{IL}\)-antithesis then Pr(\(B\)) = 0 for any intuitionistic probability function Pr, whence the undefinedness of Pr(\(A\), \(B\)) by the remarks at the end of that section.) Morgan, Leblanc and Mares take it to be set at 1. The choice here is a little arbitrary, the only decisive factor being apparently the easier statement of certain results. Now if we take the falsum as a primitive the next move is usually to introduce \({\lnot}\) as a defined connective, as follows.

\({\lnot}\)\(A\) =df \(A\) \({\supset}\) \({\perp}\)

Assuming \(A\) \({\wedge}\) B is consistent, it follows from (C4) and (C\({\perp}\)) that Pr(\({\lnot}\)\(A\), \(B\)) = 0. Again, from our perspective this is an implausible result. The main purpose of this appendix has been to show that the Morgan–Leblanc–Mares probability calculus cannot do the work Bayesians want a probability calculus to do. That is, it is implausible to regard their Pr(\(A\), \(B\)) as the reasonable degree of belief in \(A\) given \(B\). Hence the account of conditional probability these authors offer diverges from the intuitionistic Bayesianism that we have been urging heterodox theorists to endorse.

Christensen, David. 1996. “Dutch-Book Arguments de-Pragmatized: Epistemic Consistency for Partial Believers.” Journal of Philosophy 93 (9): 450–79. https://doi.org/10.2307/2940893.
Cohen, L. Jonathan. 1977. The Probable and the Provable. Oxford: Clarendon Press.
Dempster, Arthur. 1967. “Upper and Lower Probabilities Induced by a Multi-Valued Mapping.” Annals of Mathematical Statistics 38: 325–39. https://doi.org/10.1214/aoms/1177698950.
Harman, Gilbert. 1983. “Problems with Probabilistic Semantics.” In Developments in Semantics, edited by Alex Orenstein and Rafael Stern, 243–37. New York: Haven.
Hellman, Geoffery. 1997. “Bayes and Beyond.” Philosophy of Science 64 (2): 191–221. https://doi.org/10.1086/392548.
Howson, Colin, and Peter Urbach. 1989. Scientific Reasoning. La Salle: Open Court.
Joyce, James M. 1998. “A Non-Pragmatic Vindication of Probabilism.” Philosophy of Science 65 (4): 575–603. https://doi.org/10.1086/392661.
Kaplan, Mark. 1996. Decision Theory as Philosophy. Cambridge: Cambridge University Press.
Lewis, David. 1976. “Probabilities of Conditionals and Conditional Probabilities.” Philosophical Review 85 (3): 297–315. https://doi.org/10.2307/2184045.
———. 1999. “Why Conditionalize?” In Papers in Metaphysics and Epistemology, 403–7. Cambridge University Press.
Maher, Patrick. 1993. Betting on Theories. Cambridge: Cambridge University Press.
Morgan, Charles, and Hughes LeBlanc. 1983a. “Probabilistic Semantics for Intuitionistic Logic.” Notre Dame Journal of Formal Logic 24 (2): 161–80. https://doi.org/10.1305/ndjfl/1093870307.
———. 1983b. “Probability Theory, Intuitionism, Semantics and the Dutch Book Argument.” Notre Dame Journal of Formal Logic 24 (3): 289–304. https://doi.org/10.1305/ndjfl/1093870372.
Morgan, Charles, and Edward Mares. 1995. “Conditionals, Probability and Non-Triviality.” Journal of Philosophical Logic 24 (5): 455–67. https://doi.org/10.1007/bf01052599.
Paris, J. B. 1994. The Uncertain Reasoner’s Companion: A Mathematical Perspective. Cambridge: Cambridge University Press.
Ponsonnet, Jean-Marc. 1996. “The Best and the Worst in g. L. S. Shackle’s Decision Theory.” In Uncertainty in Economic Thought, edited by Christian Schmidt, 169–96. Cheltham: Edward Elgar.
Ramsey, Frank. 1926. “Truth and Probability.” In Philosophical Papers, edited by D. H. Mellor, 52–94. Cambridge: Cambridge University Press.
Savage, Leonard. 1954. The Foundations of Statistics. New York: John Wiley.
Schmeidler, David. 1989. “Subjective Probability and Expected Utility Without Additivity.” Econometrica 57 (3): 571–89. https://doi.org/10.2307/1911053.
Shackle, George. 1949. Expectation in Economics. Cambridge: Cambridge University Press.
Shafer, Glenn. 1976. A Mathematical Theory of Evidence. Princeton: Princeton University Press.
———. 1981. “Constructive Probability.” Synthese 48 (1): 1–60. https://doi.org/10.1007/bf01064627.
Teller, Paul. 1973. “Conditionalization and Observation.” Synthese 26 (2): 218–58. https://doi.org/10.1007/bf00873264.
Zadeh, Lofti A. 1978. “Fuzzy Sets as a Basis for a Theory of Probability.” Fuzzy Sets and Systems 1 (1): 3–28. https://doi.org/10.1016/0165-0114(78)90029-5.

  1. For more details, see Zadeh (1978), Dempster (1967), Shafer (1976), Shackle (1949), Cohen (1977), Schmeidler (1989).↩︎

  2. This assumption was shared by many of the participants in the symposium on probability in legal reasoning, reported in the Boston University Law Review 66 (1986).↩︎

  3. Again the discussion in (Shafer 1976 ch. 2) is the most obvious example of this, but similar examples abound in the literature.↩︎

  4. It is economically suspect because, in simplified terms, Bel(\(A\)) gives at best the use-value of an \(A\)-bet, but this is distinct from the exchange-value the agent places on the bet. And it is the exchange-value that determines her patterns of buying and selling.↩︎

  5. See Maher (1993) for criticisms of the most recent attempts at successful Dutch Book arguments and references to criticisms of earlier attempts.

    ↩︎

References