How the Equivalent Bet Test Actually Works
Rationalists often talk about betting as a way of arriving at more accurate beliefs. The idea is that, in making a bet, you are financially incentivised to think carefully about how much credence you give the proposition that the bet operationalises.
In a similar vein, Julia Galef writes about the equivalent bet test, a technique adapted from Douglas Hubbard for quantifying uncertainty. It works like this. Say you want to figure out, quantitatively, how sure you are that P is true. You ask yourself, “If P is true, I win ten thousand euro; or if I roll under four on 1d6, I win ten thousand euro – which of these two bets do I prefer?” (Rolling under four means rolling a one, two or three, so the dice bet pays out with probability 3/6, or 50%.) If you prefer the first one – getting ten thousand euro if P is true – then your credence in P is higher than the die's chances, so ask the same question again with more favourable chances on the dice roll, like rolling under five or six. If you prefer the second one – getting ten thousand euro if you roll under four on 1d6 – then ask again with less favourable chances on the dice roll, like rolling under three or two. If you can’t decide – if the bets seem equivalent to you – then the die's chance of winning is roughly the probability you assign to P: your credence.
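To make the procedure concrete, here is a minimal Python sketch of the loop described above. It is an illustration under stated assumptions, not Galef's or Hubbard's own formulation: the function name `equivalent_bet_test`, the answer labels and the `simulated_bettor` helper are all hypothetical, and the `prefer` callback stands in for what is really a gut reaction to the two bets.

```python
from fractions import Fraction
from typing import Callable, Tuple

# Hypothetical answer labels; in real use the "callback" is your own gut feeling.
PROPOSITION, DIE, INDIFFERENT = "proposition", "die", "indifferent"


def equivalent_bet_test(
    prefer: Callable[[Fraction], str],
    sides: int = 6,
    start_under: int = 4,
) -> Tuple[Fraction, Fraction]:
    """Return (low, high) bounds on the credence in P.

    `prefer(p_die)` chooses between "win if P is true" and "win if a die
    bet that pays out with probability p_die comes up".
    """
    low, high = Fraction(0), Fraction(1)
    under = start_under  # the die bet is "roll under `under` on 1d`sides`"
    while 2 <= under <= sides:
        p_die = Fraction(under - 1, sides)  # rolling under 4 on 1d6 -> 3/6
        choice = prefer(p_die)
        if choice == INDIFFERENT:
            return p_die, p_die  # the bets feel equivalent: credence found
        if choice == PROPOSITION:
            low = max(low, p_die)  # credence is above the die's chances
            under += 1             # make the die bet more favourable
        elif choice == DIE:
            high = min(high, p_die)  # credence is below the die's chances
            under -= 1               # make the die bet less favourable
        else:
            raise ValueError(f"unexpected answer: {choice!r}")
        # Stop once the next die bet would no longer narrow the bracket.
        if not low < Fraction(under - 1, sides) < high:
            break
    return low, high


def simulated_bettor(credence: Fraction) -> Callable[[Fraction], str]:
    """Assumption for the demo: a bettor whose credence is already fixed."""
    def prefer(p_die: Fraction) -> str:
        if credence > p_die:
            return PROPOSITION
        if credence < p_die:
            return DIE
        return INDIFFERENT
    return prefer


print(equivalent_bet_test(simulated_bettor(Fraction(7, 10))))
```

With a six-sided die the test can only resolve credence to the nearest sixth, so a simulated credence of 7/10 never reaches indifference and ends up bracketed between 2/3 and 5/6. Note that the simulated bettor already "knows" its credence; the sketch shows only the search over die bets, not where the preference comes from.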
This technique has worked pretty well for me. But it’s weird. Say I am trying to determine how much credence I give the following proposition:
Russia will invade Ukrainian territory before 2023.
What I am doing with Hubbard’s technique is replacing that decision with another decision, or rather iterating on a series of other decisions:
Choose “I get ten thousand euro if Russia invades” or “I get ten thousand euro if I roll under X on 1dN”.
Again, only when I find that it is all but impossible to decide this have I made my decision about how much credence I give the original proposition.
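The series of decisions need not move one die face at a time. With a finer die (a larger N), halving the remaining range at each question narrows things down in roughly log2(N) questions. The sketch below is one possible way to organise that iteration, not something Hubbard or Galef prescribes; `bisect_credence`, the answer labels and `simulated_bettor` are hypothetical names, and the callback again stands in for an intuitive preference.

```python
from fractions import Fraction
from typing import Callable, Tuple

# Hypothetical answer labels, as an assumption for this sketch.
PROPOSITION, DIE, INDIFFERENT = "proposition", "die", "indifferent"


def bisect_credence(
    prefer: Callable[[Fraction], str],
    sides: int = 100,
) -> Tuple[Fraction, Fraction]:
    """Bracket the credence by bisecting over the faces of a 1dN die."""
    lo, hi = 0, sides  # credence lies between lo/sides and hi/sides
    while hi - lo > 1:
        mid = (lo + hi) // 2            # die bet: "roll under mid + 1"
        p_die = Fraction(mid, sides)    # which wins on `mid` of the faces
        choice = prefer(p_die)
        if choice == INDIFFERENT:
            return p_die, p_die
        if choice == PROPOSITION:
            lo = mid  # credence is above the die's chances
        elif choice == DIE:
            hi = mid  # credence is below the die's chances
        else:
            raise ValueError(f"unexpected answer: {choice!r}")
    return Fraction(lo, sides), Fraction(hi, sides)


def simulated_bettor(credence: Fraction) -> Callable[[Fraction], str]:
    """Assumption for the demo: a bettor whose credence is already fixed."""
    def prefer(p_die: Fraction) -> str:
        if credence > p_die:
            return PROPOSITION
        if credence < p_die:
            return DIE
        return INDIFFERENT
    return prefer


low, high = bisect_credence(simulated_bettor(Fraction(29, 40)))
print(float(low), float(high))
```

In practice the bracket stops shrinking long before the die runs out of faces, because at some point the two bets simply feel the same; that point of indifference is the answer the test is after.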
Here is the weird thing. When I decide whether to believe a thing, I am making a judgment. My goal in making this judgment is to have true beliefs, to have beliefs that properly describe the territory. (This is not necessarily always the goal, but it is often enough. Humour me and suppose that it is my goal in this case at least.) When making judgments, we use evidence. To determine my credence in the Russian invasion example, I may search for evidence about base rates, recent developments along the border, Russian and Ukrainian military capacity, domestic and international politics and so on. But the equivalent bet test does not add any new evidence. So why does it seem to work?
Maybe it encourages me to search for additional evidence? But no, it doesn’t seem like it does, at least not for me.
Maybe it changes the way I use the evidence in evaluating the two possibilities (“Russia invades” and “Russia doesn’t invade”)? But no, it doesn’t do this either. It’s more intuitive than that. I rarely think about the evidence when I do it; I usually only think about the intuitive appeal of the two bets.
Maybe it changes my goal? Maybe now, instead of wanting to have true beliefs, I want to get rich. Maybe this makes me more honest, because there is more at stake. As Scott Alexander put it:
You know where you stand with greed. You never wonder if greed has an ulterior motive, because it’s already the most ulterior motive there is. Greed feels no temptation to corruption, because the thing it would do if it were corrupt is precisely what it’s doing anyway.
That still does not seem like the main thing. For one, even if my ultimate (if pretend) goal is now to get rich, having true beliefs is still a subgoal that is both necessary and sufficient for completing the main goal. True, higher stakes can make a difference. If I walked to the post office to post a letter simply because I wanted to post it, and then again because doing so would earn me ten thousand euro, I can imagine that in the latter case I would be more careful and diligent in bringing the letter to the post office and making sure it got sent, precisely because there is more at stake. But if the higher stakes in the belief-determining situation neither compel me to search for more evidence nor change how I use the evidence to evaluate the possibilities, then they do not seem to make me more careful or diligent in determining how much credence I give “Russia will invade”.
But now it strikes me that there is something else happening here. There is a step after determining the goal, after searching for evidence, after evaluating that evidence: the step where I take my assessment (which at this point is little more than a vague feeling) and turn it into a number. Putting a numerical value on a credence is like what a painter does when, having conceived the idea for her artwork, having made sketches and studies in preparation for it, and having painstakingly brought it into existence, she stands before her completed work and searches for a suitable title. What kept me from seeing this was that I did not properly distinguish between deciding how much credence to give and describing how much credence I do give.
This additional step is a process similar to that of determining my credence in the first place. Again I am making a judgment. The goal is to translate my assessment into a numerical value, a probability. In doing so, I search for possibilities (“10%” or “30%”, say) and evaluate those possibilities (“10% seems too low” or “30% seems to fit how I think about it”). Now it is easy to see how the equivalent bet test works. It is used in coming to a decision at this second step. It works by making me search systematically for possibilities and evaluate them in a consistent (if subjective) way.
This is why the equivalent bet test is a way of, as Galef puts it, “[pinning] down how sure you are […], helping you put a number on your degree of confidence”. In other words, running an equivalent bet test is never enough on its own if one wants to have true beliefs, because the test only helps us quantify our credence, never determine it.
Galef, J. (2021). The Scout Mindset: Why Some People See Things Clearly and Others Don’t. Penguin.
Or any other die. Hubbard imagines spinning a Wheel of Fortune-style dial; Galef imagines pulling black and white marbles from a bag.
I don’t have actual evidence of this. The best way of testing these things would be to make falsifiable predictions on platforms like Metaculus, which I do only occasionally. The equivalent bet test usually allows me to reach a decision that, after the fact, seems more correct to me than my previous view. Perhaps using this as evidence for the test’s usefulness amounts to a tautology.
This sentence and the sentences that follow draw from Jonathan Baron’s search-inference framework.
Baron, J. (2000). Thinking and Deciding. Cambridge University Press.
In this sentence and the subsequent sentences, I am describing my own experience. It could of course be that I am doing the test suboptimally, and that others do it in a way that does encourage them to bring in new evidence or to evaluate the evidence differently. That said, Galef seems to describe a process more like mine. She writes, for example, “I hesitate for a moment, but I feel happier with the [dice] bet”, and, “This time, I notice that I prefer my chances on self-driving cars”, and, “Hmm, I’m really torn. Neither seems like a clearly better bet.”