# The Two Envelope paradox

*693*pages on

this wiki

**Two Envelope Paradox Solution**

**The Paradox**

There are two envelopes containing money. One has twice as much money as the other. You choose an envelope, open it, and can keep it, or swap it for the other. Should you keep it or switch?

Method 1. Let the envelopes contain x and 2x. If you find x in your envelope then by switching to the other envelope you gain x. If you find 2x then switching loses x. The net gain is zero. Therefore switching is of no benefit.

Method 2. If your envelope contains x then the other one must have 2x or x/2. Switching means either a gain of x or a loss of x/2. Since these are equally probable you should switch.

For a discussion of the problem and a review of the literature see Discussion.

**Synopsis**

Part 1 demonstrates that 2x and x/2 are not equiprobable, though this is not enough to resolve the paradox. Part 2 revives the paradox by presenting a specific distribution that appears to give positive gain for switching for all x. This gain calculation is shown to be invalid. Part 3 generalises this result to show that switching is of no benefit in the general case, regardless of whether an envelope is opened or not. The Discussion that follows shows what happens in the finite case and attempts to explain the paradox intuitively. Essentially, the two envelope paradox is a zero-sum game in which the choice of the order of summation can give a false positive value for the gain.

**Part 1: The original paradox**** undermined**

Surprisingly, it is not true that all numbers can occur with equal probability in the envelopes. If it were so, and the probability of any number x was P( x ) = b, a constant value, then because there are infinitely many real numbers, the sum of all the probabilities would be b times infinity, whereas probabilities must add up to 1. So we know that the probability of large values must fall off. More precisely, P( x ) must approach 0 as x approaches infinity. This undermines the switching argument, which posited that the probability of 2x in the other envelope is the same as that of x/2.

You may find this explanation unsatisfactory. The paradox arises because we have the notion that the chain of pairs of envelopes can stretch forever, with each succeeding pair having equal probability but with double the values. In fact, either the chain must be finite or else the probabilities must diminish.

Does this dispose of the two envelope paradox? It doesn't, because the paradox can be resuscitated, as shown below.

**Part 2: A special
case of the paradox**

Consider
the distribution of envelopes containing the pairs ( 1, 2 ), ( 2, 4 ), ( 4, 8
), ( 8, 16 ) and so on, where each such pair ( 2^{n}, 2^{n+1} )
occurs with probability 2^{n}/3^{n+1} for integer n >= 0. It
is not hard to show that the probability sum, 1/3 + 2/9 + 4/27 + 8/81 + ...
converges to a finite value (actually 1), so there is no problem with infinite
probabilities. (Note also that large numbers occur less often than small ones.)
The surprising feature of this distribution of envelopes is that the gain appears
to be positive for **every** value of n. We use method 2 to calculate the
net gain in switching from the value 2^{n} to 2^{n+1 }in the
pair ( 2^{n}, 2^{n+1} ) minus the loss in switching from 2^{n}
to 2^{n-1} in ( 2^{n-1}, 2^{n} ).

Since there is a 50% chance of picking either of the two values in each pair, each term begins with ½. The next part is the probability of this pair of envelopes being chosen and the third part is the gain of switching. The net gain is the sum of the two values.

1/2(
2^{n}/3^{n+1} )( 2^{n+1} - 2^{n} ) + 1/2( 2^{n-1}/3^{n}
)( 2^{n-1} - 2^{n} )

= 1/2( 2^{n}/3^{n+1} )( 2^{n} ) + 1/2( 2^{n-1}/3^{n}
)( -2^{n-1} )

= 2^{2n-1}/3^{n+1} - 2^{2n-3}/3^{n}

= ( 2^{2n-3}/3^{n} )( 4/3 - 1 )

= 2^{2n-3}/3^{n+1} for
n > 0 (G)

Because this is positive for every positive value of n, we are back with the original paradox. For n = 0 the gain is 1/6, but this case has to be treated separately since ‘1’ only appears in the pair ( 1, 2 ). It seems we should switch even without knowing the value of the contents of our envelope. Some people argue that the way out is to observe that the distribution has an infinite mean, so that the expected gain from switching is ∞ − ∞, which is not defined. The flaw in this argument is that we are not interested in the average gain but in the gain for each value of the distribution. Suppose that you could choose between the first and the second value in the pairs ( 1, 3 ), ( 10, 30 ), ( 100, 300 ), ( 1000, 3000 ), ... You certainly would choose the second in every pair. The fact that both have infinite means is irrelevant.

Let's
look at the raw gain for each value in every pair of envelopes, ie the gain in
switching from 1 to 2, the loss in switching from 2 to 1, the gain in going
from 2 to 4, the loss in switching from 4 to 2, etc. Note that G gives us the **net**
gain after these positive and negative values have been summed for each value
of n. The **raw** gain is 1/6( 2 – 1 ) for the first, 1/6( 1 – 2 ) for the
second, and 1/9( 4 – 2 ) for the third, giving:

1/6 - 1/6 + 2/9 - 2/9 + 8/27 - 8/27 + 32/81 - 32/81 + 128/243 - 128/243 + ... (S)

The
terms in this infinite series of alternating values repeat with opposite sign then
increase. It is a well-known property of such series that they **cannot **be
summed. See Footnote 1. Note that the individual terms
of S are given by (4/3)^{n}/6, which increases steadily. So we can
obtain nonsensical results by summing S in various ways. When we calculate the
gain using method 2 we are in effect summing the terms of the series S by
bracketing its terms as in (b) in the footnote:

1/6 + ( -1/6 + 2/9 ) + ( -2/9 + 8/27 ) + ( -8/27 + 32/81 ) + ( -32/81 + 128/243 ) + ... (A)

The sums of the bracketed terms (ie 1/18, 2/27, 8/81 etc) are given by the formula G for n > 0, ie each term gives the gain for a particular value of n. Of course, one is tempted to say that 1/6 - 1/6 + 2/9 - 2/9 + 8/27 - 8/27 + 32/81 - 32/81 + 128/243 - 128/243 + ... is simply zero, but in truth the sum is undefined. In fact, getting zero is exactly what we do when we apply method 1, ie observing that if we have x in our envelope then we gain x by going to 2x, and that if we have 2x then we lose x by going to x. Yet method 1 also sums A, and hence is as invalid as method 2 (the gain calculated from x to 2x vs from x to x/2), though unlike method 2, it does not lead us to a paradoxical conclusion - the insatiable itch to switch.

My contention is that taking the sum of an increasing oscillating series, which is what we do to calculate the individual gains in A, is invalid. Logically it makes sense to bracket the series as in A because this should give us the net gain for each value of n. The catch is that mathematically this is invalid due to the weird behaviour of infinite increasing series whose terms have alternating signs. To understand this intuitively, imagine that the "payback" grows, is being pushed ever further away, and is never realised due to the infinite length of the series.

Of course the problems with oscillating series only occur if they are infinite. Any finite series such as 1 - 1 + 2 - 2 + 3 - 3 + 4 - 4 + 5 - 5 (ie with only 10 terms), can be summed in any order and will always give the same result. Likewise, no finite distribution can generate the paradox because the value at the end of a finite chain of envelopes acts as the payback. This is shown in the postscript.

What
about the open envelope version of the paradox? If we open an envelope and find
that it contains 128, should we switch? Substituting n = 5 in G gives the gain
of 2^{7}/3^{6} = 128/729. It seems that we can validly extract
just one bracketed element from the oscillating series A. Or can we? Recall
that the reason we cannot sum all of A is that changing the order of the terms
changes the sum. By extracting a single bracketed term out of A we are
instantiating the same problem, ie we are choosing a specific order so as to
obtain a specific result. This is arbitrary, and hence is just as fallacious in
the case of a single term as in the infinite one.

**Part 3: The general case**

What
of the general case? This involves every possible distribution as the source of
the two envelopes. If you can see why I invoke the notion of distributions then
read on, else see Footnote 2. Finite chains present no
problems, so let's look at an infinite chain, where the pair ( 2^{n}, 2^{n+1}
) occurs with probability P(n) for n >= 0. What is the gain for switching
for each value of n? Here is the net gain for n = 0, 1, 2 and 3 (for simplicity
I omit the factor of ½ that should precede every term):

P(0) * 1 = P(0)

P(0)
( 2^{0} - 2^{1} )
+ P(1) ( 2^{2} - 2^{1}
) = -
P(0) + P(1) *
2

P(1)
( 2^{1} - 2^{2} ) +
P(2) ( 2^{3} - 2^{2} )
= - P(1) * 2
+ P(2) * 4

P(2)
( 2^{2} - 2^{3} )
+ P(3) ( 2^{4} - 2^{3}
) = -
P(2) * 4 + P(3) * 8

Writing out the total gain for all values we see that our friend, the alternating series, is back:

P(0) - P(0) + P(1) * 2 - P(1) * 2 + P(2) * 4 - P(2) * 4 + P(3) * 8 - P(3) * 8 + P(4) * 16 + …

d_{0}
_{ }- d_{0} + d_{1}
- d_{1 }+ d_{2} - d_{2}
+ d_{3} - d_{3 }+
... (A1)

where
d_{0} ( = P(0) ) is the gain in going from n = 0 to n = 1, -d_{0}
is the loss in going from n = 1 to n = 0, d_{1} is the gain in going
from n = 1 to n = 2 and so on. A1 gives the gain calculated using method 1. The
gain using method 2 is found by bracketing the series as we did in Part 2 (when
we got the series A):

d_{0}
+ ( - d_{0 }+ d_{1} ) + ( - d_{1} + d_{2 })_{
}+ ( - d_{2} + d_{3} ) +
...
(A2)

Is
it valid to sum the above? It may be valid or not, depending on the nature of
the series. Essentially there are two possibilities, the series is **well-behaved
**or it isn't. By "well-behaved" I mean that its terms can be
summed in **any** order to produce the same result (which may be ∞,
-∞, or a real number). One thing we know for sure: if A2 is well-behaved
then it will sum to zero, in which case there is no paradox, as 0 = (d_{0}
- d_{0}) + (d_{1} - d_{1}) etc. If it is not
well-behaved then we should not try to sum it. RIP.

Or
are you still unconvinced? You may think that if the series converges when
summed in a particular order then we should accept the total it calculates and
hence that the paradox is unresolved. Clark and Shackel present a *conditionally
convergent* series for the gain that is perhaps the sharpest instantiation
of the paradox. My answer to this is in Footnote 3.

What
about if we open an envelope in the case where A2 is well-behaved? We can
choose to use the term ( - d_{n - 1} + d_{n} ) for the gain for
the value 2^{n}. Does this revive the paradox? No. Some of these terms will give gains, others losses,
and all will exactly balance out, since A2 is well-behaved. Finally, if we open
an envelope in the case where A2 is **not** well-behaved then the argument
at the end of Part 2 tells us that we cannot choose an arbitrary grouping of
part of A1 and proclaim this to give a valid answer.

You
may object that I have not taken the general case. For instance, the
distribution could be a decreasing one, such as ( 2^{-n}, 2^{-n-1}
) with probability P(n) for n >= 0. It could also be double-ended - an
increasing distribution appended to a decreasing one. I claim that the points
made in the preceding two paragraphs are still valid.

If the open envelope contains 100 you might counter that we are not faced with infinitely many cases but with just two possibilities: ( 100, 50 ) and ( 100, 200 ). This is the very heart of the paradox: the simple situation just described derives from the general case of infinitely many possible distributions of pairs of envelopes, and we cannot ignore the probabilistic origins of our deceptively simple two cases. To see it as an each way bet on ( 100, 50 ) vs ( 100, 200 ) is equivalent to saying that the original distribution consisted of only the two pairs ( 100, 50 ) and ( 100, 200 ) and that one of these has been randomly chosen. You cannot throw away the history of the problem. Still not convinced? Suppose you had a box containing two white pearls and one black one. You withdraw two pearls without looking at them, then you pick one of these and note that it is black. What is the chance that the second pearl is also black? Ms Maple, your sixth form teacher was right, history matters!

**Conclusion**

Whether we open an envelope or not, we should be indifferent to switching because of symmetry: nothing suggests that one envelope is preferable to the other.

**Discussion**

It is sad that such a childishly simple problem has no simple solution. I believe that the two envelope paradox is the simplest logical puzzle that is (a) solvable and (b) has no easy solution, ie no solution that can be readily explained to people without mathematical training. Even now I find the solution unsatisfying.

A paradox such as the two envelope puzzle consists of two arguments which lead to contradictory conclusions. To resolve such a paradox it is not sufficient to claim that one argument is totally sound and therefore that the other must be wrong. The only way to resolve such a paradox is to show where there is a mistake in one of the competing arguments. We must show exactly at what point the argument in favour of switching goes wrong, without reference to the competing argument and even without reference to the contradictions that result from the switching argument. In this case the mistake is the assumption that it is valid to calculate the gain. Method 1 and method 2 sum the same series but in a different way, producing different sums. Both methods are actually invalid. This puzzle is an infinity paradox, though the infinity is hidden.

Essentially, the two envelope paradox is a zero-sum game in which the choice of the order of summation can give a false positive value for the gain.

It is vaguely analogous to Bertrand's Paradox [1] , in which the question asked can be answered in logically correct ways that give different answers. This forces us to conclude that Bertrand’s question is not sufficiently precise. In the two envelope paradox we are able to calculate the gain in a number of ways, each of which is mathematically sound, yet which give different answers for what must be a unique value. In fact we know that the value is actually zero by symmetry. From this we conclude that we cannot make the calculation. Specifically, that we cannot sum an oscillating or conditionally convergent series to give us the answer.

To
understand the solution intuitively let’s look at a real-world version of the
puzzle, where only finite amounts are possible. Let's assume that we know in
advance that the envelopes can only contain one of the pairs ( 1, 2 ), ( 2, 4
), ( 4, 8 ) up to ( 2^{99}, 2^{100} ) and that the chance of
getting each pair is the same, ie 1/100. In the general case it did not matter
whether we opened the first envelope or not. In this particular finite version
it does. If the paradox statement does **not **include opening the envelope
then indifference is still the correct solution. It is true that we will gain,
statistically speaking, if our envelope contains 2^{99} or less, but
all these gains are exactly balanced by the loss if we happen to pick the
envelope that contains 2^{100}. Here is the calculation where we add
the gain in going from x to 2x and subtract the loss in going from x to x/2 (method
2):

If our envelope contains 1 then the gain is 1/200( (2 - 1) ) = 1/200

If our envelope contains 2 then the gain is 1/200( (1 - 2) + (4 - 2) ) = 1/200

If our envelope contains 4 then the gain is 1/200( (2 - 4) + (8 - 4) ) = 2/200

If our envelope contains 8 then the gain is 1/200( (4 - 8) + (16 - 8) ) = 4/200

If
our envelope contains 2^{99} then the gain is 1/200( (2^{98} -
2^{99}) + (2^{100} - 2^{99}) )

= 1/200(
-2^{98} + 2^{99} )
= 2^{98}/200

If
our envelope contains 2^{100} then the gain is 1/200( 2^{99} -
2^{100} ) = -2^{99}/200

Summing,
we find that the expected gain = 1/200( 1 + 1 + 2 + 4 + 8 + ... + 2^{98}
- 2^{99} ) = 0.

We
could have avoided this work. Summing a finite series is not affected by the
order in which we add its terms, ie we know that the answer has to be zero
because of method 1. Applying method 2 above we also got 0 gain because of the
boundary conditions. By contrast, in the** infinite **case there is no
payback, as there is in the last term of the calculation above. This is because
in the general case, no matter what our envelope contains, the other one could
contain twice as much. I assert that the general case of the **finite**
version of the two envelope paradox (ie where the highest value is at most N,
where N is a fixed number, though its value need not be known) will exhibit
payback behaviour as above. Furthermore, I assert that the gain using both
method 2 and method 1 will be zero.

If
the paradox statement for the specific finite case above **does** include
the extra information that we opened an envelope and found 128 then we should **switch**.
Our expected gain is 1/2( (64 - 128) + (256 - 128) ) = 32. What about the
symmetry argument? It no longer applies because we now *know where we are in
the distribution.* Specifically we know that our envelope does not contain
the maximum amount of 2^{100}, so switching is statistically
beneficial. Had we opened our envelope and found 2^{100} we would not
switch. Finally, what about method 1? This too no longer applies because it
only gives us the overall answer for all cases, whereas we are no longer
interested in the solution for all values (which is described above), but only
in the case where our envelope contains 128 and the other holds 64 or 256.

Tad Boniecki

Posted 23 Oct 2006