What's Wrong? Three Probability Paradoxes

Three famous probability paradoxes reveal that probability is not a property of an event but part of a model — and the same question can have different correct answers depending on how the experiment is set up.

You meet your neighbors on a walk with their son. You know they have two children. What's the probability that the second child is also a boy?

It seems like a children's problem where you just need to "remember the formula," but things aren't so straightforward. If you asked a random passerby, they'd probably say 1/2. A math teacher might answer 1/3. Who's right?

In a sense, both are right. Each one is imagining their own way in which the information about the child was obtained. In fact, that way is part of the problem statement, just a hidden part.

Contrary to popular belief, probability theory doesn't tell you by itself how likely a particular situation is. Before calculating anything, you need to lay the groundwork: idealize the observation, understand what exactly we consider random, and build a model of the experiment. Without this, no formulas will help.

The paradoxes we'll discuss aren't logical errors. They're situations where the very concept of probability starts to waver. They don't break the theory, but expose where it demands particular caution. It's in these places that probability theory becomes especially strange — and especially interesting.

This article tells three such stories. In the first, the same fact yields different probabilities depending on how the observation is structured. In the second, the same object can be "random" in multiple ways. And in the third, the model that the reasoning quietly relies on cannot be made mathematically rigorous at all.

Along the way, we'll discuss what a probabilistic model is, geometric probability, and expected value. And at the end, we'll talk about why in probability theory one problem can have multiple answers and how to live with that.

The Problem with Two Answers

In 1959, the famous American science popularizer Martin Gardner published two nearly identical problems in Scientific American:

Mr. Jones has two children; the older one is a girl. What is the probability that both children are girls?

Mr. Smith has two children. It is known that at least one of them is a boy. What is the probability that both are boys?

The first problem is quite simple. If the older child is a girl, then the younger one is either a boy or a girl. So the probability of two girls is 1/2.

For the second problem, Gardner reasoned as follows: there are four possible pairs — boy/boy, boy/girl, girl/boy, and girl/girl. We know the family has a boy, so three options remain, of which only one fits. Therefore, the probability is 1/3.
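Gardner's counting argument for both problems is easy to check by brute force. Here is a minimal Python sketch, assuming the four boy/girl pairs are equally likely; the helper name conditional_probability is ours and is only for illustration.

```python
from fractions import Fraction
from itertools import product

# All (older, younger) pairs, assumed equally likely: B = boy, G = girl.
FAMILIES = list(product("BG", repeat=2))

def conditional_probability(event, condition):
    """P(event | condition), found by counting equally likely families."""
    kept = [f for f in FAMILIES if condition(f)]
    hits = [f for f in kept if event(f)]
    return Fraction(len(hits), len(kept))

# Mr. Jones: the older child is a girl. Are both children girls?
jones = conditional_probability(
    event=lambda f: f == ("G", "G"),
    condition=lambda f: f[0] == "G",
)

# Mr. Smith: at least one child is a boy. Are both children boys?
smith = conditional_probability(
    event=lambda f: f == ("B", "B"),
    condition=lambda f: "B" in f,
)

print(jones, smith)  # 1/2 1/3
```

Note that the code already commits to one reading of "at least one of them is a boy": it filters the list of families, not the list of children.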

The beauty is that two nearly identical questions give different answers. They're easy to confuse — and that's exactly why people make mistakes.

After publication, the editorial office received hundreds of letters from readers explaining why the answer to the second question is 1/2. Later, Gardner acknowledged: the problem is indeed ambiguous. The same text can describe different ways of obtaining information — and lead to different answers.

To understand what's happening in this problem, we need to do what probability theory cannot work without: build a model.

Probability cannot be assigned to a single event. What are the chances that Mr. Smith is the father of two boys? The question is meaningless until we imagine an experiment that can be repeated. For example: we select a random family from a large city and see who was born into it. We run this experiment again and again and observe in what fraction of cases both children turn out to be boys. After a large number of experiments, this fraction settles near a constant, and that constant is what we call the probability.
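The repeated experiment can be imitated on a computer. The sketch below assumes that each child is a boy or a girl with equal chances, independently of the sibling; the function name fraction_two_boys and the sample sizes are our own choices.

```python
import random

def fraction_two_boys(num_families, rng):
    """Fraction of randomly drawn two-child families with two boys."""
    both_boys = 0
    for _ in range(num_families):
        children = (rng.choice("BG"), rng.choice("BG"))
        if children == ("B", "B"):
            both_boys += 1
    return both_boys / num_families

rng = random.Random(0)  # fixed seed so the run is repeatable
for n in (100, 10_000, 100_000):
    # The fraction wobbles for small n and settles as n grows.
    print(n, fraction_two_boys(n, rng))
```

Under these assumptions the printed fractions should drift toward 1/4 as the number of families grows.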

Let's formulate the second question mathematically. A city has families with two children. What is the probability that in a randomly selected family with a son, both children are boys?

But how do we select such a family? We could go through families until we find one with a boy and pick it — then Gardner's reasoning is correct and two boys will appear in every third case. Or we could go through children until we meet a boy and select his family — he has a brother in exactly half the cases, so the answer is 1/2!

Different answers arise not because of the wording, but because of how the experiment is set up. We seem to be asking the same question — but we mean different procedures. That's where the paradox lies.
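Both selection procedures can be run side by side on the same simulated city. This is only a sketch: the city is hypothetical, each child is assumed to be a boy or a girl with equal chances, and the names pick_by_family and pick_by_child are ours.

```python
import random

def make_city(num_families, rng):
    """A hypothetical city: each family is a pair of children, 'B' or 'G'."""
    return [(rng.choice("BG"), rng.choice("BG")) for _ in range(num_families)]

def pick_by_family(city, rng):
    """Go through random families until one has a boy; return that family."""
    while True:
        family = rng.choice(city)
        if "B" in family:
            return family

def pick_by_child(city, rng):
    """Go through random children until one is a boy; return his family."""
    while True:
        family = rng.choice(city)
        child = rng.choice(family)  # every family has two children
        if child == "B":
            return family

def share_two_boys(picker, city, rng, trials):
    """Fraction of selected families in which both children are boys."""
    hits = sum(picker(city, rng) == ("B", "B") for _ in range(trials))
    return hits / trials

rng = random.Random(0)
city = make_city(10_000, rng)
print(share_two_boys(pick_by_family, city, rng, 50_000))  # close to 1/3
print(share_two_boys(pick_by_child, city, rng, 50_000))   # close to 1/2
```

The city and the question are identical in both calls; only the way a family gets selected changes.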

To get a better feel for this idea, let's solve the following problem:

Problem: Five children go on a hike: Andrey, Boris, Vasya, Gleb, and Dasha. Everyone knows each other, except the pairs Andrey–Vasya and Boris–Gleb. What is the probability that a randomly chosen pair of friends turns out to be mixed-gender?

There are eight pairs of friends total. Four of them are two boys, four are a boy and Dasha. So if we simply choose a random pair, the chance is 1/2.

But now let's choose a child, then their friend. If it's Dasha, the pair is definitely mixed-gender. If it's a boy — with probability 1/3. We get: 1/5 + 4/5 × 1/3 = 7/15.

Both here and in the previous paradox, we encounter the same question: how do you choose a random edge in a graph? (The vertices of our graph are children, the edges are pairs of relatives or friends.) You can choose an edge directly, or first choose a vertex, then one of its edges. Each of these models is natural in its own way, but they give different answers.
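The two ways of choosing an edge can be compared exactly on the hiking graph. The sketch below uses exact fractions; the variable names are ours, and "uniformly" is our reading of "randomly" in each model.

```python
from fractions import Fraction
from itertools import combinations

BOYS = ["Andrey", "Boris", "Vasya", "Gleb"]
CHILDREN = BOYS + ["Dasha"]
STRANGERS = {frozenset(("Andrey", "Vasya")), frozenset(("Boris", "Gleb"))}

# Edges of the friendship graph: every pair except the two pairs of strangers.
FRIENDSHIPS = [frozenset(p) for p in combinations(CHILDREN, 2)
               if frozenset(p) not in STRANGERS]

def is_mixed(pair):
    """Dasha is the only girl, so a pair is mixed exactly when she is in it."""
    return "Dasha" in pair

# Model 1: choose an edge uniformly.
by_edge = Fraction(sum(is_mixed(p) for p in FRIENDSHIPS), len(FRIENDSHIPS))

# Model 2: choose a child uniformly, then one of that child's friends uniformly.
by_vertex = Fraction(0)
for child in CHILDREN:
    own_pairs = [p for p in FRIENDSHIPS if child in p]
    mixed_share = Fraction(sum(is_mixed(p) for p in own_pairs), len(own_pairs))
    by_vertex += Fraction(1, len(CHILDREN)) * mixed_share

print(by_edge, by_vertex)  # 1/2 7/15
```

The second model favors edges that touch vertices with few neighbors, which is why the two numbers differ.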

The Most Random Chord

To understand the next paradox, we'll need to recall some school geometry.

Bertrand's Paradox: A chord is drawn at random in a circle. What is the probability that it's longer than the side of an equilateral triangle inscribed in the circle?

This seems clear enough: it's a problem requiring careful calculation. But as soon as you start clarifying what exactly "randomly drawn chord" means, it turns out the answer depends on it. There are three beautiful solutions — and all with different results.

This paradox was first described in 1889 by French mathematician Joseph Bertrand in his book "Calculus of Probabilities." But the idea itself is much older. It traces back to the model of geometric probability, proposed by another Frenchman — Georges-Louis Buffon — back in the 18th century.

Imagine: a point falls randomly onto a target. What is the probability that it lands in a marked area? In this model, the probability is taken to be proportional to the area of that region. This is what we'll use when analyzing three different ways to choose a random chord.

Method 1: Random Point on the Circle

We connect two random points on the circle. Due to symmetry, it doesn't matter where we start, so we can fix one end and choose the second randomly. We inscribe an equilateral triangle in the circle with a vertex at the fixed point. It divides the circle into three equal arcs. The chord will be longer than the triangle's side exactly when its second endpoint falls on the middle arc, the one opposite the fixed point. So the probability is 1/3.

Method 2: Random Point on the Radius

We fix a radius of the circle. We choose a random point on it and draw the chord through that point perpendicular to the radius. We inscribe an equilateral triangle so that one of its sides is parallel to the chord. That side bisects the radius, so the chord is longer than the side whenever the point lies in the half of the radius nearer the center. This happens in 1/2 of cases.

Method 3: Random Point Inside the Circle

We choose a random point inside the circle. We draw the radius through this point and then the chord through the point perpendicular to that radius, so the point is the chord's midpoint. The chord will be longer than the triangle's side if and only if the point lies inside the circle inscribed in the triangle. The radius of the inscribed circle is half the radius of the circumscribed one, so its area is four times smaller. Hence the probability is 1/4.

That's the whole paradox. We asked a simple question and got three different answers — not because someone made a mistake, but because "random selection" can mean very different things. Each model seems natural, but each describes its own experiment. And therefore — its own probability.
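The three methods are easy to try numerically. Below is a rough Monte Carlo sketch on a unit circle, where the side of the inscribed triangle equals the square root of 3. The function names, the seed, and the number of trials are our assumptions, and the estimates are only approximate.

```python
import math
import random

SIDE = math.sqrt(3)  # side of the equilateral triangle inscribed in a unit circle

def chord_from_endpoints(rng):
    """Method 1: join two uniformly random points on the circle."""
    a = rng.uniform(0, 2 * math.pi)
    b = rng.uniform(0, 2 * math.pi)
    return 2 * abs(math.sin((a - b) / 2))

def chord_from_radius_point(rng):
    """Method 2: uniform point on a fixed radius, chord perpendicular to it."""
    d = rng.uniform(0, 1)  # distance from the center to the chord
    return 2 * math.sqrt(1 - d * d)

def chord_from_disk_point(rng):
    """Method 3: a uniform point inside the circle is the chord's midpoint."""
    while True:  # rejection sampling from the enclosing square
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            return 2 * math.sqrt(1 - (x * x + y * y))

def share_longer(sampler, rng, trials=100_000):
    """Fraction of sampled chords that are longer than the triangle's side."""
    return sum(sampler(rng) > SIDE for _ in range(trials)) / trials

rng = random.Random(0)
print(share_longer(chord_from_endpoints, rng))     # close to 1/3
print(share_longer(chord_from_radius_point, rng))  # close to 1/2
print(share_longer(chord_from_disk_point, rng))    # close to 1/4
```

All three samplers return a perfectly legitimate chord length. They differ only in which quantity is taken to be uniformly distributed.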

The Impossible Game

The next paradox was popularized by none other than Martin Gardner — the same one who published the two-children paradox.

The Wallet Paradox: Two players open their wallets, and whoever has less money gives everything to the other. Is this a fair game?

First, we need to understand what "fair game" even means. Suppose you play a hundred times. In each round you either lose money or gain it. Let's see how much you're up (or down) in total and divide that number by a hundred. This gives the average winnings per game.

Over a large number of repetitions, this average result approaches some constant value — called the expected value of winnings. If it's positive — the game is profitable for you; if negative — unprofitable. If the expected value is zero — the game is fair.

To compute the expected value, we need to average the result across all possible situations. Let's denote the sum in your wallet as X and the sum in your opponent's wallet as Y. Then the experimental result is a pair of random numbers (X, Y), and your winnings are either −X (if you have less) or Y (if you have more).

The averaging can be organized differently — for example, first fix X and consider all games where your wallet contained exactly that sum. Let X = 1000. In half the cases, your opponent is richer: you lose 1000. In the other half — they're poorer, and you win some amount Y less than 1000. On average, you lose. So the game is unprofitable.

But you can reason the other way: fix the opponent's sum. Let Y = 500. Then in half the cases you have more — you win 500. In the other half — you lose an amount less than 500. On average, you win. So the game is profitable.

Or you can put it more simply: the game is symmetric. The rules are identical for both, so the expected value should be zero — meaning the game is fair.

So where's the error? As you've probably guessed, it's in what we mean by a random number. All the reasoning we just presented assumes a strange property: for any X, the probabilities of "getting less than X" and "getting more than X" are equal. This sounds reasonable — but such a model doesn't exist.

For a random number generator, the probability of "getting less than X" grows from 0 to 1 as X increases. This is called the cumulative distribution function. For a uniform distribution on an interval, it grows linearly. For a normal distribution, it follows the S-shaped Laplace function. But for a generator with our properties, it would have to equal 1/2 for every X. So our entire argument relies on a nonexistent model.
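To see how a real cumulative distribution function behaves, here is a small sketch with two standard examples. The wallet range from 0 to 1000 and the normal parameters (mean 500, standard deviation 200) are purely illustrative assumptions, as are the function names.

```python
import math

def uniform_cdf(x, low, high):
    """P(value < x) for a uniform distribution on [low, high]."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)

def normal_cdf(x, mean=0.0, std=1.0):
    """P(value < x) for a normal distribution, via the error function."""
    return 0.5 * (1 + math.erf((x - mean) / (std * math.sqrt(2))))

for x in (0, 250, 500, 750, 1000):
    print(x, uniform_cdf(x, 0, 1000), round(normal_cdf(x, 500, 200), 3))
```

Both functions climb from values near 0 to values near 1 and pass through 1/2 at a single point, X = 500 in this example. Neither stays at 1/2 for every X.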

And if you define the generator honestly, with a specific distribution that actually exists, it turns out the game can be either profitable or unprofitable, depending on the distribution and on the sum you hold. But the paradox disappears.
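Here is one honest model worked out exactly. It is a sketch under our own assumptions, not part of the original problem: both wallets hold a whole number of coins from 1 to n, chosen uniformly and independently, and a tie means nobody pays.

```python
from fractions import Fraction

def expected_winnings_given_x(x, n):
    """Your average winnings when your wallet holds x and the opponent's
    amount Y is uniform on 1..n (assumption: a tie means nobody pays)."""
    total = Fraction(0)
    for y in range(1, n + 1):
        if y < x:
            total += y  # opponent is poorer: you win Y
        elif y > x:
            total -= x  # opponent is richer: you lose X
    return total / n

def expected_winnings(n):
    """Average over your own amount X, also uniform on 1..n."""
    return sum(expected_winnings_given_x(x, n) for x in range(1, n + 1)) / n

n = 10
for x in range(1, n + 1):
    print(x, expected_winnings_given_x(x, n))
print("overall:", expected_winnings(n))
```

In this model, a thin wallet makes the game unprofitable for you and a thick one makes it profitable, so neither "half the cases" argument holds for every X. Averaged over all wallets, the winnings come out to exactly zero, as the symmetry argument suggests.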

How to Get It Right?

Paradoxes in mathematics don't arise by accident or in spite of the theory; they're sought out deliberately. They appear at the moment when old explanations suddenly stop working. This means it's time to figure things out anew and understand what we're actually talking about.

All three problems we discussed are structured the same way: the formulation seems clear, but there are multiple solutions, each giving its own answer. And this isn't a dispute about formulas — it's a conversation about the fact that probability is not a property of an event, but part of a model. It depends on how exactly the experiment is structured.

If you're feeling slightly confused right now, that's a perfectly normal reaction. Granted, there can be multiple models, but how do you know which one is correct? Probability theory itself doesn't answer this question. It helps calculate probabilities once the model is given. But in life, everything is reversed: we select the model to fit the situation, and it's not always clear which one fits best.

This is where statistics enters the stage. We consider several models, compute probabilistic characteristics for each, and then compare them with what we see in real data. If a model predicts that an event is nearly impossible, but it happens frequently, something's wrong with the model. There are various methods, but the idea is always the same: we don't seek truth, we test hypotheses. Whichever model best agrees with reality is the working one.

If you've read this far and feel that your trust in probabilistic intuition has been shaken — good. Now it won't whisper the wrong answer to you before you've had a chance to formulate the right question.

And finally, as a bonus, one more paradox:

Exercise: Choose your answer to this question at random. What is the probability that you'll choose the correct answer?
(a) 25%
(b) 50%
(c) 0%
(d) 25%