When do two injections make a bijection?

On some proofs of the Schroeder-Bernstein theorem.

A version of this article originally appeared in The Oxford Invariants Society magazine The Invariant in Trinity Term 2015.

1. Motivation: Sets, sizes and injections

How do we know whether two sets are the same size, in other words, that they have the same number of elements? For small, finite sets, it is straightforward: count the number of elements in each set and if the answer is the same, then they have the same size. But what about sets that are very large, or infinite? Using the Diagonal Argument (1), Georg Cantor (1845-1918) proved that there are different sizes of infinity: the set of natural numbers is ‘countably’ infinite whilst the set of real numbers is bigger—‘uncountably’ infinite. Much of set theory research is concerned with investigating the size and ordering of different infinite sets. In the face of these complications, we need a better way to compare the sizes of different sets.

Cantor defined two sets as having the same cardinality (same size) if and only if there exists a bijective function between them, i.e. a one-to-one correspondence between their elements. In practice, however, it can oftentimes be tricky to define the necessary bijection—and easier to define an injective function from each set into the other. If there exist injective mappings in each direction, then we intuitively feel that the two sets ought to be of equal size. But is this true?

The Schroeder-Bernstein theorem says Yes: if there exist injective functions $f: A \to B$ and $g: B \to A$ between sets $A$ and $B$ , then there exists a bijection $h: A \leftrightarrow B$ and so, by Cantor’s definition, $A$ and $B$ are the same size ( $|A| = |B|$ ). Furthermore, if we go on to define $A$ as having cardinality greater than or equal to $B$ ( $|A| \geq |B|$ ) if and only if there exists an injection $g: B \to A$ , then the theorem states that $|A| \geq |B|$ and $|B| \geq |A|$ together imply $|A| = |B|$ . Thus, in set theory the ordering of infinite cardinalities can work as we expect.

2. At the turn of the twentieth century

Like many important theorems in mathematics, the Schroeder-Bernstein theorem has a convoluted history and is arguably misnamed. Cantor first proposed the theorem in 1887 in a letter to Richard Dedekind (1831-1916), but did not publish a proof at that time; the theorem is nevertheless often referred to as the Cantor-Schroeder-Bernstein theorem, or simply the Cantor-Bernstein theorem. Ernst Schroeder (1841-1902) offered a proof in 1896, but this quickly turned out to be erroneous. The following year, both Schroeder and Felix Bernstein (1878 – 1956), who was then an undergraduate student of Cantor’s, independently found correct proofs. Bernstein’s proof became widely known when it was presented by Cantor at the International Congress of Mathematicians in 1897.

Dedekind, in the meantime, had quietly written a proof back in 1887 that, for reasons of his own, he neither published nor notified Cantor about. This proof was not discovered until 1908. Dedekind went on to prove the theorem at least once more, but his name, of course, is never associated with it!

3. ‘A proof of the Schroeder-Bernstein theorem’

Twentieth century proofs of the Schroeder-Bernstein theorem accumulated. R. H. Cox of the University of Kentucky offered a little-known alternative proof in a short note published by The American Mathematical Monthly in May 19683 (2). With a certain amount of paraphrasing, we explain and summarise his approach here. In the opinion of R. H. Cox, this proof ‘is more easily followed and smoother than any of the usual arguments’ (we leave up to the reader to decide whether not they agree).

First we establish the following lemma:

Lemma. If is a set, and , and is an injective function from A into B, then there exists a injective function from onto , that is, is a bijection.

Proof. If $A = B$ , then the identity function is such an $h$ . Therefore, suppose $B$ is a proper subset of $A$ . Define $f^{n}(x)$ such that $f^{0}$ is the identity function and, for each positive integer $k$ , $f^{k} = f^{k-1} \circ f$ . Let $C$ be the set of all elements $y \in A$ for which there exists a non-negative integer $n$ and an element $x \in A\setminus B$ such that $y = f^{n}(x)$ . Then the following hold:

$A \setminus B \subseteq C$ , since for all $x \in A\setminus B$ , $x = f^{0}(x)$ ;
For all $y \in C$ , $x$ and $n$ are unique. This follows from the assumption that $f$ is injective;
$f(C) \subseteq C$ , by the definition of $C$ —for every $a \in f(C)$ , $a = f(y) = f(f^{n}(x)) = f^{n+1}(x)$ for some $y = f^{n}(x) \in C$ , some non-negative integer $n$ and some $x \in A\setminus B$ . Therefore $a \in C$ .

For example, in Fig. 1, since $x = f^{0}(x)$ , $x_{1} = f^{0}(x_{1})$ , and $x_{2} = f^{0}(x_{2})$ , so $x$ , $x_{1}$ , and $x_{2}$ are in $C$ . Since $y_{1} = f^{1}(x_{1})$ and $y_{2} =f^{3}(x_{2})$ , so $y_{1}$ and $y_{2}$ are in $C$ .

Figure 1. Some elements in subset C of A.

For each $z \in A$ , define function $h(z)$ as follows:

h(z) = \begin{cases} f(x) & z \in C \\ x & z \notin C \end{cases}

Then $h$ is a bijection $A \leftrightarrow B$ :

Injection: for all $z \in C$ , this follows from injectivity of $f$ ; for $z \notin C$ this follows from identity;
Surjection: if $b \in B$ and $b \in C$ , then for some positive $n$ , $(n\neq 0)$ , and some $x \in A\setminus B$ , $b = f^{n}(x) = f(f^{n-1}(x)) = f(y)$ where $y = f^{n-1}(x)$ i.e. $y \in C$ . Therefore $f^{-1}(b)$ is accounted for in the first part of the definition of $h$ ; if $b \notin C$ , again this follows from identity. $\Box$

Theorem. For sets $A$ , $B$ , if functions $f: A \to B$ and $g: B \to A$ are injective, then there is a bijection between $A$ and $B$ .

Proof. Since $g \circ f$ is an injective function from $A$ to $g(B) \subseteq A$ , then by the lemma there is a bijection $h: A \to g(B)$ . Thus, $g_{-1} \circ h$ is a bijection between $A$ and $B$ (see Fig. 2). $\Box$

Figure 2.

4. On other proofs and some proof properties

There are many ways to prove the Schroeder-Bernstein theorem. It is worth surveying some of the earliest approaches taken soon after the theorem was first proposed by Cantor. Back then, it was known as the ‘Equivalence theorem’ and was usually stated in its two-set formulation: for sets $A$ and $B$ , if there exists a bijection between $B$ and a proper subset $A_{1}$ of $A$ and a bijection between $A$ and a proper subset $B_{1}$ of $B$ , then there exists a bijection between $A$ and $B$ (4). Cantor himself derived the theorem as a consequence of the well-ordering of the cardinals, a statement that he was not able to prove and which later turned out to be equivalent to the Axiom of Choice (AC). Bernstein, in his proof, noted that given the two-set formulation, every proper subset of $B$ must correspond by a bijection to some proper subset of $A_{1}$ , which for proper subset $B_{1}$ we can call $A_{2}$ , and that $A_{2}$ has the same size as $B_{1}$ and hence the same size as $A$ . He iterated this process to set up an infinite sequence of nested sets $A$ , $A_{1}$ , $A_{2}$ , $A_{3}$ ,… such that all sets of even index have the same size as $A$ and all sets of odd index have the same size as $B$ . This construction allows for a reformulation of sets $A$ and $A_{1}$ in such a way as to demonstrate their size equivalence. It is said that Bernstein thought of this proof while shaving, and that the idea of infinitely nested subsets may have been inspired by nested infinite reflections seen in two mirrors facing each other (5).

On the other hand, the proof currently presented on the Wikipedia entry for the Schroeder-Bernstein theorem (6), attributed to Julius König (1848-1913), has some similarities to Cox’s proof above. Assuming (without loss of generality) that $A$ and $B$ are disjoint, it is possible to construct for each $a \in A$ and each $b \in B$ a unique sequence of elements by repeatedly applying $f$ and $g$ to go right and $f^{-1}$ and $g^{-1}$ to go left (a given sequence may terminate to the left where $f^{1}$ or $g^{-1}$ is not defined). It turns out that the sequences form a partition of the (disjoint) union of $A$ and $B$ , and, proceeding by different cases of sequence termination, for each case either $f$ or $g$ will provide the necessary bijection. Quite a different proof flavour is found in the Oxford undergraduate mathematics Part B Set Theory course (7); here, the proof rests on Tarski’s fixed point theorem (Knaster-Tarski theorem) applied to the power set lattice as a lemma.

It is also interesting to consider some properties of proofs of the Schroeder-Bernstein theorem. Although Cantor originally derived the theorem (indirectly) as corollary of the Axiom of Choice, its proof is actually independent of AC; see, for example, Cox’s proof above. However, there is no way to avoid the Law of the Excluded Middle in the process of proving the theorem, meaning that its various proofs are non-constructive. The theorem is therefore not provable under intuitionistic (constructive) logic.

Taking a categorical approach, we observe that we have so far considered the classical Schroeder-Bernstein theorem as originally formulated for the category of sets with the injection relation. The theorem can be generalised to the Schroeder-Bernstein property: for mathematical objects $X$ and $Y$ , if there exists a monomorphism (injection, or embedding) from $X$ to $Y$ and also from $Y$ to $X$ , then there exists an isomorphism between them. Formulated in this way, the Schroeder-Bernstein property holds for categories of vector spaces, compact metric spaces and well-ordered sets, but not, for example, for general topological spaces.

Finally—in case anyone was worried—the correctness of the classical Schroeder-Bernstein theorem has been established through proof formalisation in almost all the most popular proof-checking software systems, including HOL Light, Mizar, Coq, Is- abelle, Metamath and ProofPower (8). We can breathe a sigh of relief!

~ Footnotes & References ~

(1) G. Cantor (1892) ‘Ueber eine elementare Frage der Mannigfaltigkeitslehre’. Jahresbericht der Deutschen Mathematiker-Vereinigung 1890–1891 1:75-78.

(2) I encountered a citation to Cox’s article in S. R. Lay’s excellent undergraduate textbook, ‘Analysis: With an Introduction to Proof’ (Pearson), currently in its 5th edition.

(3) R. H. Cox (1968) ‘A Proof of the Schroeder-Bernstein Theorem’. The American Mathematical Monthly, 75(5):508.

(4) Proofs of the Cantor-Bernstein Theorem: A Mathematical Excursion. A. Hinkis. 2013, Birkhaeuser: Basel.

(5) Ibid.

(6) http://en.wikipedia.org/wiki/SchrderBernstein_theorem

(7) The Schroeder-Bernstein theorem sees its first mention, without proof, in the current Oxford undergraduate mathematics syllabus in Prelims: Analysis I: Sequences and Series non-examinable section dealing with countability and density properties of $\mathbb{R}$ and $\mathbb{Q}$ . The first full proof of the theorem is encountered in the Part B Set Theory course.

(8) http://www.cs.ru.nl/~freek/100/

ULTRAVALIDITY

When do two injections make a bijection?

1. Motivation: Sets, sizes and injections

2. At the turn of the twentieth century

3. ‘A proof of the Schroeder-Bernstein theorem’

4. On other proofs and some proof properties

Leave a Reply Cancel reply

When do two injections make a bijection?￼

1. Motivation: Sets, sizes and injections

2. At the turn of the twentieth century

3. ‘A proof of the Schroeder-Bernstein theorem’

4. On other proofs and some proof properties

Leave a Reply Cancel reply

When do two injections make a bijection?