I wrote these notes in February 2016 for an Analysis 2 tutorial when I was a teaching assistant at McGill, and always intended to put them here eventually; before August 2018 though, I hadn’t translated them to something web-friendly and only had posted a PDFThe web version has slightly improved wording in some parts.
.
Cantor’s Set
Cantor’s set is an interesting subset of \([0,1]\), with properties that help illuminate concepts in analysis. It can often serve as a counter-example or edge-case on which to test ideas, and to construct further unusual objects; one such object is the Cantor function, which we’ll define here as well. To me, one of the most immediate clarifications provided by the Cantor set is the idea that looking at on object in different ways yields different notions of size, and these do not have to play nicely with each other. As we will see, the Cantor set is nowhere dense, yet uncountable.
Construction
We will define a sequence of sets \(\{E_n\}_{n=0}^\infty \subseteq [0,1]\), and define the Cantor set (or “Cantor ternary set”) as \(K = \bigcap_{n=0}^\infty E_n\).
\(E_0\).
To start, take \[\begin{aligned} E_0 = [0,1]. \end{aligned}\]
\(E_1\).
Divide the interval in three equal pieces and throw away the middle one to define \[\begin{aligned} E_1 = [0,\tfrac{1}{3}] \cup [\tfrac{2}{3},1]. \end{aligned}\]
\(E_2\).
Repeat this process by throwing away the middle third of each interval in \(E_1\) to obtain \[\begin{aligned} E_2 = [0,\frac{1}{9}]\cup [\frac{2}{9},\frac{3}{9}]\cup [\frac{6}{9},\frac{7}{9}]\cup [\frac{8}{9},1]. \end{aligned}\]
From top to bottom, \(E_0\) to \(E_4\).
In general, we recursively define \(E_n\) in terms \(E_{n-1}\) by discarding the middle third of the intervals which constitute \(E_{n-1}\). We may write this definition as \[\begin{aligned} E_n = \frac{E_{n-1}}{3}\cup \left( \frac{2}{3}+ \frac{E_{n-1}}{3} \right), \qquad E_0 = [0,1] \end{aligned}\]
where we are using the notation of multipling and adding to sets by multiplying and adding to their elements: for a set \(A\subset \mathbb{R}\) and \(c,d\in \mathbb{R}\), \[\begin{aligned} cA + d := \{ca+d: a\in A\}.\end{aligned}\]
In this way, we create a decreasing sequences of sets \[E_0 \supset E_1 \supset \ldots \supset E_n \supset \ldots.\]
Note: By construction, \(E_n\) is the disjoint union of \(2^n\) intervals, each of length \(3^{-n}\). This is a key property we will use in our proofs of the properties of the Cantor set.
Properties
Let \[K := \bigcap_{n=0}^\infty E_n.\] \(K\) is called the Cantor set, and satisfies the following properties.
Theorem 1
\(K\) is closed.
\(\operatorname{int}K = \emptyset\).
\(K\) has no isolated points: that is, \(\forall x \in K, \forall \epsilon >0, \exists\ y \in K\) such that \(y\neq x\) and \(y\in B(x,\epsilon)\), where \(B(x,\epsilon):=(x-\epsilon,x+\epsilon)\) is the open ball (interval) of radius \(\epsilon\) around \(x\).
\(x\in K \iff x = \sum_{i=1}^\infty \frac{a_k}{3^k}\) where \((a_k)\) is a sequence with each \(a_k \in \{ 0,2\}\).
\(K\) is uncountable.
Note: a set is called nowhere dense if it’s closure has empty
interior. Thus (1) and (2) imply \(K\) is nowhere dense. In contrast, \(\mathbb{Q}\cap [0,1]\) has empty interior, but is dense in \([0,1]\).
\(E_n\) is closed for each \(n\) so \(K =\bigcap_{n=0}^\infty E_n\) is closed as well.
(By contradiction). Assume \(\operatorname{int}K \neq \emptyset\). Then \(\exists x \in \operatorname{int}K\). Thus, for some \(\delta>0\), \[B(x,\delta)\subseteq \operatorname{int}K \subseteq K.\] Choose \(n\in \mathbb{N}\) large enough such that \(3^{-n} < 2\delta\). We have \[\begin{aligned} (x-\delta,x+\delta) \subseteq K \subseteq E_n. \end{aligned}\] But \(E_n\) is a disjoint union of intervals of length \(3^{-n} < 2\delta\). This is a contradiction.
Key observation: endpoints of intervals in \(E_n\) are not removed in subsequent \(E_m\) (for \(m>n\)). All endpoints, \(\{ 0,1,\tfrac{1}{3},\tfrac{2}{3},\ldots \} \subset K\).
Let \(x\in K\) and \(\epsilon > 0\) be given. We choose \(n\) large enough so that \(3^{-n} < \epsilon\). Then \(x\in E_n \implies x\in [a,b]\) for an interval \([a,b] \subset E_n\) which is one of the \(2^n\) intervals making up \(E_n\). Since the length of each interval is \(3^{-n}\), we have \(b-a < \epsilon\). Then we have both \[|x-b| < \epsilon, \qquad |x-a| < \epsilon.\] Hence, even if \(x=a\) or \(x=b\), we have found \(y\neq x\) so that \(y \in K\cap B(x,\epsilon)\).
We want to show \(x\in K \iff x = \sum_{k=1}^\infty \frac{a_k}{3^k}\) for \(a_k \in \{ 0,2\}\).
First, let us notice and deal with a point of confusionfor Eric
. Consider \(\tfrac{1}{3} \in K\). Then \(\tfrac{1}{3} = \sum_{k=1}^\infty \frac{a_k}{3}\). We could obtain such ternary expansion by \(a_1 = 1\) and \(a_n = 0\) for \(n >1\), which doesn’t fit our criteria. But we can choose other coefficients: Note we have used the geometric series \[ \sum_{k=0}^\infty a^k = \frac{1}{1-a} \] which converges absolutely for \(|a| < 1\); we will use this several times.
\[\sum_{k=2}^\infty \frac{2}{3^n} = 2\left(\frac{1}{9}\right) \sum_{k=0}^\infty \left(\frac{1}{3}\right)^k = \frac{2}{3^2} \frac{1}{1-(1/3)} = \frac{1}{3} .\] In other words, ternary expansions are not unique. However, ternary expansions which only contain \(0\)’s and \(2\)’s are unique. That is the content of the following lemmaLemma 1 If \[\begin{aligned} \sum_{k=1}^\infty \frac{a_k}{3^k} = \sum_{k=1}^\infty \frac{b_k}{3^k}\end{aligned}\] where each \(a_k,b_k \in \{0,2\}\), then \(a_k=b_k\) for all \(k\in \mathbb{N}\).Suppose not; then there is a minimal \(N\) such that \(a_n\neq b_n\). Then one of \(a_n\) or \(b_n\) is zero and the other two; wlog take \(a_n=0\), \(b_n=2\). Expanding our sums, \[ \sum_{k=1}^{N-1} \frac{a_k}{3^k} +0 + \sum_{k=N+1}^\infty \frac{a_k}{3^k} = \sum_{k=1}^{N-1} \frac{b_k}{3^k} + \frac{2}{3^N}+\sum_{k=N+1}^\infty \frac{b_k}{3^k}. \] Using \(a_k=b_k\) for \(k<N\), \[ \sum_{k=N+1}^\infty \frac{a_k}{3^k} = \frac{2}{3^N}+\sum_{k=N+1}^\infty \frac{b_k}{3^k}. \qquad(1)\] But \[ \sum_{k=N+1}^\infty \frac{a_k}{3^k} \leq 2 \sum_{k=N+1}^\infty 3^{-k} = 2\cdot3^{-N-1} \sum_{k=0}^\infty 3^{-k} =2\cdot 3^{-N-1} \frac{1}{1-(1/3)}=3^{-N}. \] So, returning to (1), we have \[ \frac{2}{3^N} + \sum_{k=N+1}^\infty \frac{b_k}{3^k} \leq \frac{1}{3^N} \implies \sum_{k=N+1}^\infty \frac{b_k}{3^k} \leq -\frac{1}{3^N}, \] which of course is a contradiction, since \(b_k\in\{0,2\}\).
Now, let \(S = \{ \sum_{k=1}^\infty \frac{a_k}{3^k}: a_k \in \{0,2\} \}\). Note that \(S\) is the set of numbers which have a ternay expansion without any 1’s, not the set of numbers for which every ternary expansion contains no 1’s.
If \(x\in S\), then \(x = \sum_{k=1}^\infty \frac{a_k}{3^k}\) for some \(a_k\in\{0,2\}\). Let \(x_n= \sum_{k=1}^n \frac{a_k}{3^k}\) be the \(n^{\text{th}}\) partial sum.
Lemma 2 For each \(n\), the partial sum \(x_n\) is a left-endpoint of an interval of \(E_n\).By induction on \(n\). Either \(x_1 = 0 \text{ or } \frac{2}{3}\), either of which is a left endpoint of \(E_1\).
Assume that \(x_{n-1}\) is a left endpoint of \(E_{n-1}\). Then \[[x_{n-1},x_{n-1} + \frac{1}{3^{n-1}} ] \subset E_{n-1}.\] Then by construction of \(E_n\), we can take out the middle thirds to obtain intervals of \(E_n\): \[[x_{n-1},x_{n-1} + \frac{1}{3^n}] \cup [x_{n-1} + \frac{2}{3^n}, x_{n-1} + \underbrace{\frac{1}{3^{n-1}}}_{=\frac{3}{3^n}} ] \subset E_n.\] So \(x_n = x_{n-1} + \frac{a_n}{3^n}\) is a left endpoint of an interval in \(E_n\), whether \(a_n=0\) or \(a_n=2\).
Now, since each \(x_n\) is a left-endpoint of \(E_n\) and we don’t remove endpoints, we have \(x_n\in K\) for all \(n\). But \(K\) is closed so it contains all its limit points, and we have \(x:= \lim_{n\to\infty}x_n \in K\). Thus, \(S \subset K\).
To show that \(K\subset S\), we will prove the contrapositive \[ x\not\in S \implies x \not \in K, \] in order to access properties of \(S\). First notice the following property.
Lemma 3 The intervals removed to obtain \(K\) are all of the form \[I_{k,m} = \left( \frac{3k+1}{3^m}, \frac{3k+2}{3^m} \right): \quad m\in \mathbb{N}, \quad 0 \leq k \leq 3^{m-1} - 1.\] In fact, \[\begin{aligned} [0,1] \setminus E_n = \bigcup_{m=1}^n \bigcup_{k=0}^{3^{m-1}-1} I_{k,m}.\end{aligned}\]The proof, by induction on \(n\), is left as an exercise.
Let us proceed to the proof of the contrapositive. Suppose \(x \not \in S\). Suppose \(x\) contains a ‘1’ in its \(n^{\text{th}}\) digit of its ternary expansion, i.e. \[x = \sum_{k=1}^{n-1} \frac{a_k}{3^k} + \frac{1}{3^n} + \sum_{k=n+1}^\infty \frac{a_k}{3^k}.\] We will take \(n\) to be the first digit which is ‘1’, if there are more than one.
Lemma 4 We have that \(0<\sum_{k=n+1}^\infty \frac{a_k}{3^k} < \frac{1}{3^n}\).If \(a_k = 0\) for all \(k>n\), then our number written in ternary is of the form \(x=0.\star1000\dotsm\), where \(\star\) represents the arbitrary first \(n-1\) digits (each in \(\{0,2\}\)). Then we could choose the expansion \(x=0.\star02222\dotsm\) instead, and avoid having a 1. In other words, \(x\in S\). Similarly, if \(a_k=2\) for all \(k>n\), then we are writing \(x=0.\star1222\dotsm\), so \(x\) admits the expansion \(x=0.\star2222\dotsm\) (using our geometric series again), and hence, \(x\in S\). Thus, since \(x\not \in S\), this tail condition holds.
Now, we may rewrite \[\sum_{k=1}^{n-1} \frac{a_k}{3^k} = \sum_{k=1}^{n-1} \frac{ 3(3^{n - k -1} a_k)}{3^n} = \frac{ 3 \lambda}{3^n}\] where \(\lambda = \sum_{k=1}^{n-1} 3^{n-k-1}a_k \in \mathbb{N}\cup \{0\}\). With this definition, Lemma 4 yields \[\frac{3\lambda + 1}{3^n} < x < \frac{3 \lambda + 2}{3^n}.\]
So \(x \in I_{\lambda, n}\) and thus \(x \not \in K\) by Lemma 3, as we wanted.
\(K\) is uncountable. i.e. \(| K | = |[0,1]|\). We already know \(K\subset [0,1]\), so we know \(|K| \leq |[0,1]|\). Hence, we need to find a surjection from \(K \to [0,1]\) to complete the proof.
We define a function \(F : K \to [0,1]\), such that \(F\) takes \(x\in K\) and changes all of the 2’s in its ternary expansion to 1s and then outputs the number with corresponding binary expansion. We have that \[K = \{ \sum_{k=1}^\infty \frac{a_k}{3^k}: a_k \in \{0,2\} \}.\] For \(x\in K\), we write \(x= \sum_{k=1}^\infty \frac{a_k}{3^k}\) where \(a_k\in\{0,2\}\). Then we define \[F( x) = \sum_{k=1}^\infty\frac{a_k}{2} \frac{1}{2^k}\] Since by Lemma 1 we have a unique ternary expansion of \(x\) that contains no 2’s, we have a well-defined function (there is no ambiguity in how to compute \(F(x)\) given \(x\)). Note: the lower bound of \(F\) on \(K\) is \(0\) (all \(a_k =0\)) and the upper bound is \(1\), achieved when all the \(a_k\) are 1s, using \(\sum \frac{1}{2^k} = 1\).
Let \(y \in [0,1]\). Then \(y\) admits a binary expansion: \(y = \sum_{k=1}^\infty \frac{b_k}{2^k}\) for \(b_k \in \{0,1\}\). Then let \(x = \sum_{k=1}^\infty \frac{2 b_k}{3^k}\), so that \(F(x) = \sum_{k=1}^\infty \frac{b_k}{2^k} = y\).
So we are done.
Let’s notice that the binary expansion of \(y\) does have ambiguity, and thus \(F\) is not injective. For example, \(\frac{1}{2}=0.0111\dotsm\) in binary, and \(\frac{1}{2}=0.100\dotsm\) in binary. Thus \(x_1= 0.022\dotsm\) in ternary and \(x_2 = 0.200\dotsm\) in ternary each have \(F(x_1) = F(x_2) = \frac{1}{2}\). But \(x_1 = \frac{1}{3}\) and \(x_2 = \frac{2}{3}\). This is not a problem; we only needed to find a surjection from \(K\) to \([0,1]\).
Cantor’s function
We defined \(F: K\to [0,1]\), and we wish to extend it to the whole
interval \([0,1]\). Let \(x\) and \(y\) be the left and right endpoints
respectively of the same removed interval \((x,y)\) which was removed in
step \(n\). Then, by Lemma 3Recall our ternary expansion excludes 1s, so we expand \(\frac{1}{3^m} = \sum_{k=m+1}^\infty 3^{-k}\).
,
\[\begin{aligned}
x &= \sum_{k=1}^n a_k 3^{-k} + 0\cdot 3^{-n} + \sum_{k=n+1}^\infty 2\cdot 3^{-k},\\
y&= \sum_{k=1}^n a_k 3^{-k}+ 2\cdot 3^{-n} + \sum_{k=n+1}^\infty 0\cdot 3^{-k}.\end{aligned}\]
Then,
\[
F(x) = \sum_{k=1}^n \frac{a_k}{2} 2^{-k} + 0\cdot 2^{-n} + \sum_{k=n+1}^\infty 1\cdot 2^{-k}.
\]
But \(\sum_{k=n+1}^\infty 2^{-k}=2^{-n}\), so
\[
\begin{aligned}
F(x)&=\sum_{k=1}^n \frac{a_k}{2} 2^{-k} + 2^{-n}. \\
F(y) &=\sum_{k=1}^n \frac{a_k}{2} 2^{-k}+ 1\cdot 2^{-n} + \sum_{k=n+1}^\infty 0\cdot 2^{-k}\\
&= F(x)
\end{aligned}\]
So \(F\) takes on the same values on the endpoints of the removed intervals. But \(K\) together with the removed intervals is \([0,1]\).
Cantor’s function; image taken from wikimedia.
This leads to a natural extension simply by taking \(F\) constant on the removed intervals. That is, we may define a function \(C: [0,1]\to [0,1]\) as \(C(x)=F(x)\) for \(x\in K\), and if \(y\in (a,b)\) a removed interval, \(C(y) = F(a)=F(b)\). This is Cantor’s function. See Figure 5 for a plot of \(C(x)\).
We will prove three properties of \(C\), as follows.
Theorem 2
\(C\) is increasing,
\(C\) is continuous,
\(C\) has zero derivative on \([0,1]\setminus K\).
First we will show that \(C\) is increasing on \(K\), and then extend to \([0,1]\).
Let \(x,y\in K\) with \(x<y\) and set \(x = \sum_{k=1}^\infty x_k 3^{-k}\) and \(y = \sum_{k=1}^\infty y_k 3^{-k}\) their canonical ternary expansionsMeaning expansions with \(x_k\in \{0,2\}\) for all \(k\in \mathbb{N}\).
. If \(x<y\) then they lie in the same interval until some \(n\), and at that point \(y\) is in a later interval. That is, for some \(n\), \(x_n < y_n\), and for all \(k< n\), \(x_k = y_k\). But if \(x_n < y_n\), since both are contained in \(\{0,2\}\), then \(x_n=0\) and \(y_n=2\). So we have \[\begin{aligned} C(x) &= \sum_{k=1}^\infty \frac{x_k}{2} 2^{-k} = \sum_{k=1}^{n-1} \frac{x_k}{2} 2^{-k} + \sum_{k=n+1}^{\infty} \frac{x_k}{2} 2^{-k}, \\ &\leq \sum_{k=1}^{n-1} \frac{y_k}{2} 2^{-k} + 2^{-n}, \\ &\leq \sum_{k=1}^{n-1} \frac{y_k}{2} 2^{-k} + 2^{-n} +\sum_{k=n+1}^{\infty} \frac{y_k}{2} 2^{-k} = C(y).\end{aligned}\] Now, let \(x,y \in [0,1]\), with \(x<y\) and assume \(C(x)>C(y)\). We know then there is some \(x',y' \in K\) with \(C(x) = C(x')\) and \(C(y) = C(y')\), choosing \(x'\) to be the left endpoint of the removed interval to which \(x\) belongs, and similarly with \(y'\). Then, since \(C(x)\neq C(y)\), \(x'\) and \(y'\) must not correspond to the endpoints of the same interval.So then since \(x<y\), we must have \(x'<y'\) as shown in Figure 6. But then since \(x',y'\in K\) we must have \(C(x)= C(x') \leq C(y') = C(y)\), since we showed \(C\) was increasing on \(K\). This is a contradiction. Hence, for \(x,y\in K\) with \(x < y\), we have \(C(x)\leq C(y)\); that is, \(C\) is increasing.
Next, \(C\) is continuous on \([0,1]\). Clearly, \(C\) is continuous at \(x\in [0,1]\setminus K\), because there on a neighborhood of \(x\), \(C\) is constant.
So, let \(x\in K\). We will first show that if \(y\in K\) is close to \(x\), then \(C(x)\) is close to \(C(y)\). WLOG, take \(y>x\) and \(y-x < 3^{-m}\), then, choosing \(\{x_k\}\) and \(\{y_k\}\) as the respective coefficients of \(x\) and \(y\)’s canonical ternary expansions, we have \[\begin{aligned} y-x &= \sum_{k=1}^\infty (y_k - x_k) 3^{-k} < 3^{-m} \\\end{aligned}\]
Then we must have for all \(k< m\), \(x_k = y_k\). To see this, suppose not; let \(\ell < m\) be the minimal index such that \(x_\ell \neq y_\ell\). We have two cases. First, let \(x_\ell = 2\), \(y_\ell = 0\). Then \[\begin{aligned} y - x&= 0 + -2 \cdot 3^{-\ell} + \sum_{k=\ell+1}^\infty \leq -2 \cdot 3^{-\ell} + 3^{-\ell} < 0\end{aligned}\] which is a contradiction. On the other hand, if \(y_\ell =2\) and \(x_\ell=0\), \[\begin{aligned} y-x &= 2\cdot 3^{-\ell} + \sum_{k=\ell+1}^\infty \geq 2 \cdot 3^{-\ell} - 3^{-\ell}= 3^{-\ell} > 3^{-m},\end{aligned}\] a contradiction.
Using this, \[\begin{aligned} C(y) - C(x) &= \sum_{k=1}^\infty \frac{1}{2}(y_k - x_k) 2^{-k} = \sum_{k=m}^\infty \frac{1}{2}(y_k - x_k) 2^{-k} \leq 2^{-m} \end{aligned}\] Clearly then, for any \(\epsilon>0\), we can choose \(m\) large enough such that \(2^{-m}<\epsilon\).
Now, if \(y\not \in K\), then we know there exists \(y'\in K\) such that \(C(y) = C(y')\) and \(|x-y'|\leq |x-y|\), by choosing \(y'\) to be the endpoint of the removed interval that \(y\) resides within; there are two such endpoints, so we choose the one closer to \(x\) so that \(|x-y'|\leq |x-y|\); see Figure 7. But then for \(\epsilon>0\) we know there is a \(\delta>0\) such that if \(|x-y'|< \delta\), then \(|C(x)-C(y')|<\epsilon\). But \(C(y)=C(y')\), so just take \(|x-y|<\delta\).
We choose \(x'\) to be the endpoint closer to \(y\) than \(x\).
For the third property, if we take \(x\in [0,1]\setminus K\), we notice that \(C\) is constant on an open interval around \(x\), so has derivative zero: \[\begin{aligned} \lim_{t\to 0} \left| \frac{C(x+t) - C(t)}{t} \right| = \lim_{t\to 0} \left| \frac{0}{t} \right|=0.\end{aligned}\]