Metric Spaces

The main concepts of real analysis on \(\real\) can be carried over to a general set \(M\) once a notion of distance \(d(x,y)\) has been defined for points \(x,y\in M\). When \(M=\real\), the distance we have been using all along is \(d(x,y) = |x-y|\). The set \(\real\) along with the distance function \(d(x,y)=|x-y|\) is an example of a metric space.
Let \(M\) be a non-empty set. A metric on \(M\) is a function \(d:M\times M \rightarrow [0,\infty)\) satisfying the following properties:
  1. \(d(x,y) = 0\) if and only if \(x=y\)
  2. \(d(x,y) = d(y,x)\) for all \(x, y \in M\) (symmetry)
  3. \(d(x,y) \leq d(x,z) + d(z,y)\) for all \(x,y,z\in M\) (triangle inequality)
A metric space is a pair \((M,d)\) where \(d\) is a metric on \(M\).
If the metric \(d\) is understood, then we simply refer to \(M\) as a metric space instead of formally referring to the pair \((M,d)\).
The set \(M=\real\) and function \(d(x,y) = |x-y|\) is a metric space. To see this, first of all \(|x-y| = 0\) iff \(x-y=0\) iff \(x=y\). Second of all, \(|x-y| = |-(y-x)| = |y-x|\), and finally by the usual triangle inequality on \(\real\) we have \[ d(x,y) = |x-y| = |x-z + y-z| \leq |x-z| + |z-y| = d(x,z) + d(z,y) \] for all \(x,y,z\in \real\).
Let \(\mathcal{B}([a,b])\) denote the set of bounded functions on the interval \([a,b]\), that is, \(f\in\mathcal{B}([a,b])\) if there exists \(M \gt 0\) such that \(|f(x)|\leq M\) for all \(x\in [a,b]\). For \(f, g\in \mathcal{B}([a,b])\) let \[ d(f, g) = \sup_{x\in [a,b]} |f(x) - g(x)|. \] We claim that \((\mathcal{B}([a,b]), d)\) is a metric space. First of all, if \(f,g\in \mathcal{B}([a,b])\) then using the triangle inequality it follows that \((f-g)\in\mathcal{B}([a, b])\). Therefore, \(d(f, g)\) is well-defined for all \(f,g\in \mathcal{B}([a,b])\). Next, by definition, we have that \(0\leq d(f, g)\) and it is clear that \(d(f, g) = d(g, f)\). Lastly, for \(f,g,h\in \mathcal{B}([a,b])\) since \[ |f(x)-g(x)| \leq |f(x)-h(x)| + |h(x)-g(x)| \] then \begin{align*} d(f,g) &= \sup_{x\in [a,b]} |f(x)-g(x)|\\[2ex] &\leq \sup_{x\in [a,b]} ( |f(x)-h(x)| + |h(x)-g(x)|) \\[2ex] &\leq \sup_{x\in [a,b]} ( |f(x)-h(x)|) + \sup_{x\in[a,b]} (|h(x)-g(x)|) \\[2ex] &= d(f,h) + d(h,g). \end{align*} This proves that \((\mathcal{B}[a,b], d)\) is a metric space. It is convention to denote the metric \(d(f, g)\) as \(d_\infty(f,g)\), and we will follow this convention.
Let \(M\) be a non-empty set. Define \(d(x,y) = 1\) if \(x\neq y\) and \(d(x,y) = 0\) if \(x=y\). It is straightforward to show that \((M,d)\) is a metric space. The metric \(d\) is called the discrete metric and \((M,d)\) is a discrete space.
Let \((M, d)\) be a metric space and let \(M'\subset M\) be a non-empty subset. Let \(d'\) be the restriction of \(d\) onto \(M'\), that is, \(d':M'\times M'\rightarrow [0,\infty)\) is defined as \(d'(x,y) = d(x,y)\) for \(x,y\in M'\). Then \((M', d')\) is a metric space. We may therefore say that \(M'\) is a metric subspace of \(M\).
Let \(C([a,b])\) denote the set of continuous functions on the inteval \([a, b]\). Then \(C([a,b]) \subset \mathcal{B}([a, b])\) and thus \((C([a,b]), d_\infty)\) is a metric subspace of \((\mathcal{B}([a, b]), d_\infty)\).
For \(f,g\in C([a, b])\) let \(d(f,g) = \int_a^b |f(t)-g(t)|\,dt\). Prove that \(d\) defines a metric on \(C([a, b])\).
Let \(\real^{n\times n}\) denote the set of \(n\times n\) matrices with real entries. For \({A},{B}\in \real^{n\times n}\) define \[ d({A},{B}) = \max_{1\leq i,j\leq n} |a_{i,j} - b_{i,j}|. \] It is clear that \(d({A},{B}) = d({B},{A})\) and \(d({A},{B}) = 0\) if and only if \({A}={B}\). For \({A},{B},{C}\in\real^{n\times n}\) we have that \begin{align*} d({A},{B}) &= \max_{1\leq i,j\leq n} |a_{i,j} - b_{i,j}|\\[2ex] &= \max_{1\leq i,j\leq n} |a_{i,j} - c_{i,j} + c_{i,j} - b_{i,j}|\\[2ex] &\leq \max_{1\leq i,j\leq n}\left( |a_{i,j}-c_{i,j}| + |c_{i,j}-b_{i,j}| \right)\\[2ex] &\leq \max_{1\leq i,j\leq n} |a_{i,j}-c_{i,j}| + \max_{1\leq i,j\leq n} |c_{i,j}-b_{i,j}|\\[2ex] &= d({A},{C}) + d({C},{B}). \end{align*} Hence, \((\real^{n\times n}, d)\) is a metric space.
An important class of metric spaces are normed vector spaces.
Let \(V\) be a vector space over \(\real\) (or \(\mathbb{C}\)). A norm on \(V\) is a function \(\psi:V\rightarrow [0,\infty)\) satisfying the following properties:
  1. \(\psi({x}) =0\) if and only if \({x}=0\),
  2. \(\psi(\alpha{x}) = |\alpha| \psi({x})\) for any scalar \(\alpha \in \real\) and any \({x}\in V\), and
  3. \(\psi({x}+{y}) \leq \psi({x}) + \psi({y})\) for all \({x},{y} \in V\).
The number \(\psi({x})\) is called the norm of \({x}\in V\). A vector space \(V\) together with a norm \(\psi\) is called a normed vector space.
Instead of using the generic letter \(\psi\) to denote a norm, it is convention to use instead \(\|\cdot \|\). Hence, using \(\|{x}\|\) to denote the norm \(\psi({x})\), properties (i)-(iii) are:
  1. \(\norm{{x}} =0\) if and only if \({x}=0\),
  2. \(\norm{\alpha{x}} = |\alpha| \norm{{x}}\) for any scalar \(\alpha \in \real\) and any \({x}\in V\), and
  3. \(\norm{{x}+{y}} \leq \norm{{x}} + \norm{{y}}\) for all \({x},{y} \in V\).
Let \((V,\norm{\cdot})\) be a normed vector space and define \(d:V\times V\rightarrow [0,\infty)\) by \[ d({x},{y}) = \norm{{x}-{y}}. \] It is a straightforward exercise (which you should do) to show that \((V, d)\) is a metric space. Hence, every normed vector space induces a metric space.
The real numbers \(V=\real\) form a vector space over \(\real\) under the usual operations of addition and multiplication. The absolute value function \(x\mapsto |x|\) is a norm on \(\real\). The induced metric is then \((x,y) \mapsto |x-y|\).
The Euclidean norm on \(\real^n\) is defined as \[ \norm{{x}}_2 = \sqrt{ x_1^2 + x_2^2 + \cdots + x_n^2} \] for \({x}=(x_1,x_2, \ldots, x_n)\in\real^n\). It can be verified that \(\norm{\cdot}_2\) is indeed a norm on \(\real^n\). Hence, we define the distance between \({x},{y}\in\real^n\) as \[ \norm{{x}-{y}}_2 = \sqrt{(x_1-y_1)^2 + (x_2-y_2)^2 + \cdots + (x_n-y_n)^2}. \] Notice that when \(n=1\), \(\norm{\cdot}_2\) is the absolute value function since \(\norm{x}_2 = \sqrt{x^2} = |x|\) for \(x\in\real\). When not specified otherwise, whenever we refer to \(\real^n\) as a normed vector space we implicitly assume that the norm is \(\norm{\cdot}_2\) and simply use the notation \(\norm{\cdot}\).
The set \(\mathcal{B}([a, b])\) of bounded functions forms a vector space over \(\real\) with addition defined as \((f+g)(x) = f(x) + g(x)\) for \(f,g\in \mathcal{B}([a, b])\) and scalar multiplication defined as \((\alpha f)(x) = \alpha f(x)\) for \(\alpha\in\real\) and \(f\in \mathcal{B}([a, b])\). For \(f\in \mathcal{B}([a, b])\) let \[ \norm{f}_\infty = \sup_{a\leq x\leq b} |f(x)|. \] It is left as an (important) exercise to show that \(\norm{\cdot}_\infty\) is indeed a norm on \(\mathcal{B}([a, b])\). The induced metric is \[ d_\infty(f,g) = \norm{f-g}_\infty = \sup_{a\leq x\leq b} |f(x) - g(x)|. \] The norm \(\norm{f}_\infty\) is called the sup-norm of \(f\). Notice that the metric in Example 9.1.3 is induced by the norm \(\norm{\cdot}_\infty\).
Two examples of norms on \(C([a, b])\) are \[ \norm{f}_1 = \int_a^b |f(x)|\, dx \] and \[ \norm{f}_2 = \left(\int_a^b |f(x)|^2\,dx\right)^{1/2}. \] These norms are important in the analysis of Fourier series.
For \({A}\in\real^{n\times n}\) let \[ \norm{{A}}_\infty = \max_{1\leq i,j\leq n} |a_{i,j}|. \] It is left as an exercise to show that \(\norm{\cdot}_\infty\) defined above is a norm on \(\real^{n\times n}\).
Let \((M,d)\) be a metric space. For \(x\in M\) and \(r \gt 0\), the open ball centered at \(x\) of radius \(r\) is by definition the set \[ B_r(x) = \{y\in M\;|\; d(x,y) \lt r\}. \]
Interpret geometrically the open balls in the normed spaces \((\real^n, \norm{\cdot})\) for \(n\in \{1,2,3\}\).
Give a graphical/geometric description of the open balls in the normed space \((C([a, b]), \norm{\cdot}_\infty)\).
A subset \(S\) of a metric space \(M\) is called bounded if \(S\subset B_r(x)\) for some \(x\in M\) and \(r \gt 0\).
Let \((M,d)\) be a metric space. Prove that if \(S\) is bounded then there exists \(y\in S\) and \(r \gt 0\) such that \(S\subset B_r(y)\).

Exercises

Let \(H\) be the set of all real sequences \(x=(x_1,x_2,x_3,\ldots)\) such that \(|x_n|\leq 1\) for all \(n\in \N\). For \(x,y\in H\) let \[ d(x, y) = \sum_{n=1}^\infty 2^{-n} |x_n-y_n|. \] Prove that \(d\) is a metric on \(H\). Note: Part of what you have to show is that \(d(x,y)\) is well-defined which means to show that \(\sum_{n=1}^\infty 2^{-n} |x_n-y_n|\) converges if \(x, y \in H\).

Sequences and Limits

Let \(M\) be a metric space. A sequence in \(M\) is a function \(z:\N\rightarrow M\). As with sequences of real numbers, we identify a sequence \(z:\N\rightarrow M\) with the infinite list \((z_n) = (z_1, z_2, z_3, \ldots)\) where \(z_n = z(n)\) for \(n\in\N\).
Let \((M,d)\) be a metric space. A sequence \((z_n)\) in \(M\) is said to converge if there exists \(p\in M\) such that for any given \(\eps \gt 0\) there exists \(K\in \N\) such that \(d(z_n, p) \lt \eps\) for all \(n\geq K\). In this case, we write \[ \lim_{n\rightarrow\infty} z_n = p \] or \((z_n)\rightarrow p\), and we call \(p\) the limit of \((z_n)\). If \((z_n)\) does not converge then we say it is divergent.
One can indeed show, just as in Theorem 3.1.12 for sequences of real numbers, that the point \(p\) in Definition 9.2.1 is indeed unique.
Suppose that \((z_n)\) converges to \(p\) and let \(x_n = d(z_n, p) \geq 0\). Hence, \((x_n)\) is a sequence of real numbers. If \((z_n)\rightarrow p\) then for any \(\eps \gt 0\) there exists \(K\in\N\) such that \(x_n \lt \eps\). Thus, \(x_n=d(z_n,p)\rightarrow 0\). Conversely, if \(d(z_n,p)\rightarrow 0\) then clearly \((z_n)\rightarrow p\).
Prove that a sequence \((f_n)\) converges to \(f\) in the normed vector space \((\mathcal{B}([a, b], \|\cdot\|_\infty)\) if and only if \((f_n)\) converges uniformly to \(f\) on \([a, b]\).
Several of the results for sequences of real numbers carry over to sequences on a general metric space. For example, a sequence \((z_n)\) in \(M\) is said to be bounded if the set \(\{z_n\;|\; n\in\N\}\) is bounded in \(M\). Then:
In a metric space, a convergent sequence is bounded.
Suppose that \(p\) is the limit of \((z_n)\). There exists \(K\in\N\) such that \(d(z_n, p) \lt 1\) for all \(n\geq K\). Let \(r =1 + \max\{d(z_1,p),\ldots,d(z_{K-1},p)\}\) and we note that \(r\geq 1\). Then \(\{z_n\;|\; n\in \N\} \subset B_r(p)\). To see this, if \(n\geq K\) then \(d(z_n, p) \lt 1\leq r\) and thus \(z_n \in B_r(p)\). On the other hand, for \(z_j \in \{z_1, \ldots,z_{K-1}\}\) we have that \(d(z_j, p) \leq \max\{d(z_1,p), \ldots, d(z_{K-1},p)\} \lt r\), and thus \(z_j \in B_r(p)\). This proves that \((z_n)\) is bounded.
A subsequence of a sequence \((z_n)\) is a sequence of the form \((y_k) = (z_{n_k})\) where \(n_1 \lt n_2 \lt n_3 \lt \cdots\). Then (compare with Theorem 3.4.5):
Let \(M\) be a metric space and let \((z_n)\) be a sequence in \(M\). If \((z_n)\rightarrow p\) then \((z_{n_k})\rightarrow p\) for any subsequence \((z_{n_k})\) of \((z_n)\).
A sequence \((z_n)\) in \(M\) is called a Cauchy sequence if for any given \(\eps \gt 0\) there exists \(K\in\N\) such that \(d(z_n, z_m) \lt \eps\) for all \(n,m\geq K\). Then (compare with Lemma 3.6.3- 3.6.4):
Let \(M\) be a metric space and let \((z_n)\) be a sequence in \(M\). The following hold:
  1. If \((z_n)\) is convergent then \((z_n)\) is a Cauchy sequence.
  2. If \((z_n)\) is a Cauchy sequence then \((z_n)\) is bounded.
  3. If \((z_n)\) is a Cauchy sequence and if \((z_n)\) has a convergent subsequence then \((z_n)\) converges.
Proofs for (i) and (ii) are left as exercises (see Lemma 3.6.3- 3.6.4). To prove (iii), let \((z_{n_k})\) be a convergent subsequence of \((z_n)\), say converging to \(p\). Let \(\eps \gt 0\) be arbitrary. There exists \(K\in\N\) such that \(d(z_n, z_m) \lt \eps/2\) for all \(n,m\geq K\). By convergence of \((z_{n_k})\) to \(p\), by increasing \(K\) if necessary we also have that \(d(z_{n_k}, p) \lt \eps/2\) for all \(k \geq K\). Therefore, if \(n\geq K\), then since \(n_K \geq K\) then \begin{align*} d(z_n, p) &\leq d(z_n, z_{n_K}) + d(z_{n_K}, p)\\ & \lt \eps/2 + \eps/2 \\ &= \eps. \end{align*} Hence, \((z_n)\rightarrow p\).
The previous lemmas (that were applicable on a general metric space) show that some properties of sequences in \(\real\) are due entirely to the metric space structure of \(\real\). There are, however, important results on \(\real\), most notably the Bolzano-Weierstrass theorem and the Cauchy criterion for convergence, that do not generally carry over to a general metric space. The Bolzano-Weierstrass theorem and the Cauchy criterion rely on the completeness property of \(\real\) and there is no reason to believe that a general metric space comes equipped with a similar completeness property. Besides, the completeness axiom of \(\real\) (Axiom 2.4.6) relies on the order property of \(\real\) (i.e., \(\leq\)) and there is no reason to believe that a general metric space comes equipped with an order. We will have more to say about this in Section 9.4. For now, however, we will consider an important metric space where almost all the results for sequences in \(\real\) carry over (an example of a result not carrying over is the Monotone convergence theorem), namely, the normed vector space \((\real^n, \norm{\cdot})\). Denoting a sequence in \(\real^n\) is notationally cumbersome. Formally, a sequence in \(\real^n\) is a function \(z:\N\rightarrow\real^n\). How then should we denote \(z(k)\) as a vector in \(\real^n\)? One way is to simply write \(z(k) = (z_1(k), z_2(k), \ldots, z_n(k))\) for each \(k\in\N\) and this is the notation we will adopt. It is clear that a sequence \((z(k))\) in \(\real^n\) induces \(n\) sequences in \(\real\), namely, \((z_i(k))\) for each \(i\in\{1,2,\ldots,n\}\) (i.e., the component sequences). The following theorem explains why \(\real^n\) inherits almost all the results for sequences in \(\real\).
Let \((z(k))=(z_1(k), z_2(k), \ldots, z_n(k))\) be a sequence in the normed vector space \((\real^n,\norm{\cdot})\). Then \((z(k))\) converges if and only if for each \(i\in\{1,2,\ldots,n\}\) the component sequence \((z_i(k))\) converges. Moreover, if \((z(k))\) converges then \[ \lim_{k\rightarrow\infty} z(k) = (\lim_{k\rightarrow\infty} z_1(k), \lim_{k\rightarrow\infty} z_2(k), \ldots, \lim_{k\rightarrow\infty} z_n(k)). \]
Suppose first that \((z(k))\) converges, say to \(p=(p_1,p_2,\ldots,p_n)\). For any \(i\in\{1,2,\ldots,n\}\) it holds that \[ |z_i(k) - p_i| \leq \sqrt{(z_1(k)-p_1)^2 + (z_2(k)-p_2)^2 + \cdots + (z_n(k)-p_n)^2} \] in other words, \(|z_i(k) - p_i| \leq \norm{z(k)-p}\). Since \((z(k))\rightarrow p\) then \(\lim_{k\rightarrow\infty} \norm{z(k)-p} = 0\) and consequently \(\lim_{k\rightarrow\infty} |z_i(k)-p_i| = 0\), that is, \(\lim_{k\rightarrow\infty} z_i(k) = p_i\). Conversely, now suppose that \((z_i(k))\) converges for each \(i\in \{1,2,\ldots,n\}\). Let \(p_i=\lim_{k\rightarrow\infty} z_i(k)\) for each \(i\in\{1,2,\ldots,n\}\) and let \(p=(p_1,p_2,\ldots,p_n)\). By the basic limit laws of sequences in \(\real\), the sequence \begin{align*} x_k &= \norm{z(k) - p}\\ & = \sqrt{(z_1(k)-p_1)^2 + (z_2(k)-p_2)^2 + \cdots + (z_n(k)-p_n)^2} \end{align*} converges to zero since \(\lim_{k\rightarrow\infty} (z_i(k)-p_i)^2 = 0\) and the square root function \(x\mapsto \sqrt{x}\) is continuous. Thus, \(\lim_{k\rightarrow\infty} z(k) = p\) as desired.
Every Cauchy sequence in \(\real^n\) is convergent.
Let \((z(k))\) be a Cauchy sequence in \(\real^n\). Hence, for \(\eps \gt 0\) there exists \(K\in\N\) such that \(\norm{z(k)-z(m)} \lt \eps\) for all \(k,m\geq K\). Thus, for any \(i\in\{1,2,\ldots,n\}\), if \(k,m\geq K\) then \[ |z_i(k)-z_i(m)| \leq \norm{z(k)-z(m)} \lt \eps. \] Thus, \((z_i(k))\) is a Cauchy sequence in \(\real\), and is therefore convergent by the completeness property of \(\real\). By Theorem 9.2.7, this proves that \((z(k))\) is convergent.
Every bounded sequence in \((\real^n, \norm{\cdot})\) has a convergent subsequence.
Let \((z(k))\) be a bounded sequence in \(\real^n\). There exists \(x=(x_1,\ldots,x_n)\in \real^n\) and \(r \gt 0\) such that \(z(k) \in B_r(x)\) for all \(k\in \N\), that is, \(\norm{z(k)-x} \lt r\) for all \(k\in \N\). Therefore, for any \(i\in \{1,2\ldots,n\}\) we have \[ |z_i(k) - x_i| \leq \norm{z(k)-x} \lt r,  \forall\; k\in\N. \] This proves that \((z_i(k))\) is a bounded sequence in \(\real\) for each \(i\in \{1,2,\ldots,n\}\). We now proceed by induction. If \(n=1\) then \((z(k))\) is just a (bounded) sequence in \(\real\) and therefore, by the Bolzano-Weierstrass theorem on \(\real\), \((z(k))\) has a convergent subsequence. Assume by induction that for some \(n\geq 1\), every bounded sequence in \(\real^n\) has a convergent subsequence. Let \((z(k))\) be a bounded sequence in \(\real^{n+1}\). Let \((\tilde{z}(k))\) be the sequence in \(\real^n\) such that \(\tilde{z}(k)\in\real^n\) is the vector of the first \(n\) components of \(z(k)\in\real^{n+1}\). Then \((\tilde{z}(k))\) is a bounded sequence in \(\real^n\) (why>. By induction, \((\tilde{z}(k))\) has a convergent subsequence, say it is \((\tilde{z}(k_j))\). Now, the real sequence \(y_j = z_{n+1}(k_j)\in\real\) is bounded and therefore by the Bolzano-Weierstrass theorem on \(\real\), \((y_j)\) has a convergent subsequence which we denote by \((u_\ell) = (y_{j_\ell})\), that is, \(u_{\ell} = z_{n+1}(k_{j_\ell})\). Now, since \(w_\ell = \tilde{z}(k_{j_\ell})\) is a subsequence of the convergent sequence \((\tilde{z}(k_j))\), \((w_\ell)\) converges in \(\real^n\). Thus, each component of the sequence \((z(k_{j_\ell}))\) in \(\real^{n+1}\) is convergent and since \((z(k_{j_\ell}))\) is a subsequence of the sequence \((z(k))\)) the proof is complete.
Let \(M\) be a metric space.
  1. A subset \(U\) of \(M\) is said to be open if for any \(x\in U\) there exists \(\eps \gt 0\) such that \(B_\eps(x)\subset U\).
  2. A subset \(E\) of \(M\) is closed if \(E^c=M\backslash\hspace{-0.3em} E\) is open.
Prove that an open ball \(B_\eps(x)\subset M\) is open. In other words, prove that for each \(y\in B_\eps(x)\) there exists \(\delta \gt 0\) such that \(B_\delta(y) \subset B_\eps(x)\).
Below are some facts that are easily proved; once (a) and (b) are proved use DeMorgan's Laws to prove (c) and (d).
  1. If \(U_1,\ldots,U_n\) is a finite collection of open sets then \(\bigcap_{k=1}^n U_k\) is open.
  2. If \(\{U_k\}\) is collection of open sets indexed by a set \(I\) then \(\bigcup_{k\in I} U_k\) is open.
  3. If \(E_1,\ldots,E_n\) is a finite collection of closed sets then \(\bigcup_{k=1}^n E_k\) is closed.
  4. If \(\{E_k\}\) is collection of closed sets indexed by a set \(I\) then \(\bigcap_{k\in I} E_k\) is closed.
Below is a characterization of closed sets via sequences.
Let \(M\) be a metric space and let \(E\subset M\). Then \(E\) is closed if and only if every sequence in \(E\) that converges does so to a point in \(E\), that is, if \((x_n)\rightarrow x\) and \(x_n\in E\) then \(x\in E\).
Suppose that \(E\) is closed and let \((x_n)\) be a sequence in \(E\). If \(x\in E^c\) then there exists \(\eps \gt 0\) such that \(B_\eps(x)\subset E^c\). Hence, \(x_n \notin B_\eps(x)\) for all \(n\) and thus \((x_n)\) does not converge to \(x\). Hence, if \((x_n)\) converges then it converges to a point in \(E\). Conversely, assume that every sequence in \(E\) that converges does so to a point in \(E\) and let \(x\in E^c\) be arbitrary. Then by assumption, \(x\) is not the limit point of any converging sequence in \(E\). Hence, there exists \(\eps \gt 0\) such that \(B_\eps(x)\subset E^c\) otherwise we can construct a sequence in \(E\) converging to \(x\) (how>. This proves that \(E^c\) is open and thus \(E\) is closed.
Show that \(C([a, b])\) is a closed subset of \(\mathcal{B}([a, b])\).
Let \(M\) be an arbitrary non-empty set and let \(d\) be the discrete metric, that is, \(d(x,y) = 1\) if \(x\neq y\) and \(d(x,y)=0\) if \(x=y\). Describe the converging sequences in \((M,d)\). Prove that every subset of \(M\) is both open and closed.
For \(a\leq b\), prove that \([a,b]=\{x\in\real\;|\; a\leq x\leq b\}\) is closed.

Exercises

Let \(M\) be a metric space and suppose that \((z_n)\) converges in \(M\). Prove that the limit of \((z_n)\) is unique. In other words, prove that if \(p\) and \(q\) satisfy the convergence definition for \((z_n)\) then \(p=q\).
Let \((M,d)\) be a metric space.
  1. Let \(y\in M\) be fixed. Prove that if \((z_n)\) converges to \(p\) then \(\displaystyle\lim_{n\rightarrow\infty} d(z_n, y) = d(p,y)\).
  2. Prove that if \((z_n)\) converges to \(p\) and \((y_n)\) converges to \(q\) then \(\displaystyle\lim_{n\rightarrow\infty} d(z_n, y_n) = d(p,q)\).
Let \(M\) be a metric space. Prove that if \(U_1,\ldots,U_n\subset M\) are open then \(\bigcap_{k=1}^n U_k\) is open in \(M\).
Let \((M, d)\) be a metric space and let \(E\subset M\). A point \(x\in M\) is called a limit point (or cluster point) of \(E\) if there exists a sequence \((x_n)\) in \(E\), with \(x_n\neq x\) for all \(n\), converging to \(x\). The closure of \(E\), denoted by \(\text{cl}({E})\), is the union of \(E\) and the limit points of \(E\). If \(\text{cl}(E) = M\) then we say that \(E\) is dense in \(M\). As an example, \(\mathbb{Q}\) is dense in \(\real\) since every irrational number is the limit of a sequence of rational numbers.
  1. Prove that \(E\) is dense in \(M\) if and only if \(E\cap U\neq \emptyset\) for every open set \(U\) of \(M\).
  2. Let \(E\) be the set of step functions on \([a,b]\). Then clearly \(E \subset \mathcal{B}([a,b])\). Prove that the set of continuous function \(C([a, b])\) is contained in the closure of \(E\). (Hint: See Example 8.2.6)
  3. Perform an internet search and find dense subsets of \((C([a, b]), \norm{\cdot}_\infty)\) (you do not need to supply proofs).
For \(x\in\real^n\) define \(\norm{x}_\infty = \max_{1\leq i\leq n} |x_i|\) and \(\norm{x}_1=\sum_{i=1}^n |x_i|\). It is not hard to verify that these are norms on \(\real^n\). Prove that:
  1. \(\norm{x}_\infty \leq \norm{x}_2 \leq \norm{x}_1\),
  2. \(\norm{x}_1 \leq n\norm{x}_\infty\), and
  3. \(\norm{x}_1 \leq \sqrt{n}\norm{x}_2\)
Two metrics \(d\) and \(\rho\) on a set \(M\) are equivalent if they generate the same convergent sequences, in other words, \((x_n)\) converges in \((M,d)\) if and only if \((x_n)\) converges in \((M,\rho)\). Prove that \(\norm{\cdot}_1, \norm{\cdot}_2, \norm{\cdot}_\infty\) are equivalent norms on \(\real^n\).
How do we (Riemann) integrate functions from \([a, b]\) to \(\real^n\)? Here is how. First, we equip \(\real^n\) with the standard Euclidean norm \(\|\cdot\|_2\). For any function \(F:[a, b]\rightarrow\real^n\) and any tagged partition \(\dot{\mathcal{P}} = \{([t_{k-1}, t_k], c_k)\}_{k=1}^n\) of \([a, b]\), define the Riemann sum \[ S(F;\dot{\mathcal{P}}) = \sum_{k=1}^n F(c_k) (x_k - x_{k-1}). \] We then say that \(F\) is Riemann integrable if there exists \(v\in\real^n\) such that for any \(\eps \gt 0\) there exists \(\delta \gt 0\) such that for any tagged partition \(\dot{\mathcal{P}}\) of \([a, b]\) with \(\|\dot{\mathcal{P}}\| \lt \delta\) it holds that \[ \|S(F;\dot{\mathcal{P}}) - v\|_2 \lt \eps. \] We then also write that \(\int_a^b F = v\). If \(F\) has component functions \(F=(f_1, f_2, \ldots, f_n)\), prove that \(F\) is Riemann integrable on \([a, b]\) if and only if the component functions \(f_1, f_2, \ldots, f_n\) are Riemann integrable on \([a, b]\), and in this case, \[ \int_a^b F = \Big(\int_a^b f_1, \int_a^b f_2, \ldots, \int_a^b f_n\Big). \] Hint: Recall that for any \(x=(x_1, x_2, \ldots, x_n)\in\real^n\) it holds that \(|x_i| \leq \|x\|_2\).
Let \(\real^\infty\) denote the set of infinite sequences in \(\real\). It is not hard to see that \(\real^\infty\) is a \(\real\)-vector space with addition and scalar multiplication defined in the obvious way. Let \(\ell_1\subset\real^\infty\) denote the subset of sequences \(x=(x_1,x_2,x_3,\ldots)\) such that \(\sum_{n=1}^\infty |x_n|\) converges, that is, \(\ell_1\) denotes the set of absolutely convergent series.
  1. Prove that \(\ell_1\) is a subspace of \(\real^\infty\), i.e., prove that \(\ell_1\) is closed under addition and scalar multiplication.
  2. For \(x\in \ell_1\) let \(\norm{x}_1 = \sum_{n=1}^\infty |x_n|\). Prove that \(\norm{\cdot}_1\) defines a norm on \(\ell_1\).

Continuity

Using the definition of continuity for a function \(f:\real\rightarrow\real\) as a guide, it is a straightforward task to formulate a definition of continuity for a function \(f:M_1\rightarrow M_2\) where \((M_1,d_1)\) and \((M_2,d_2)\) are metric spaces.
Let \((M_1,d_1)\) and \((M_2,d_2)\) be metric spaces. A function \(f:M_1\rightarrow M_2\) is continuous at \(x\in M_1\) if given any \(\eps \gt 0\) there exists \(\delta \gt 0\) such that if \(d_1(y,x) \lt \delta\) then \(d_2(f(y),f(x)) \lt \eps\). We say that \(f\) is continuous if it is continuous at each point of \(M_1\).
Using open balls, \(f:M_1\rightarrow M_2\) is continuous at \(x\in M_1\) if for any given \(\eps \gt 0\) there exists \(\delta \gt 0\) such that \(f(y) \in B_\eps(f(x))\) whenever \(y \in B_\delta(x)\). We note that \(B_\eps(f(x))\) is an open ball in \(M_2\) while \(B_\delta(x)\) is an open ball in \(M_1\). Below we characterize continuity using sequences (compare with Theorem 5.1.2).
Let \((M_1,d_1)\) and \((M_2,d_2)\) be metric spaces. A function \(f:M_1\rightarrow M_2\) is continuous at \(x\in M_1\) if and only if for every sequence \((x_n)\) in \(M_1\) converging to \(x\) the sequence \((f(x_n))\) in \(M_2\) converges to \(f(x)\).
Assume that \(f\) is continuous at \(x\in M_1\) and let \((x_n)\) be a sequence in \(M_1\) converging to \(x\). Let \(\eps \gt 0\) be arbitrary. Then there exists \(\delta \gt 0\) such that \(f(y)\in B_\eps(f(x))\) for all \(y\in B_\delta(x)\). Since \((x_n)\rightarrow x\), there exists \(K\in \N\) such that \(x_n \in B_\delta(x)\) for all \(n\geq K\). Therefore, for \(n\geq K\) we have that \(f(x_n) \in B_\eps(f(x))\). Since \(\eps \gt 0\) is arbitrary, this proves that \((f(x_n))\) converges to \(f(x)\). Suppose that \(f\) is not continuous at \(x\). Then there exists \(\eps^* \gt 0\) such that for every \(\delta \gt 0\) there exists \(y\in B_\delta(x)\) with \(f(y) \notin B_{\eps^*}(f(x))\). Hence, if \(\delta_n = \frac{1}{n}\) then there exists \(x_n \in B_{\delta_n}(x)\) such that \(f(x_n) \notin B_{\eps^*}(f(x))\). Since \(d(x_n, x) \lt \delta_n\) then \((x_n)\rightarrow x\). On the other hand, it is clear that \((f(x_n))\) does not converge to \(f(x)\). Hence, if \(f\) is not continuous at \(x\) then there exists a sequence \((x_n)\) converging to \(x\) such that \((f(x_n))\) does not converge to \(f(x)\). This proves that if every sequence \((x_n)\) in \(M_1\) converging to \(x\) it holds that \((f(x_n))\) converges to \(f(x)\) then \(f\) is continuous at \(x\).
A level set of a function \(f:M\rightarrow\real\) is a set of the form \(E = \{ x\in M\;|\; f(x) = k\}\) for some \(k\in\real\). Prove that if \(f\) is continuous then the level sets of \(f\) are closed sets.
As a consequence of Theorem 9.3.2, if \(f\) is continuous at \(p\) and \(\lim_{n\rightarrow\infty}x_n = p\) then \(\lim_{n\rightarrow\infty} f(x_n) = f(p)\) can be written as \[ \lim_{n\rightarrow\infty} f(x_n) = f(\lim_{n\rightarrow\infty} x_n) \] The sequential criteria for continuity can be conveniently used to show that the composition of continuous function is a continuous function.
Let \(f:M_1\rightarrow M_2\) and let \(g:M_2\rightarrow M_3\), where \(M_1, M_2\), and \(M_3\) are metric spaces. If \(f\) is continuous at \(x\in M_1\) and \(g\) is continuous at \(f(x)\) then the composite mapping \((g\circ f): M_1\rightarrow M_3\) is continuous at \(x\in M_1\).
If \(\lim_{n\rightarrow\infty} x_n = p\) then by Theorem 9.3.2, and using the fact that \(f\) is continuous at \(p\), and \(g\) is continuous at \(g(p)\): \begin{align*} (g\circ f)(p) &= g(f(p))\\[2ex] &= g(f(\lim_{n\rightarrow\infty} x_n))\\[2ex] &= g(\lim_{n\rightarrow\infty} f(x_n))\\[2ex] &= \lim_{n\rightarrow\infty} g(f(x_n))\\[2ex] & = \lim_{n\rightarrow\infty} (g\circ f)(x_n) \end{align*}
In general, given functions \(f, g: M_1\rightarrow M_2\) on metric spaces \(M_1\) and \(M_2\), there is no general way to define the functions \(f\pm g\) or \(fg\) since \(M_2\) does not come equipped with a vector space structure nor is it equipped with a product operation. However, when \(M_2=\real\) then \(f(x)\) and \(g(x)\) are real numbers which can therefore be added/subtracted/multiplied.
Let \((M,d)\) be a metric space and let \(f, g:M\rightarrow \real\) be continuous functions, where \(\real\) is equipped with the usual metric. If \(f\) and \(g\) are continuous at \(x\in M\) then \(f+g\), \(f-g\), and \(fg\) are continuous at \(x\in M\).
In all cases, the most economical proof is to use the sequential criterion. The details are left as an exercise.
Recall that for any function \(f:A\rightarrow B\) and \(S\subset B\) the set \(f^{-1}(S)\) is defined as \[ f^{-1}(S) = \{ x\in A\,|\, f(x) \in S\}. \]
For any function \(f:A\rightarrow B\) prove that \((f^{-1}(S))^c = f^{-1}(S^c)\) for any \(S\subset B\).
For a given function \(f:(M_1,d_1)\rightarrow (M_2,d_2)\) the following are equivalent:
  1. \(f\) is continuous on \(M_1\).
  2. \(f^{-1}(U)\) is open in \(M_1\) for every open subset \(U\subset M_2\).
  3. \(f^{-1}(E)\) is closed in \(M_1\) for every closed subset \(E\subset M_2\).
(i) \(\Longrightarrow\) (ii): Assume that \(f\) is continuous on \(M_1\) and let \(U\subset M_2\) be open. Let \(x\in f^{-1}(U)\) and thus \(f(x) \in U\). Since \(U\) is open, there exists \(\eps \gt 0\) such that \(B_\eps(f(x))\subset U\). By continuity of \(f\), there exists \(\delta \gt 0\) such that if \(y\in B_\delta(x)\) then \(f(y) \in B_\eps(f(x))\). Therefore, \(B_\delta(x) \subset f^{-1}(U)\) and this proves that \(f^{-1}(U)\) is open. (ii) \(\Longrightarrow\) (i): Let \(x\in M_1\) and let \(\eps \gt 0\) be arbitrary. Since \(B_\eps(f(x))\) is open, by assumption \(f^{-1}(B_\eps(f(x)))\) is open. Clearly \(x \in f^{-1}(B_\eps(f(x)))\) and thus there exists \(\delta \gt 0\) such that \(B_\delta(x) \subset f^{-1}(B_\eps(f(x)))\), in other words, if \(y\in B_\delta(x)\) then \(f(y) \in B_\eps(f(x))\). This proves that \(f\) is continuous at \(x\). (ii) \(\Longleftrightarrow\) (iii): This follows from the fact that \((f^{-1}(U))^c = f^{-1}(U^c)\) for any set \(U\). Thus, for instance, if \(f^{-1}(U)\) is open for every open set \(U\) then if \(E\) is closed then \(f^{-1}(E^c)\) is open, that is, \((f^{-1}(E))^c\) is open, i.e., \(f^{-1}(E)\) is closed.
Use Proposition 9.3.7 to prove that the level sets of a function \(f:M\rightarrow\real\) on a metric space \(M\) are closed sets.
A function \(f:(M_1,d_1)\rightarrow (M_2,d_2)\) is called Lipschitz on \(M_1\) if there exists \(K \gt 0\) such that \(d_2(f(x),f(y))\leq K d_1(x,y)\) for all \(x,y\in M_1\). Prove that a Lipschitz function is continuous.
For \({A}\in\real^{n\times n}\) recall that \(\norm{{A}}_\infty = \sup_{1\leq i,j\leq n}|a_{i,j}|\). Let \(\tr:\real^{n\times n}\rightarrow\real\) be the trace function on \(\real^{n\times n}\), that is, \(\tr({A}) = \sum_{i=1}^n a_{i,i}\). Show that \(\tr\) is Lipschitz and therefore continuous.
Let \(\ell_\infty\) denote the set of all real sequences \((x_n)\) that are bounded, that is, \(\{|x_n|\;:\; n\in\N\}\) is a bounded set. If \(x=(x_n)\in\ell_\infty\), it is straightforward to verify that \(\norm{x}_\infty = \sup_{n\in \N} |x_n|\) defines a norm on \(\ell_\infty\) with addition and scalar multiplication defined in the obvious way. Let \(\ell_1\) be the set of absolutely summable sequences \((x_n)\), that is, \((x_n)\in \ell_1\) if and only if \(\sum_{n=1}^\infty |x_n|\) converges. It is not too hard to verify that \(\ell_1\) is a normed vector space with norm defined as \(\norm{x}_1 = \sum_{n=1}^\infty |x_n|\). If \(\sum_{n=1}^\infty |x_n|\) converges then \((|x_n|)\) converges to zero and thus \((x_n)\in\ell_\infty\), thus \(\ell_1 \subset \ell_\infty\).
Fix \(y=(y_n)_{n=1}^\infty \in \ell_\infty\) and let \(h:\ell_1\rightarrow \ell_1\) be defined as \(h(x) = (x_n y_n)_{n=1}^\infty\) for \(x=(x_n)_{n=1}^\infty\). Verify that \(h\) is well-defined and prove that \(h\) is continuous.
Let \(\det:\mat{n}\rightarrow\real\) denote the determinant function. Prove that \(\det\) is continuous; you may use the formula \[ \det({A}) = \sum_{\sigma \in S_n} \left(\text{sgn}(\sigma) \prod_{i=1}^n a_{i, \sigma(i)} \right) \] where \(S_n\) is the set of permutations on \(\{1,2,\ldots,n\}\) and \(\text{sgn}(\sigma)=\pm 1\) is the sign of the permutation \(\sigma\in S_n\).

Exercises

Let \((M,d)\) be a metric space. Fix \(y\in M\) and define the function \(f:M\rightarrow\real\) by \(f(x) = d(x,y)\). Prove that \(f\) is continuous.
Let \((V,\norm{\cdot})\) be a normed vector space. Prove that \(f:V\rightarrow \real\) defined by \(f(x) = \norm{x}\) is continuous. Hint: \(\norm{a}\leq \norm{a-b} + \norm{b}\) for all \(a,b,c\in V\).
Consider \(C([a, b])\) with norm \(\norm{\cdot}_\infty\). Define the function\(\Psi: C([a, b])\rightarrow \real\) by \(\Psi(f) = \int_a^b f(x)\, dx\). Prove that \(\Psi\) is continuous in two ways, using the definition and the sequential criterion for continuity.
Let \(M\) be a metric space and let \(f:M\rightarrow\real\) be continuous. Prove that \(E=\{x\in M\;|\; f(x)=0\}\) is closed.
Consider \(\real^{n\times n}\) as a normed vector space with norm \(\norm{\bs{A}}_\infty=\sup_{1\leq i,j\leq n} |a_{i,j}|\) for \(\bs{A}\in\mat{n}\).
  1. Let \((\bs{A}(k))_{k=1}^\infty\) be a sequence in \(\real^{n\times n}\) and denote the \((i,j)\) entry of the matrix \(\bs{A}(k)\) as \(a_{i,j}(k)\). Prove that \((\bs{A}(k))_{k=1}^\infty\) converges to \(\bs{B}\in\real^{n\times n}\) if and only if for all \(i,j\in\{1,2,\ldots,n\}\) the real sequence \((a_{i,j}(k))_{k=1}^\infty\) converges to \(b_{i,j}\in\real\).
  2. Given matrices \(\bs{X},\bs{Y}\in \mat{n}\), recall that the entries of the product matrix \(\bs{X}\bs{Y}\) are \((\bs{X}\bs{Y})_{i,j} = \sum_{\ell=1}^n x_{i,\ell}y_{\ell,j}.\) Let \((\bs{A}(k))_{k=1}^\infty\) be a sequence in \(\mat{n}\) converging to \(\bs{B}\) and let \((\bs{C}(k))_{k=1}^\infty\) be the sequence whose \(k\)th term is \(\bs{C}(k) = \bs{A}(k)\bs{A}(k)=[\bs{A}(k)]^2\). Prove that \((\bs{C}(k))_{k=1}^\infty\) converges to \(\bs{B}^2\). Hint: By part (a), it is enough to prove that the \((i,j)\) component of \(\bs{C}(k)\) converges to the \((i,j)\) component of \(\bs{B}^2\).
  3. Deduce that if \(\bs{C}(k) = [\bs{A}(k)]^m\) where \(m\in\N\) then the sequence \((\bs{C}(k))_{k=1}^\infty\) converges to the matrix \(\bs{B}^m\).
  4. A polynomial matrix function is a function \(f:\real^{n\times n}\rightarrow\real^{n\times n}\) of the form \[ f(\bs{A}) = c_m\bs{A}^m + c_{m-1}\bs{A}^{m-1} + \cdots + c_1 \bs{A} + c_0\bs{I} \] where \(c_m, \ldots, c_0\) are constants and \(\bs{I}\) denotes the \(n\times n\) identity matrix. Prove that a polynomial matrix function is continuous.
According to the sequential criterion for continuity, if \((z_n)\) and \((w_n)\) are sequences in \(M\) converging to the same point \(p\in M\) and \(f:M\rightarrow\real\) is a function such that sequences \((f(z_n))\) and \((f(w_n))\) do not have the same limit \(f(p)\) (or worse one of them is divergent!) then \(f\) is discontinuous at \(p\). Consider \(f:\real^2\rightarrow\real\) given by \[ f(x,y) = \begin{cases} \frac{2xy}{x^2+y^2}, & (x,y)\neq (0,0)\\[2ex] 0, & (x,y)=(0,0).\end{cases} \] Show that \(f\) is discontinuous at \(p=(0,0)\).

Completeness

Consider the space \(\mathcal{P}[a,b]\) of polynomial functions on the interval \([a,b]\). Clearly, \(\mathcal{P}[a,b]\subset C([a, b])\), and thus \((\mathcal{P}[a,b], \norm{\cdot}_\infty)\) is a metric space. The sequence of functions \(f_n(x) = \sum_{k=0}^n \frac{1}{k!} x^k\) is a sequence in \(\mathcal{P}[a,b]\) and it can be easily verified that \((f_n)\) converges in the metric \(\norm{\cdot}_\infty\), that is, \((f_n)\) converges uniformly in \([a,b]\) (see Example 8.4.8). However, the limiting function \(f\) is not an element of \(\mathcal{P}[a,b]\) because it can be verified that \(f'(x) = f(x)\) and the only polynomial equal to its derivative is the zero polynomial, however, it is clear that \(f(0) = \lim_{n\rightarrow\infty} f_n(0) = 1\), i.e., \(f\) is not the zero function (you may recognize, of course, that \(f(x)=e^x\)). We do know, however, that \(f\) is in \(C([a, b])\) because the uniform limit of a sequence of continuous functions is continuous. The set \(\mathcal{P}[a,b]\) then suffers from the same ``weakness'' as do the rationals \(\rat\) relative to \(\real\), namely, there are sequences in \(\mathcal{P}[a,b]\) that converge to elements not in \(\mathcal{P}[a,b]\). On the other hand, because \((f_n)\) converges it is a Cauchy sequence in \(C([a, b])\) and thus also in \(\mathcal{P}[a,b]\) (the Cauchy condition only depends on the metric) and thus \((f_n)\) is a Cauchy sequence in \(\mathcal{P}[a,b]\) that does not converge to an element of \(\mathcal{P}[a,b]\). The following discussion motivates the following definition.
A metric space \(M\) is called complete if every Cauchy sequence in \(M\) converges in \(M\).
This seems like a reasonable starting definition of completeness since in \(\real\) it can be proved that the Cauchy criterion (plus the Archimedean property) implies the Completeness property of \(\real\) (Theorem 3.6.8). Based on our characterization of closed sets via sequences, we have the following first theorem regarding completeness.
Let \((M,d)\) be a complete metric space and let \(P \subset M\). Then \((P, d)\) is a complete metric space if and only if \(P\) is closed.
If \((z_n)\) is a Cauchy sequence in \((P,d)\) then it is also a Cauchy sequence in \((M,d)\). Since \((M,d)\) is complete then \((z_n)\) converges. If \(P\) is closed then by Theorem 9.2.13 the limit of \((z_n)\) is in \(P\). Hence, \((P,d)\) is a complete metric space. Now suppose that \((P,d)\) is a complete metric space and let \((z_n)\) be a sequence in \(P\) that converges to \(z\in M\). Then \((z_n)\) is a Cauchy sequence in \(M\) and thus also Cauchy in \(P\). Since \(P\) is complete then \(z\in P\). Hence, by Theorem 9.2.13 \(P\) is closed.
We now consider how to formulate the Bolzano-Weierstrass (BW) property in a general metric space. The proof in Theorem 3.6.8 can be easily modifield to prove that the BW property, namely that every bounded sequence in \(\real\) has a convergent subsequence, implies the completeness property of \(\real\). We therefore want to develop a BW-type condition in a general metric space \(M\) that implies that \(M\) is complete. Our first order of business is to develop the correct notion of boundedness. We have already defined what it means for a subset \(E\subset M\) to be bounded, namely, that there exists \(r \gt 0\) such that \(E\subset B_r(x)\) for some \(x\in M\). However, this notion is not enough as the next example illustrates.
Consider \(\mathcal{P}[0,1]\) with induced metric \(\norm{\cdot}_\infty\) and let \(E=\{f \in \mathcal{P}[0,1]\,:\, \norm{f}_\infty \lt 3\}\), in other words, \(E\) is the open ball of radius \(r=3\) centered at the zero function. Clearly, \(E\) is bounded and thus any sequence in \(E\) is bounded. The sequence \(f_n(x) = \sum_{k=0}^n \frac{1}{k!} x^k\) is in \(E\), that is, \(\norm{f_n}_\infty \lt 3\) for all \(n\) (see Example 3.3.6). However, as already discussed, \((f_n)\) converges in \(C[0,1]\) but not to a point in \(\mathcal{P}[0,1]\). On the other hand, \((f_n)\) is a Cauchy sequence in \(\mathcal{P}[0,1]\) and thus \((f_n)\) cannot have a converging subsequence in \(\mathcal{P}[0,1]\) by part (iii) of Lemma 9.2.6. Thus, \((f_n)\) is a bounded sequence in \(P[0,1]\) with no converging subsequence in \(P[0,1]\).
The correct notion of boundedness that is needed is the following.
Let \((M,d)\) be a metric space. A subset \(E\subset M\) is called totally bounded if for any given \(\eps \gt 0\) there exists \(z_1,\ldots,z_N\in E\) such that \(E\subset \bigcup_{i=1}^N B_\eps(z_i)\).
Prove that a subset of a totally bounded set is also totally bounded.
A totally bounded subset \(E\) of a metric space \(M\) is bounded. If \(E\subset B_\eps(x_1)\cup\cdots\cup B_\eps(x_k)\) then if \(r=\max_{2\leq j\leq k} d(x_1,x_j)+\eps\) then if \(x\in E\cap B_\eps(x_j)\) then \(d(x_1,x) \leq d(x_1,x_j) + d(x_j,x) \lt r\). Hence, \(E\subset B_r(x_1)\).
The following shows that the converse in the previous example does not hold.
Consider \(\ell_1\) and let \(E=\{e_1, e_2, e_3, \ldots\}\) where \(e_k\) is the infinite sequence with entry \(k\) equal to \(1\) and all other entries zero. Then \(\|e_k\|_1 = 1\) for all \(k\in \N\) and therefore \(E\) is bounded, in particular \(E\subset B_r(0)\) for any \(r \gt 1\). Now, \(\|e_k - e_j\|_1 = 2\) for all \(k\neq j\) and thus if \(\eps\leq 2\) then no finite collection of open balls \(B_\eps(e_{k_1}), B_\eps(e_{k_2}), \ldots, B_\eps(e_{k_N})\) can cover \(E\). Hence, \(E\) is not totally bounded.
Prove that a bounded subset \(E\) of \(\real\) is totally bounded.
Let \(M\) be a metric space. Then \(M\) is complete if and only if every infinite totally bounded subset of \(M\) has a limit point in \(M\).
Suppose that \((M,d)\) is a complete metric space. Let \(E\) be an infinite totally bounded subset of \(M\). Let \(\eps_n=\frac{1}{2^n}\) for \(n\in\N\). For \(\eps_1\) there exists \(z_1,\ldots,z_{m_1}\in E\) such that \(E\subset \bigcup_{j=1}^{m_1} B_{\eps_1}(z_j)\). Since \(E\) is infinite, we can assume without loss of generality that \(E_1 = E\cap B_{\eps_1}(z_1)\) contains infinitely many points of \(E\). Let then \(x_1=z_1\). Now, \(E_1\) is totally bounded and thus there exists \(w_1,\ldots,w_{m_2}\in E_1\) such that \(E_1\subset \bigcup_{j=1}^{m_2} B_{\eps_2}(w_j)\). Since \(E_1\) is infinite, we can assume without loss of generality that \(E_2=E_1\cap B_{\eps_2}(w_1)\) contains infinitely many points of \(E_1\). Let \(x_2=w_1\) and therefore \(d(x_2,x_1) \lt \eps_1\). Since \(E_2\) is totally bounded there exists \(u_1,\ldots,u_{m_3}\in E_2\) such that \(E_2\subset\bigcup_{j=1}^{m_3} B_{\eps_3}(u_j)\). We can assume without loss of generality that \(E_3=E_2\cap B_{\eps_3}(u_1)\) contains infinitely many elements of \(E_2\). Let \(x_3=u_1\) and thus \(d(x_3,x_2) \lt \eps_2\). By induction, there exists a sequence \((x_n)\) in \(E\) such that \(d(x_{n+1}, x_n) \lt \frac{1}{2^n}\). Therefore, if \(m \gt n\) then by the triangle inequality (and the geometric series) we have \begin{align*} d(x_m, x_n) &\leq d(x_m, x_{m-1}) + \cdots + d(x_{n+1}, x_n)\\ & \lt \frac{1}{2^{m-1}} + \cdots + \frac{1}{2^n} \\ & \lt \frac{1}{2^{n-1}}. \end{align*} Therefore, \((x_n)\) is a Cauchy sequence and since \(M\) is complete \((x_n)\) converges in \(M\). Thus \(E\) has a limit point in \(M\). Conversely, assume that every infinite totally bounded subset of \(M\) has a limit point in \(M\). Let \((x_n)\) be a Cauchy sequence in \(M\) and let \(E=\{x_n\;|\; n\in \N\}\). For any given \(\eps \gt 0\) there exists \(K\in\N\) such that \(|x_n - x_K| \lt \eps\) for all \(n\geq K\). Therefore, \(x_n \in B_\eps(x_K)\) for all \(n\geq K\) and clearly \(x_j \in B_\eps(x_j)\) for all \(j=1,2,\ldots, K_1\). Thus, \(E\) is totally bounded. By assumption, \(E\) has a limit point, that is, there exists a subsequence of \((x_n)\) that converges in \(M\). By part (iii) of Lemma 9.2.6, \((x_n)\) converges in \(M\). Thus, \(M\) is a complete metric space.
The proof in Theorem 9.4.9 that completeness implies that every infinite totally bounded subset has a limit point is reminiscent of the bisection method proof that a bounded sequence in \(\real\) contains a convergent subsequence. Also, the proof showed the following.
If \(E\) is an infinite totally bounded subset of \((M,d)\) then \(E\) contains a Cauchy sequence \((x_n)\) such that \(x_n \neq x_m\) for all \(n\neq m\).
A complete normed vector space is usually referred to as a Banach space in honor of Polish mathematician Stefan Banach (1892-1945) who, in his 1920 doctorate dissertation, laid the foundations of these spaces and their applications in integral equations. An important example of a Banach space is the following. Let \(X\) be a non-empty set and let \(\mathcal{B}(X)\) be the set of bounded functions from \(X\) to \(\real\) with sup-norm \(\norm{f}_\infty = \sup_{x\in X} |f(x)|\). Then convergence in \((\mathcal{B}(X), \norm{\cdot}_\infty)\) is uniform convergence (Example 9.2.3). We have all the tools necessary to prove that \(\mathcal{B}(X)\) is a Banach space.
Let \(X\) be a non-empty set. The normed space \((\mathcal{B}(X), \norm{\cdot}_\infty)\) is a Banach space.
First of all, it is clear that \(\mathcal{B}(X)\) is a real vector space and thus we need only show it is complete. The proof is essentially contained in the Cauchy criterion for uniform convergence for functions on \(\real\) (Theorem 8.2.7). Let \(f_n:X\rightarrow\real\) be a Cauchy sequence of bounded functions. Then for any given \(\eps \gt 0\) there exists \(K\in\N\) such that if \(n,m\geq K\) then \(\norm{f_n-f_m}_\infty \lt \eps\). In particular, for any fixed \(x\in X\) it holds that \[ |f_n(x)-f_m(x)| \leq \norm{f_n-f_m}_\infty \lt \eps. \] Therefore, the sequence of real numbers \((f_n(x))\) is a Cauchy sequence and thus \(f(x) = \lim_{n\rightarrow\infty} f_n(x)\) exists for each \(x\in X\). Now, since \((f_n)\) is a Cauchy sequence in \(\mathcal{B}(X)\) then \((f_n)\) is bounded in \(\mathcal{B}(X)\). Thus, there exists \(M \gt 0\) such that \(\|f_n\|_\infty\leq M\) for all \(n\geq 1\). Thus, for all \(x\in X\) and \(n\geq 1\) it holds that \[ |f_n(x)| \leq \|f_n\|_\infty \leq M \] and by continuity of the absolute value function it holds that \[ |f(x)| = \lim_{n\rightarrow\infty} |f_n(x)| \leq M. \] Thus, \(f\) is a bounded function, that is, \(f\in\mathcal{B}(X)\). Now, for any fixed \(\eps \gt 0\), let \(K\in\N\) be such that \(|f_n(x)-f_m(x)| \lt \eps/2\) for all \(x\in X\) and \(n,m\geq K\). Therefore, for any \(x\in X\) we have that \begin{align*} \lim_{m\rightarrow\infty} |f_n(x)-f_m(x)| &= |f_n(x)-f(x)|\\ & \leq \eps/2. \end{align*} Therefore, \(\norm{f_n-f}_\infty \lt \eps\) for all \(n\geq K\). This proves that \((f_n)\) converges to \(f\) in \((\mathcal{B}(X),\norm{\cdot}_\infty)\).
The space of continuous functions \(C([a, b])\) on the interval \([a,b]\) with sup-norm is a Banach space.
A continuous function on the interval \([a,b]\) is bounded and thus \(C([a, b])\subset \mathcal{B}([a, b])\). Convergence in \(\mathcal{B}([a, b])\) with sup-norm is uniform convergence. A sequence of continuous functions that converges uniformly on \([a, b]\) does so to a continuous function. Hence, Theorem 9.2.13 implies that \(C([a, b])\) is a closed subset of the complete metric space \(\mathcal{B}([a, b])\) and then Theorem 9.4.2 finishes the proof.
Prove that \(\ell_\infty\) and \(\ell_1\) are complete and hence Banach spaces.
In a Banach space, convergence of series can be decided entirely from the convergence of real series.
Let \((V,\norm{\cdot})\) be a Banach space and let \((z_n)\) be a sequence in \(V\). If the real series \(\sum_{n=1}^\infty \norm{z_n}\) converges then the series \(\sum_{n=1}^\infty z_n\) converges in \(V\).
Suppose that \(\sum_{n=1}^\infty \norm{z_n}\) converges, that is, suppose that the sequence of partial sums \(t_n = \sum_{k=1}^n \norm{z_n}\) converges (note that \((t_n)\) is increasing). Then \((t_n)\) is a Cauchy sequence. Consider the sequence of partial sums \(s_n = \sum_{k=1}^n z_k\). For \(n \gt m\) we have \begin{align*} \norm{s_n-s_m} &= \norm{\sum_{k=m+1}^n z_n} \\ &\leq \sum_{k=m+1}^n \norm{z_n} \\ &= t_n - t_m \\ &= |t_n-t_m| \end{align*} and since \((t_n)\) is Cauchy then \(|t_n - t_m|\) can be made arbitrarily small provided \(n,m\) are sufficiently large. This proves that \((s_n)\) is a Cauchy sequence in \(V\) and therefore converges since \(V\) is complete.
We make two remarks. The converse of Theorem 9.4.14 is also true, that is, if every series \(\sum z_n\) in \((V,\norm{\cdot})\) converges whenever \(\sum \norm{z_n}\) converges in \(\real\) then \(V\) is a Banach space. Notice that the proof of Theorem 9.4.14 is essentially the same as the proof of the Weierstrass \(M\)-test.
Consider the set of matrices \(\real^{n\times n}\) equipped with the 2-norm \[ \norm{{A}}_2 = \left(\sum_{i,j=1}^n a_{i,j}^2\right)^{1/2} \] The norm \(\norm{{A}}_2\) is called the Frobenius norm or the Hilbert-Schmidt norm.
  1. Prove that \((\mat{n},\norm{\cdot}_2)\) is complete.
  2. Use the Cauchy-Schwarz inequality \[ \left(\sum_{\ell=1}^N x_\ell y_\ell\right)^2 \leq \left(\sum_{\ell=1}^N x_\ell^2\right) \left(\sum_{\ell=1}^N y_\ell^2\right) \] to prove that \(\norm{{A}{B}}_2 \leq \norm{{A}}_2\norm{{B}}_2\). Conclude that \(\norm{{A}^k}_2 \leq (\norm{{A}}_2)^k\) for all \(k\in\N\).
  3. Let \(f(x) = \sum_{k=1}^\infty c_k x^k\) be a power series converging on \(\real\). Define the function \(f:\mat{n}\rightarrow\mat{n}\) as \[ f({A}) = \sum_{k=1}^\infty c_k {A}^k. \] Prove that \(f\) is well-defined and that if \(c_k\geq 0\) then \(\norm{f({A})}_2 \leq f(\norm{{A}}_2)\), that is, that \[ \norm{\sum_{k=1}^\infty c_k {A}^k}_2 \leq \sum_{k=1}^\infty c_k\norm{{A}}_2^k \]
  1. The norm \(\norm{A}_2\) is simply the standard Euclidean norm on \((\real^{N},\norm{\cdot}_2)\) with \(N=n^2\) and identifying matrices as elements of \(\real^{N}\). Hence, \((\real^{n\times n},\norm{\cdot}_2)\) is complete.
  2. From the Cauchy-Schwarz inequality we have \begin{align*} ({A}{B})_{i,j}^2 &= \left(\sum_{\ell=1}^n a_{i,\ell}b_{\ell,j}\right)^2\\ & \leq \left(\sum_{\ell=1}^n a_{i,\ell}^2\right) \left(\sum_{\ell=1}^n b_{\ell,j}^2\right) \end{align*} and therefore \begin{align*} \norm{{A}{B}}_2 &= \left(\sum_{1\leq i,j\leq n} ({A}{B})_{i,j}^2\right)^{1/2}\\[2ex] &\leq \left(\sum_{1\leq i,j\leq n} \left(\sum_{\ell=1}^n a_{i,\ell}^2\right) \left(\sum_{\ell=1}^n b_{\ell,j}^2\right) \right)^{1/2}\\[2ex] &= \left(\sum_{i,\ell=1}^n a_{i,\ell}^2\right)^{1/2} \left(\sum_{j,\ell=1}^n b_{\ell,j}^2\right)^{1/2}\\[2ex] &= \norm{{A}}_2 \norm{{B}}_2 \end{align*}
  3. We first note that for any power series \(\sum_{k=1}^\infty a_k x^k\) that converges in \((-R,R)\), the power series \(\sum_{k=1}^\infty |a_k| x^k\) also converges in \((-R,R)\). The normed space \((\mat{n},\norm{\cdot}_2)\) is complete and thus \(\sum_{k=1}^\infty c_k {A}^k\) converges whenever \(\sum_{k=1}^\infty \norm{c_k {A}^k}_2\) converges. Now by part (b), \(\norm{c_k {A}^k}_2 = |c_k| \norm{{A}^k}_2\leq |c_k| \norm{{A}}_2^k\) and since \(\sum_{k=1}^\infty |c_k| \norm{{A}}_2^k\) converges then by the comparison test for series in \(\real\), the series \(\sum_{k=1}^\infty \norm{c_k {A}^k}_2\) converges. Therefore, \(f({A})\) is well-defined by Theorem 9.4.14. To prove the last inequality, we note that the norm function on a vector space is continuous and thus if \(c_k\geq 0\) then \begin{align*} \norm{\sum_{k=1}^m c_k{A}^k}_2 &\leq \sum_{k=1}^m |c_k| \norm{{A}}_2^k \\ &\leq \sum_{k=1}^m c_k \norm{{A}}_2^k \end{align*} and therefore \begin{align*} \norm{\sum_{k=1}^\infty c_k{A}^k}_2 &= \lim_{m\rightarrow\infty} \norm{\sum_{k=1}^m c_k{A}^k}_2\\ & \leq \lim_{m\rightarrow\infty} \sum_{k=1}^m c_k \norm{{A}}_2^k\\ & = \sum_{k=1}^\infty c_k \norm{{A}}_2^k, \end{align*} in other words, \(\norm{f({A})}_2 \leq f(\norm{{A}}_2)\).
In view of the previous example, we can define for \({A}\in\mat{n}\) the following: \begin{align*} e^{{A}} &= \sum_{k=0}^\infty \frac{1}{k!} {A}^k \\ \sin({A}) &= \sum_{k=0}^\infty \frac{(-1)^n}{(2n+1)!} {A}^{2n+1}\\ \cos({A}) &= \sum_{k=0}^\infty \frac{(-1)^n}{(2n)!} {A}^{2n}\\ \arctan({A}) &= \sum_{k=0}^\infty \frac{(-1)^n}{(2n+1)}{A}^{2n+1} \end{align*} and for instance \(\norm{e^{{A}}}_2 \leq e^{\norm{{A}}_2}\), etc.

Exercises

Let \((M_1,d_1)\) and \((M_2,d_2)\) be metric spaces. There are several ways to define a metric on the Cartesian product \(M_1\times M_2\). One way is to imitate what was done in \(\real^2=\real\times\real\). We can define \(d:M_1\times M_2\rightarrow [0,\infty)\) as \[ d((x,u), (y,v)) = \sqrt{ d_1(x,y)^2 + d_2(u,v)^2} \]
  1. Prove that \(d\) is a metric on \(M_1\times M_2\).
  2. Prove that \(((x_n,u_n))_{n=1}^\infty\) converges in \(M_1\times M_2\) if and only if \((x_n)_{n=1}^\infty\) and \((y_n)_{n=1}^\infty\) converge in \(M_1\) and \(M_2\), respectively. (Hint: Theorem 9.2.7)
  3. Prove that \(M_1\times M_2\) is complete if and only if \(M_1\) and \(M_2\) are complete. (Hint: Corollary 9.2.8)
Let \(M\) be a complete metric space and let \((z_n)\) be a sequence in \(M\) such that \(d(z_n, z_{n+1}) \lt r^n\) for all \(n\in \N\) for some fixed \(0 \lt r \lt 1\). Prove that \((z_n)\) converges. (See Exercise 3.6.6.)
Let \(\sum_{n=1}^\infty x_n\) be a convergent series in a normed vector space \((V,\norm{\cdot})\) and suppose that \(\sum_{n=1}^\infty \norm{x_n}\) converges. Show that \[ \left\| \sum_{n=1}^\infty x_n \right\| \leq \sum_{n=1}^\infty \|x_n\|. \] Note: The \(\triangle\)-inequality can only be used on a finite sum. (See Exercise 9.3.2.)
Consider the normed space \((\mathcal{B}(X), \|\cdot\|_\infty)\) where \(X\) is a non-empty set. Let \(\mathcal{K}(X)\) be the set of constant functions on \(X\). Prove that \((\mathcal{K}(X), \|\cdot\|_\infty)\) is a Banach space. (Hint: Theorem 9.4.2)

Compactness

Important results about continuous functions, such as the Extreme Value Theorem (Theorem 5.3.7) and uniform continuity (Theorem 5.4.7), depended heavily on the domain being a closed and bounded interval. On a bounded interval, any sequence \((x_n)\) contains a Cauchy subsequence \((x_{n_k})\) (use the bisection algorithm), and if the interval is also closed then we are guaranteed that the limit of \((x_{n_k})\) is contained in the interval. We have already seen that a totally bounded subset \(E\) of a metric space \(M\) contains a Cauchy sequence (Lemma 9.4.10) and thus if \(E\) is complete then Cauchy sequences converge in \(E\). This motivates the following definition.
A metric space \(M\) is called compact if \(M\) is totally bounded and complete.
A closed and bounded subset \(E\) of \(\real\) is compact. Indeed, \(E\) is complete because it is closed (Theorem 9.4.2) and it is easy to see how to cover \(E\) with a finite number of open intervals of any given radius \(\eps \gt 0\). Conversely, if \(E\subset\real\) is compact then \(E\) is bounded and \(E\) is closed by Theorem 9.4.2. A similar argument shows that \(E\subset\real^n\) is compact if and only if \(E\) is closed and bounded. This is called the Heine-Borel theorem.
A subset \(E\subset\real^n\) is compact if and only if \(E\) is closed and bounded.
The unit \(n\)-sphere \(\mathbb{S}^{n}\) in \(\real^{n+1}\) is the set \[ \mathbb{S}^{n} = \{ x=(x_1,x_2,\ldots,x_n, x_{n+1})\in\real^n \;|\; x_1^2+x_2^2+ \cdots + x_{n+1}^2 = 1\}. \] Explain why \(\mathbb{S}^{n}\) is compact subset of \(\real^{n+1}\).
Consider \(\real^{n\times n}\) with norm \(\norm{A}_2 = \left(\sum_{i,j} a_{i,j}^2\right)^{1/2}\). Then \(\real^{n\times n}\) is complete. A matrix \(Q\in\real^{n\times n}\) is called orthogonal if \(Q^T Q = I\), where \(Q^T\) denotes the transpose of \(Q\) and \(I\) is the identity matrix. Prove that the set of orthogonal matrices, which we denote by \(O(n)\), is compact.
A useful characterization of compactness is stated in the language of sequences.
A metric space \(M\) is compact if and only if every sequence in \(M\) has a convergent subsequence.
Assume that \(M\) is compact. If \((x_n)\) is a sequence in \(M\) then \(\{x_n\;|\; n\in\N\}\) is totally bounded and thus has a Cauchy subsequence which converges by completeness of \(M\). Conversely, assume that every sequence in \(M\) has a convergent subsequence. If \((x_n)\) is a Cauchy sequence then by assumption it has a convergent subsequence and thus \((x_n)\) converges. This proves \(M\) is complete. Suppose that \(M\) is not totally bounded. Then there exists \(\eps \gt 0\) such that \(M\) cannot be covered by a finite number of open balls of radius \(\eps \gt 0\). Hence, there exists \(x_1, x_2\in M\) such that \(d(x_1,x_2)\geq\eps\). By induction, suppose \(x_1,\ldots, x_k\) are such that \(d(x_i, x_j)\geq \eps\) for \(i\neq j\). Then there exists \(x_{k+1}\) such that \(d(x_i,x_{k+1})\geq \eps\) for all \(i=1,\ldots,k\). By induction then, there exists a sequence \((x_n)\) such that \(d(x_i,x_j)\geq \eps\) if \(i\neq j\). Clearly, \((x_n)\) is not a Cauchy sequence and therefore cannot have a convergent subsequence.
Let \(M\) be a metric space.
  1. Prove that if \(E\subset M\) is finite then \(E\) is compact.
  2. Is the same true if \(E\) is countable?
  3. What if \(M\) is compact?
We now describe how compact sets behave under continuous functions.
Let \(f:M_1\rightarrow M_2\) be a continuous mapping. If \(E\subset M_1\) is compact then \(f(E)\subset M_2\) is compact.
We use the sequential criterion for compactness. Let \(y_n=f(x_n)\) be a sequence in \(f(E)\). Since \(E\) is compact, by Theorem 9.5.5, there is a convergent subsequence \((x_{n_k})\) of \((x_n)\). By continuity of \(f\), the subsequence \(y_{n_k}=f(x_{n_k})\) of \((y_n)\) is convergent. Hence, \(f(E)\) is compact.
We now prove a generalization of the Extreme value theorem 5.3.7.
Let \((M,d)\) be a compact metric space. If \(f:M\rightarrow\real\) is continuous then \(f\) achieves a maximum and a minimum on \(M\), that is, there exists \(x^*, x_*\in M\) such that \(f(x_*)\leq f(x) \leq f(x^*)\) for all \(x\in M\). In particular, \(f\) is bounded.
By Theorem 9.5.7, \(f(M)\) is a compact subset of \(\real\) and thus \(f(M)\) is closed and bounded. Let \(y_*=\inf f(M)\) and let \(y^* = \sup f(M)\). By the properties of the supremum, there exists a sequence \((y_n)\) in \(f(M)\) converging to \(y^*\). Since \(f(M)\) is closed, then \(y^*\in f(M)\) and thus \(y^* = f(x^*)\) for some \(x^*\in M\). A similar argument shows that \(y_*=f(x_*)\) for some \(x_*\in M\). Hence, \(\inf f(M) = f(x_*)\leq f(x) \leq f(x^*) = \sup f(M)\) for all \(x\in M\).
Let \(M\) be a metric space and let \(E\subset M\). A cover of \(E\) is a collection \(\{U_i\}_{i\in I}\) of subsets of \(M\) whose union contains \(E\). The index set \(I\) may be countable or uncountable. The cover \(\{U_i\}_{i\in I}\) is called an open cover if each set \(U_i\) is open. A subcover of a cover \(\{U_i\}\) of \(E\) is a cover \(\{U_j\}_{j\in J}\) of \(E\) such that \(J\subset I\).
A metric space \(M\) is compact if and only if every open cover of \(M\) has a finite subcover.
Assume that \(M\) is compact. Then by Theorem 9.5.5, every sequence in \(M\) has a convergent subsequence. Let \(\{U_i\}\) be an open cover of \(M\). We claim there exists \(\eps \gt 0\) such that for each \(x\in M\) it holds that \(B_\eps(x)\subset U_i\) for some \(U_i\). If not, then then for each \(n\in \N\) there exists \(x_n\in M\) such that \(B_{1/n}(x_n)\) is not properly contained in a single set \(U_i\). By assumption, the sequence \((x_n)\) has a converging subsequence, say it is \((x_{n_k})\) and \(y=\lim x_{n_k}\). Hence, for each \(k\in \N\), \(B_{1/n_k}(x_{n_k})\) is not properly contained in a single \(U_i\). Now, \(y\in U_j\) for some \(j\), and thus since \(U_j\) is open there exists \(\delta \gt 0\) such that \(B_\delta(y)\subset U_j\). Since \((x_{n_k})\rightarrow y\), there exists \(K\) sufficiently large such that \(d(x_{n_K}, y) \lt \delta/2\) and \(\frac{1}{n_K} \lt \delta/2\). Then \(B_{1/n_K}(x_{n_K}) \subset U_j\) which is a contradiction. This proves that such an \(\eps \gt 0\) exists. Now since \(M\) is totally bounded, there exists \(z_1,z_2,\ldots,z_p\in M\) such that \(B_\eps(z_1)\cup\cdots\cup B_\eps(z_p) = M\), and since \(B_\eps(z_j) \subset U_{i_j}\) for some \(U_{i_j}\) it follows that \(\{U_{i_1},U_{i_2},\ldots,U_{i_p}\}\) is a finite subcover of \(\{U_i\}\). For the converse, we prove the contrapositive. Suppose then that \(M\) is not compact. Then by Theorem 9.5.5, there is a sequence \((x_n)\) in \(M\) with no convergent subsequence. In particular, there is a subsequence \((y_k)\) of \((x_n)\) such that all \(y_k\)'s are distinct and \((y_k)\) has no convergent subsequence. Then there exists \(\eps_i \gt 0\) such that \(B_{\eps_i}(y_i)\) contains only the point \(y_i\) from the sequence \((y_k)\), otherwise we can construct a subsequence of \((y_k)\) that converges. Hence, \(\{B_{\eps_k}(y_k)\}_{k\in \N}\) is an open cover of the set \(E=\{y_1,y_2,y_3,\ldots,\}\) that has no finite subcover. The set \(E\) is clearly closed since it consists entirely of isolated points of \(M\). Hence, \(\{B_{\eps_k}(y_k)\}_{k\in \N}\cup M\backslash\hspace{-0.3em} E\) is an open cover of \(M\) with no finite subcover.
A function \(f:(M_1,d_1)\rightarrow (M_2,d_2)\) is uniformly continuous if for any \(\eps \gt 0\) there exists \(\delta \gt 0\) such that if \(d_1(x,y) \lt \delta\) then \(d_2(f(x),f(y)) \lt \eps\).
A function \(f:(M_1,d_1)\rightarrow (M_2,d_2)\) is Lipschitz if there is a constant \(K \gt 0\) such that \(d_2(f(x),f(y))\leq K d(x,y)\). Show that a Lipschitz map is uniformly continuous.
If \(f:M_1\rightarrow M_2\) is uniformly continuous and \((x_n)\) is a Cauchy sequence in \(M_1\), prove that \((f(x_n))\) is a Cauchy sequence in \(M_2\).
If \(f:(M_1,d_1)\rightarrow (M_2,d_2)\) is continuous and \(M_1\) is compact then \(f\) is uniformly continuous.
Let \(\eps \gt 0\). For each \(x\in M_1\), there exists \(r_x \gt 0\) such that if \(y\in B_{r_x}(x)\) then \(f(y) \in B_{\eps/2}(f(x))\). Now \(\{B_{r_x/2}(x)\}_{x\in M_1}\) is an open cover of \(M_1\) and since \(M_1\) is compact there exists finite \(x_1,x_2,\ldots,x_N\) such that \(\{B_{\delta_i}(x_i)\}_{i=1}^N\) is an open cover of \(M_1\), where we have set \(\delta_i = r_{x_i}/2\). Let \(\delta=\min\{\delta_1,\ldots,\delta_N\}\). If \(d_1(x,y) \lt \delta\), and say \(x\in B_{\delta_i}(x_i)\), then \(d_1(y,x_i)\leq d_1(y,x) + d_1(x,x_i) \lt \delta + \delta_i \lt r_{x_i}\) and thus \begin{align*} d_2(f(x),f(y)) &\leq d_2(f(x),f(x_i)) + d_2(f(x_i),f(y))\\ & \lt \eps/2 + \eps/2 \\ &= \eps. \end{align*} This proves that \(f\) is uniformly continuous.

Exercises

Prove that if \(E\subset\real\) is compact then \(\sup(E)\) and \(\inf(E)\) are elements of \(E\).
Recall that \(\ell_\infty\) is the set of sequences in \(\real\) that are bounded and equipped with the norm \(\|(x_n)\|_\infty = \sup_{n\in\mathbb{N}} |x_n|\). Show that the unit ball \(B = \{(x_n)\,:\, \|(x_n)\|_\infty \leq 1\}\) (which is clearly bounded) is not compact in \(\ell_\infty\). (see Example 9.4.7)
Let \((x_n)\) be a sequence in a metric space \(M\) and suppose that \((x_n)\) converges to \(p\). Prove that \(S = \{p\}\cup \{x_n\,|\, n\in\mathbb{N}\}\) is a compact subset of \(M\).
Prove that if \(M\) is compact then there exists a countable subset \(E\subset M\) that is dense in \(M\).
Let \(E\) be a compact subset of \(M\) and fix \(p\in M\). Prove that there exists \(z\in E\) such that \(d(z, p)\leq d(x, p)\) for all \(x\in E\).

Fourier Series

Motivated by problems involving the conduction of heat in solids and the motion of waves, a major problem that spurred the development of modern analysis (and mathematics in general) was whether an arbitrary function \(f\) can be represented by a series of the form \[ \frac{a_0}{2} + \sum_{n=1}^\infty(a_n\cos(nx) + b_n \sin(nx)) \] for appropriately chosen coefficients \(a_n\) and \(b_n\). A central character in this development was mathematician and physicist Joseph Fourier and for this reason such a series is now known as a Fourier series. Fourier made the bold claim (Theorie analytique de la Chaleur, 1822) that there is no function \(f(x)\) or part of a function, which cannot be expressed by a trigonometric series. Fourier's claim led B. Riemann (1854) to develop what we now call the Riemann integral. After Riemann, Cantor's (1872) interest in trigonometric series led him to the investigation of the derived set of a set \(S\) (which nowadays we call the limit points of \(S\)) and he subsequently developed set theory. The general problem of convergence of a Fourier series led to the realization that by allowing arbitrary functions into the picture the theory of integration developed by Riemann would have to be extended to widened the class of integrable functions. This extension of the Riemann integral was done by Henri Lebesgue (1902) and spurred the development of the theory of measure and integration. The Lebesgue integral is widely accepted as the official integral of modern analysis. In this section, our aim is to present a brief account of Fourier series with the tools that we have already developed. To begin, suppose that \(f:[-\pi,\pi]\rightarrow\real\) is Riemann integrable and can be represented by a Fourier series, that is, \begin{equation}\label{eqn:fourier-series} f(x) = \frac{a_0}{2} + \sum_{n=1}^\infty(a_n\cos(nx) + b_n \sin(nx)) \end{equation} for \(x\in[-\pi,\pi]\). In other words, the series on the right of \eqref{eqn:fourier-series} converges pointwise to \(f\) on \([-\pi,\pi]\). The first question we need to answer is what are the coefficients \(a_n\) and \(b_n\) in terms of \(f\)? To that end, we use the following facts. Let \(n,m\in\N\):
  1. For all \(n\): \[ \int_{-\pi}^{\pi} \sin(nx)dx = \int_{-\pi}^{\pi} \cos(nx)dx = 0. \]
  2. If \(n\neq m\) then \[ \int_{-\pi}^{\pi} \sin(nx)\sin(mx)dx = \int_{-\pi}^{\pi} \cos(nx)\cos(mx)dx = 0. \]
  3. For all \(n\) and \(m\): \[ \int_{-\pi}^{\pi} \sin(nx)\cos(mx)dx = 0. \]
  4. For all \(n\) and \(m\): \[ \int_{-\pi}^{\pi} \sin^2(nx)dx = \int_{-\pi}^{\pi} \cos^2(nx)dx = \pi. \]
Then, using these facts, and momentarily ignoring the interchange of the integral and infinite sum, \begin{align*} \int_{-\pi}^\pi f(x) \cos(nx)dx &= \int_{-\pi}^\pi \Big[ \frac{a_0}{2}\cos(nx)\\ &  + \sum_{k=1}^\infty( a_k \cos(kx)\cos(nx) + b_k \sin(kx)\cos(nx))\Big]dx \\[2ex] &= \frac{a_0}{2}\int_{-\pi}^\pi\cos(nx)dx + \sum_{k=1}^\infty a_k \int_{-\pi}^\pi \cos(kx)\cos(nx)dx \\ &\qquad+ \sum_{k=1}^\infty b_k \int_{-\pi}^\pi\sin(kx)\cos(nx)dx\\ &= a_n \pi. \end{align*} Therefore, \[ a_n = \frac{1}{\pi} \int_{-\pi}^\pi f(x)\cos(nx)dx. \] A similar calculation shows that \[ b_n = \frac{1}{\pi} \int_{-\pi}^\pi f(x)\sin(nx)dx. \] Finally, \begin{align*} \int_{-\pi}^\pi f(x)dx &= \frac{a_0}{2}\int_{-\pi}^\pi dx + \sum_{k=1}^\infty a_k \int_{-\pi}^\pi \cos(kx)dx + \sum_{k=1}^\infty b_k \int_{-\pi}^\pi\sin(kx)dx\\ & = \frac{a_0}{2}2\pi\\ & = a_0 \pi \end{align*} and therefore \[ a_0 = \frac{1}{\pi} \int_{-\pi}^\pi f(x) dx. \] Of course, the above calculations are valid provided that the Fourier series converges uniformly to \(f\) on \([-\pi,\pi]\) since if all we have is pointwise convergence then in general we cannot interchange the integral sign and the infinite sum. Since the functions \(f_n(x)=a_n\cos(nx)+b_n\sin(nx)\) in the Fourier series are clearly continuous, and if we insist that the convergence is uniform, then we have restricted our investigation of Fourier series to continuous functions! Relaxing this restriction led to the development of what we now call the Lebesgue integral; Lebesgue was interested in extending the notion of integration beyond Riemann's development so that a wider class of functions could be integrated and, more importantly, this new integral would be more robust when it came to exchanging limits with integration, i.e., uniform convergence would not be needed. A full development of Lebesgue's theory of integration is beyond the scope of this book, however, we can still say some interesting things about Fourier series. Motivated by our calculations above, suppose that \(f\in C[-\pi,\pi]\) and define \begin{align*} a_0 &= \frac{1}{\pi} \int_{-\pi}^\pi f(x) dx\\ a_n &= \frac{1}{\pi} \int_{-\pi}^\pi f(x)\cos(nx)dx\\ b_n &= \frac{1}{\pi} \int_{-\pi}^\pi f(x)\sin(nx)dx. \end{align*} Assume that the Fourier series of \(f\) converges uniformly on \(C[-\pi,\pi]\) and let \[ g(x) = \frac{a_0}{2} + \sum_{n=1}^\infty(a_n\cos(nx) + b_n \sin(nx)). \] Then \(g\) is continuous on \([-\pi,\pi]\). Does \(f=g\)? To answer this question, our computations above show that \[ \frac{1}{\pi} \int_{-\pi}^\pi g(x)\cos(nx) dx = \frac{1}{\pi} \int_{-\pi}^\pi f(x)\cos(nx) dx \] and therefore \[ \int_{-\pi}^\pi [f(x)-g(x)]\cos(nx)dx = 0 \] for all \(n\in\N\cup\{0\}\). Similarly, for all \(n\in\N\cup\{0\}\) we have \[ \int_{-\pi}^\pi [f(x)-g(x)]\sin(nx)dx = 0. \] Let \(s_n(x) = \frac{a_0}{2} + \sum_{k=1}^n a_k\cos(kx)+b_k\sin(kx)\) and recall that \((s_n)\) converges uniformly to \(g\). Consider for the moment \[ \int_{-\pi}^{\pi} [f(x)-s_n(x)]^2dx = \int_{-\pi}^\pi f(x)^2 dx - 2 \int_{-\pi}^{\pi} f(x) s_n(x) dx + \int_{-\pi}^{\pi} s_n^2(x)dx. \] A straightforward computation shows that \[ \int_{-\pi}^{\pi} f(x) s_n(x) dx = \pi \left[\frac{a_0^2}{2} + \sum_{k=1}^n (a_k^2 + b_k^2)\right] \] and \[ \int_{-\pi}^{\pi} s_n^2(x)dx = \pi \left[\frac{a_0^2}{2} + \sum_{k=1}^n (a_k^2 + b_k^2)\right]. \] Therefore, \[ \frac{1}{\pi}\int_{-\pi}^{\pi} [f(x)-s_n(x)]^2dx = \frac{1}{\pi}\int_{-\pi}^\pi f(x)^2 dx - \frac{1}{\pi}\int_{-\pi}^{\pi} s_n^2(x)dx \] Now since \(\int_{-\pi}^{\pi} [f(x)-s_n(x)]^2dx \geq 0\) it follows that \[ \frac{1}{\pi}\int_{-\pi}^{\pi} s_n^2(x)dx \leq \frac{1}{\pi}\int_{-\pi}^\pi f(x)^2 dx, \] or equivalently that \[ \frac{a_0^2}{2} + \sum_{k=1}^n (a_k^2 + b_k^2) \leq \frac{1}{\pi}\int_{-\pi}^\pi f(x)^2 dx. \] This proves that the series \(\frac{a_0^2}{2} + \sum_{k=1}^\infty (a_k^2 + b_k^2)\) converges and \[ \frac{a_0^2}{2} + \sum_{k=1}^\infty (a_k^2 + b_k^2) \leq \frac{1}{\pi}\int_{-\pi}^\pi f(x)^2 dx. \] In particular, \(\lim_{k\rightarrow\infty} (a_k^2+b_k^2) = 0\) and thus \(\lim_{k\rightarrow\infty} a_k = \lim_{k\rightarrow\infty} b_k = 0\).