The Number Theory and Topics in Analysis courses complement each other surprisingly well on the topic of continued fractions. In Number Theory, they were introduced ultimately as a means to an end - a technique for solving Pell's equation, and a prime factorisation algorithm. The TinA approach was vastly more contemplative, presenting continued fractions as a (slightly feeble) attempt to understand some aspects of the nature of real numbers.
The rational numbers are dense in the real numbers. Yet there are lots more real numbers than rational numbers. In order to make any sense at all out of this, we need to be less careless. One possible approach is to quantify the
cost or
effectiveness of a given rational approximation, by measuring the size of the denominator you have to resort to in order to make it happen (compare this to the degree of the polynomials used to uniformly approximate continuous functions).
The results of this idea are ultimately quite surprising. We obtain a hierarchy of types of real number, based on how costly they are to approximate. The real numbers that are hardest to approximate are what we intuitively consider to be the most regular class of reals - the rational numbers themselves. What?
|
Class of real numbers |
Measure (per unit) |
Cardinality |
Category |
Cost of approximating |
Algebraic numbers |
Rationals |
0 |
$\aleph_0$ |
Meagre |
Disastrous |
Irrational algebraics |
0 |
$\aleph_0$ |
Meagre |
Meh |
Transcendental numbers |
Diophantine transcendentals |
1 |
$\mathfrak{c}$ |
Meagre |
Non-Diophantine transcendentals |
0 |
$\mathfrak{c}$ |
Residual |
Good |
The aim of this post is to establish the above results, as well as providing a more general panorama of continued fractions, their properties, and their applications. I will include results and concepts from both courses, so hopefully anybody reading this will enjoy and benefit from a slightly expanded perspective.
Section 0: A word on decimal expansions
We will show later that continued fractions correspond bijectively with real numbers. So, in some sense, they are an alternative way of representing reals, as opposed to the more familiar decimal expansion. The analogy pretty much stops there - as far as I can tell, there is little to gain in comparing the two systems. Still, I want to seize this opportunity to consider formally a couple of properties of decimals that are usually taken for granted, as we will briefly explore some analagous results for continued fractions.
Proposition 0.1 A real number with a repeating decimal expansion is a rational number.
Proof: Let $\alpha = d.d_1d_2d_3\dots d_m\overline{d_{m+1}\dots d_{m+k}}$. Then $10^m\alpha = N + 0.\overline{d_{m+1}\dots d_{m+k}}$, and $10^{m+k}\alpha=M + 0.\overline{d_{m+1}\dots d_{m+k}}$ for some integers $N$, $M$. Thus, $10^{m+k}\alpha - 10^m\alpha = K$ for some integer $K$, giving $\alpha = \frac{K}{10^{m+k}-10^k}$, and so $\alpha \in \mathbb{Q}$. $\square$ (
proof found here).
Proposition 0.2 A rational number has a repeating decimal expansion.
Proof: Consider performing the calculation $\frac{p}{q}$ by long division. At each step, the remainder will be in $\{0,1,2,\ldots,q-1\}$. By the pigeonhole principle, a remainder must repeat after $q+1$ steps. But a repeated remainder forces the calculation into a loop. $\square$
Thus, real numbers with repeating decimal expansions correspond to a particular subclass of numbers - the rationals. We will later remark that real numbers with repeating continued fractions also correspond to a interesting subclass - quadratic algebraic numbers.
Section 1: Diophantine approximation
In this section, we consider the problem of finding a rational number close to an arbitrary real number $\alpha$.
As explained in the introduction, we are mainly interested in obtaining lower bounds for the cost. But no lower bound is complete without an upper bound. The following theorem is the much-needed reality check that I'm very surprised didn't come up in TinA - for any real number, the cost is never going to get pathologically bad.
Theorem 1.1 (Dirichlet's theorem) Let $\alpha$ be a real number. For every $N \in \mathbb{N}$ there exists a rational number $\frac{a}{q}$ with $1 \leq q < N$ and $\left| \alpha - \frac{a}{q}\right| \leq \frac{1}{qN}$.
Proof: Consider the fractional part of the $N+1$ distinct numbers $k\theta$, for $0 \leq k \leq N$. Split the unit interval into the $N$ disjoint intervals of length $\frac{1}{N}$ given by $[\frac{j}{N},\frac{j+1}{N}]$, for $0 \leq j < N$.
By the pigeonhole principle, there must be an inteval containing two of the $k\theta$, say $k_1\theta$ and $k_2\theta$, $k_2>k_1$. But then for $q=k_2-k_1$, $|q \theta - a | \leq \frac{1}{N}$ for some integer $a$. By construction, $1\leq q \leq N$. $\square$
Still, let's not get too optimistic about this. With the obvious caveat that any fraction "approximates" its own equivalence class perfectly, rational numbers are awful at approximating other rationals.
Proposition 1.2 If $\alpha = \frac{a}{b} \neq \frac{p}{q}$, then $\left|\alpha - \frac{p}{q}\right| \geq \frac{1}{bq}$.
Proof: $\left|\frac{a}{b}-\frac{p}{q}\right| = \left|\frac{aq-bp}{bq}\right| \geq \frac{1}{bq}$ $\square$
Algebraic numbers are slightly better, but still fairly poor.
Theorem 1.3 (Liouville's theorem) Let $\alpha$ be algebraic. Then $\exists c>0$ (depending on $\alpha$) such that $\left|\alpha - \frac{p}{q}\right| \geq \frac{c}{q^n}$, for every rational $\frac{p}{q}$, where $n$ is the degree of the minimal polynomial over $\mathbb{Z}$.
Proof: In lectures we only assumed that $f$ is an arbitrary polynomial with $\alpha$ as a root. However, using the minimal polynomial gives a better lower bound. Pick $R$ sufficiently large that all of the (real) roots of $f$ are contained inside $[-R+1,R-1]$. $f^\prime$ is continuous on $[-R,R]$, and so bounded, by $M$, say. For every rational $\frac{p}{q} \in [-R,R]$, if $f(\frac{p}{q}) \neq 0$, then $\frac{1}{q^n} \leq \left|f(\frac{p}{q})\right|= \left|f(\frac{p}{q})-f(\alpha)\right| \leq M \left| \frac{p}{q} -\alpha\right|$, by the mean value theorem.
There are only many finitely roots, so $\exists c_0$ such that $|\frac{p}{q}-\alpha| \geq c_0 \geq \frac{c_0}{q^n}$ for all rational roots $\frac{p}{q}$ of $f$ (which we secretly know have to be integers anyway).
Finally, if $\left|\frac{p}{q}\right| \geq R$, then $\left|\frac{p}{q} - \alpha \right| \geq 1 \geq \frac{1}{q^n}$.
So, by taking $c = \min\{\frac{1}{M},1,c_0\}$, we have found the required constant. $\square$
Consider now a new class of real numbers, the diophantine numbers, which we didn't discuss directly in lectures. In order to motivate the definition, recall
Roth's theorem, which was stated (incorrectly!) in lectures as an improvement on Liouville's theorem.
Theorem 1.4 (Thue-Siegel-Roth theorem) Let $\alpha$ be algebraic and irrational. Then for all $\varepsilon > 0$, $\exists c > 0$ such that $\left| \alpha - \frac{p}{q}\right| \geq \frac{c}{q^{2+\varepsilon}}$ for any rational number $\frac{p}{q}$.
(In lectures, this was stated as $\forall \varepsilon$, $\exists c$ such that $\forall \alpha \ldots$, which is incorrect - the constant $c$ depends on $\alpha$.)
Definition. Say that $\alpha \in \mathbb{R}$ is Diophantine if $\forall \varepsilon > 0$, $\exists c >0$ such that $\left| \alpha - \frac{p}{q}\right| \geq \frac{c}{q^{2+\varepsilon}}$ for any rational number $\frac{p}{q}$.
In other words, Theorem 1.4 says that every algebraic number is Diophantine. (Note: there may be other names for this class of numbers - they seem to be discussed very little on wiki. My source for this name is
Terry Tao's blog. Not to be confused with Diophantine sets, which
are discussed on wiki, and are related to recursively enumerable sets.)
Theorem 1.5 The set of Diophantine numbers has full measure, but is meagre.
Proof: This is actually an excellent exercice in both measure theory and linear analysis, so I won't reproduce the proof here, only
link to a good exposition of it. $\square$
These results already suffice to establish the table presented in the introduction. Interestingly, we have shown that the set of numbers that you can approximate very well with rationals is small in the sense of measure, but large in the sense of the Baire Category Theorem.
We have also set the stage for a technique for proving transcendentality - if you can prove that a given number violates the Liouville lower bound, or the stronger Roth lower bound, you can deduce that it's transcendental. An example of this is the Liouville number given in lectures, or more generally the uncountable set of Liouville numbers constructed in Example Sheet 4:
Proposition 1.6 For any sequence $a_n$ taking values in $\{1,2\}$, $\sigma = \sum \frac{a_n}{10^{n!}}$ is transcendental.
Open question (that I want to come back to, but haven't had time to think about yet) - are the above Liouville numbers Diophantine?
Section 2: Continued fractions
We will define continued fractions using the continued fraction algorithm, define the convergent sequences $p_n$ and $q_n$ independently, and then proceed to derive the link between the two. This presentation is backwards compared to the one given in TinA, but it avoids all of the tautologies that everyone was complaining about.
Given a real number $\theta$, we obtain a sequence $a_0,a_1,\ldots$ recursively by applying the following algorithm.
Set $a_0=\lfloor \theta \rfloor$.
Define $\alpha_1 = \frac{1}{\theta - a_0}$ (so $\theta = a_0 + \frac{1}{\alpha_1}$).
Define $a_n=\lfloor \alpha_n \rfloor$, and set $\alpha_{n+1}=\frac{1}{\alpha_n - a_n}$.
If we ever encounter $a_n=\alpha_n$, we terminate the process.
We write $[a_0;a_1,a_2,\ldots]$ for the continued fraction with value $\alpha$. The partial continued fractions $[a_0]$, $[a_0;a_1]$, $[a_0;a_1,a_2]$, $\ldots$ are called the convergents of $\alpha$.
By construction, $[a_0;a_1] = a_0 + \frac{1}{a_1}$, $[a_0;a_1,a_2]= \displaystyle a_0 \frac{1}{a_1 + \frac{1}{a_2}}$, etc, so we recover the characteristic tower form of the continued fraction.
Note that $a_n \in \mathbb{N}$, except for $a_0$, which is in $\mathbb{Z}$.
Proposition 2.1 The continued fraction algorithm terminates $\Leftrightarrow$ $\theta$ is rational.
Proof: If the algorithm terminates, then $\theta=[a_0;a_1,\ldots,a_n]$ for some $a_n$. By multiplying out the denominators in the tower form, it is obvious that $\theta$ is rational.
If $\theta$ is rational, then the $\alpha_i$ constructed in the algorithm are also rational. But by construction, their denominators are strictly decreasing, so the process must terminate. $\square$
Obviously, the first thing we want to show is that the convergents converge to $\alpha$ under the Euclidean metric. However, as things currently stand, the partial continued fractions are unwieldy objects. We introduce two intermediate sequences which are easier to manipulate, and which we will show contain all of the information given by the convergents.
Define the sequences:
$p_0 = a_0$ |
$p_1=a_0a_1+1$ |
$p_n=a_np_{n-1}+p_{n-2}$ |
$q_0 = 1$ |
$q_1=a_1$ |
$q_n=a_nq_{n-1}+q_{n-2}$ |
[Note to self: investigate link with Euclidean division algorithm.]
Here are some elementary properties of this sequence, which we will use later on.
Proposition 2.2
- For each $n \geq 1$, $p_nq_{n-1}-p_{n-1}q_n=(-1)^{n+1}$.
- $p_n$ and $q_n$ are coprime.
Proof: For the first statement, we induct on $n$.
When $n=1$, $p_1q_0-p_0q_1=a_0a_1+1-a_0a_1=1$.
Suppose the relation holds for $n$, where $n \geq 1$.
Then $p_{n+1}q_n-p_nq_{n+1}=(a_{n+1}p_n+p_{n-1}q_n)q_n - p_n(a_{n+1}q_n+q_{n-1}) = p_{n-1}q_n-p_nq_{n-1}=(-1)^{n+2}$.
The second statement follows directly from the first (think of Bezout). $\square$
Proposition 2.3 $\frac{p_n}{q_n}=[a_0;a_1,\ldots,a_n]$ in its lowest form.
Proof: Let us relax our notation slightly to allow $\alpha_n$, the last term of $[a_0;a_1,\ldots,\alpha_n]$, to be an arbitrary (positive) real number. Under this notation, we can write $[a_0;a_1,\ldots,a_n,\alpha_{n+1}]=[a_0;a_1,\ldots,a_n+\frac{1}{\alpha_{n+1}}]$. This will hopefully make the calculations slightly clearer.
Observe that the claim is true for $n=0,1,2$ (this involves multiplying out the tower form of $[a_0;a_1,\alpha_2]$, and is absolutely trivial).
Assume that the claim is true for $n$.
$$\displaystyle \frac{p_{n+1}}{q_{n+1}} = \frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}} = \frac{p_n + \frac{1}{\alpha_{n+1}}p_{n-1}}{q_n+\frac{1}{\alpha_{n+1}}q_{n-1}}= \frac{(a_np_{n-1}+p_{n-2})+\frac{p_{n-1}}{\alpha_{n+1}}}{(a_nq_{n-1}+q_{n-2})+\frac{q_{n-1}}{\alpha_{n+1}}} = \frac{\left(a_n+\frac{1}{\alpha_{n+1}}\right)p_{n-1}+p_{n-2}}{\left(a_n+\frac{1}{\alpha_{n+1}}\right)q_{n-1}+q_{n-2}}=[a_0;a_1,\ldots,a_n+\frac{1}{\alpha_{n+1}}]=[a_0;a_1,\ldots,a_n,\alpha_{n+1}]$$
We used the induction hypothesis in the second-last equality.
We have already shown above that $p_n$, $q_n$ are coprime, so the fraction is necessarily in its lowest form. $\square$
We now know enough to go ahead and prove that continued fractions converge to their value.
Proposition 2.4 If $[a_0;a_1,a_2,\ldots]$ is the continued fraction obtained from $\theta$ using the division algorithm, then $[a_0;a_1,\ldots,a_n] \to \theta$ as $n \to \infty$.
Proof: Using the relaxed notation from the above proposition, we can write $\theta = [a_0;a_1,\ldots,a_n,\alpha_{n+1}]$, which we know is equal to $\frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}}$. So:
$$\left| \theta - \frac{p_n}{q_n}\right| = \left|\frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}}-\frac{p_n}{q_n}\right| = \frac{|p_nq_{n-1}-p_{n-1}q_n|}{q_n(\alpha_{n+1}q_n+q_{n-1})}=\frac{1}{q_n(\alpha_{n+1}q_n+q_{n-1})}$$
But by considering the recurrence relation, we see that $q_n$ is a strictly increasing sequence of positive integers. The result follows. $\square$
We have now laid the groundwork for our study of Diophantine approximation via continued fractions. We are not just interested in the fact that continued fractions converge to their value, but also in some intriguing features of their convergence. We will show that continued fractions provide the most cost-effective way to converge to any real, and, remarkably, that the continued fraction algorithm necessarily produces
every good approximation - in other words, if a given approximation is particularly cheap, it will naturally arise as one of the convergents defined above.
Firstly, we show that, considered absolutely, the convergence is monotone - every convergent gives a better approximation than the last.
Lemma 2.5 For each $n$, $\frac{1}{q_{n+2}} \leq | q_n \theta - p_n | \leq \frac{1}{q_{n+1}}$. This immediately implies that $|q_n\theta - p_n | \leq |q_{n-1} \theta - p_{n-1}|$, which further implies that $|\theta - \frac{p_n}{q_n}| \leq |\theta - \frac{p_{n-1}}{q_{n-1}}|$.
Proof: The key observation is that $a_{n+1} \leq \alpha_{n+1} \leq a_{n+1} + 1$. This gives $q_{n+1} = a_{n+1}q_n + q_{n-1} \leq \alpha_{n+1}q_n + q_{n-1} \leq (a_{n+1}+1)q_n+q_{n-1} = q_{n+1}+q_n \leq a_{n+2}q_{n+1}+q_n = q_{n+2}$.
Recalling that $|q_n \theta - p_n | = \frac{1}{\alpha_{n+1}q_n+q_{n-1}}$, from the proof of the previous proposition, we are done. $\square$
However, the partial continued fractions oscillate around their limit $\theta$, via the following observation.
Lemma 2.6 For each $n$, $\theta - \frac{p_n}{q_n}$ and $\theta - \frac{p_{n-1}}{q_{n-1}}$ have opposite sign.
Proof: Go back to the proof of the above lemma, and observe that without the mod signs, we obtain $\theta - \frac{p_n}{q_n} = \frac{(-1)^n}{q_n(\alpha_{n+1}q_n+q_{n-1})}$. $\square$
As promised, we now prove that the convergents are the best approximants for a fixed cost.
Theorem 2.7 For any integers $p$ and $q$ such that $0 < q < q_n$, $|q\theta - p| \geq |q_n\theta - p_n|$.
Proof: Let $\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} q_{n-1} & p_{n-1} \\ q_n & p_n \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix}$. Note that this matrix has determinant $\pm 1$, so is invertible (its inverse has integer coefficients). Therefore, we can write $p = up_{n-1}+vp_n$ and $q = uq_{n-1}+vq_n$ for some integers $u,v$.
So, $|q\theta - p| = |uq_{n-1} \theta + v q_n \theta - u p_{n-1} - vp_n | = |u (q_{n-1} \theta - p_{n-1}) + v(q_n \theta - p_n)|$.
However, we assumed that $0 < q < q_n$, so $0 < uq_{n-1} + vq_n < q_n$. This forces $u$ and $v$ to have opposite signs (if $v=0$ then $\frac{p}{q}=\frac{p_n}{q_n}$). Since $q_{n-1}\theta - p_{n-1}$ and $q_n\theta - p_n$ also have opposite signs by the above lemma, we can write:
$|q\theta - p| = |u(q_{n-1}\theta-p_{n-1})|+|v(q_n\theta - p_n)| \geq |q_{n-1}\theta - p_{n-1}|$. $\square$
And finally, if any approximation is particularly cost-efficient, it must in fact be a convergent.
Corollary 2.8 For any integer $p$ and positive integer $q$, if $|\theta - \frac{p}{q}| < \frac{1}{2q^2}$, then $\frac{p}{q}$ is one of the convergents of $\theta$.
Proof: Assume WLOG that $p$ and $q$ are coprime. Pick $n$ such that $q_n \leq q < q_{n+1}$ (recall the $q_n$ are a strictly increasing sequence of positive integers, and $q_0=1$, so we can do this).
Then $\left| \frac{p}{q} -\frac{p_n}{q_n}\right| \leq \left| \theta - \frac{p}{q}\right| + \left| \theta - \frac{p_n}{q_n}\right| < \frac{1}{2q^2} + \frac{1}{q_n}|q\theta - p| < \frac{1}{2qq_n}+\frac{1}{q_n}\frac{1}{2q} \leq \frac{1}{qq_n}$.
So, $\left|\frac{p}{q}-\frac{p_n}{q_n}\right| = \frac{|pq_n-p_nq|}{qq_n} < \frac{1}{qq_n}$, so $|pq_n-p_nq| < 1$.
We conclude that $pq_n-p_nq=0$, so $\frac{p}{q}=\frac{p_n}{q_n}$. $\square$
The whole point of this section was to show that continued fractions do indeed provide insight about Diophantine approximation. The above result achieves this pretty spectacularly.
Section 3: Examples and applications
In general, computing the continued fraction of an arbitrary real is not very interesting - it is simply a matter of transforming the information we possess about that real (usually in the form of a decimal expansion) into the language of continued fractions.
If our chosen real number is rational, the algorithm terminates - which is hardly interestingly. Conversely, we have shown that if our real number is irrational it doesn't. We might therefore naively expect to have to do an infinite amount of work in order to achieve the full expansion. Tripos questions can be nasty, but hopefully not that nasty (even when PTJ is writing the questions).
However, in lectures, we saw the tantalising example of $\sqrt{2}$. Very quickly, the algorithm developped a loop, and so we could observe that the continued fraction expansion must necessarily repeat. On Example Sheet 4, we are asked to calculate the full expansion of $\sqrt{3}$, and so we might hope that something similiar is going to occur.
This generalises. There is a full characterisation of numbers with repeating continued fractions (much like section 0 gives a full characterisation of numbers with repeating decimal expansion), given by the following theorem.
A quadratic irrational is a root of a quadratic equation with integer coefficients. In the Number Fields course, we show that any such number must be of the form $a+b\sqrt{N}$ for some (square-free) integer $N$, and rationals $a$ and $b$.
Theorem 3.1 (Lagrange) For any real number $\theta$, $\theta$ is a quadratic irrational $\Leftrightarrow$ its continued fraction is (eventually) periodic.
For square roots, we can say more. The following theorem gives an excellent error-detecting failsafe for those awkward times when you find yourself needing to calculate the continued fraction of a square root in real time at the supermarket check-out.
Theorem 3.2 Let $N$ be a natural number that is not a square. Then its continued fraction is of the form $[a_0;\overline{a_1,a_2,\ldots,a_2,a_1,2a_0}]$. Thus, it is both periodic and symmetric, and each period ends on twice the integer part.
The proofs of these theorems were not given in my course. However, I have an idea which might suffice to prove a watered-down version of 3.2. Watch this space.
This checks out with the excruciatingly simple example given in TinA: $\sqrt{2} = [1;\overline{2}]$. For the eager beavers out there, you can check that this is coherent with your calculations for $\sqrt{3}$. (On a Number Theory example sheet, we had to calculate $\sqrt{46}$, which is 12-periodic, and thus somewhat sadistic. Side-note: to find a higher period you have to go up to $\sqrt{94}$. Coincidence? I think not.)
One excellent application for which continued fractions are crucial is Pell's equation.
Let $N$ be a natural number that is not a square. Pell's equation is given by:
$$x^2-Ny^2=1$$
Pell's equation is useful for example in Number Fields, where solutions correspond to units of a given real quadratic field extension (the norm of $x+y \sqrt N$ is defined to be $x^2-Ny^2$).
We can use continued fractions to prove that Pell's equation can always be solved. We won't go through the method fully here, as it largely irrelevant to the TinA course, but let us at least make the following observation.
Proposition 3.3 Any solution to Pell's equation must be a convergent of $\sqrt N$.
Proof: $(x - y \sqrt N )(x + y \sqrt N) = 1$. WLOG, we can assume $x$ and $y$ are both positive.
This gives $x-y\sqrt N=\frac{1}{x+y\sqrt N} < \frac{1}{2y\sqrt N}<\frac{1}{2y}$, so $\left |\frac{x}{y} - \sqrt N\right | < \frac{1}{2y^2}$. $\square$
By considering the period of the continued fraction of $\sqrt N$, you can pick out the convergents that give solutions. Dirichlet's Unit Theorem merits a quick shout-out at this point - by pushing this line of reasoning further, you can show that the solutions to Pell's equation (corresponding to non-trivial units in $\mathbb{Q}(\displaystyle \sqrt N)$) form a (cyclic!) group, which is mind-blowingly cool.
A second application of continued fractions also exploits the nice properties of square roots - prime factorisation. This application is sort of technical, so I won't go into it right now, although I am planning a future post dedicated to prime factorisation algorithms.
There is a generalisation of continued fractions called (rather obscurely) "
generalised continued fractions". I unfortunately do not have much to say about these at this point in time, other than the $\tan x$ example given in lectures was one of them.
This post has turned out to be massively long, as it was far too ambitious in scope. There are still a few details that I want to investigate in full depth at some point, but it's probably a better idea to do so in another post. I hope that anybody reading enjoyed it, and maybe learned something new.
Footnote: a significant number of these proofs are largely due to my Number Theory lecturer, Dr. Vicky Neale. A couple of them are due to my Topics in Analysis lecturer, Dr. Kovalev.