Showing posts with label number theory. Show all posts
Showing posts with label number theory. Show all posts

Friday, 21 March 2014

Number Theory, Topics in Analysis: Continued fractions and Diophantine approximation

The Number Theory and Topics in Analysis courses complement each other surprisingly well on the topic of continued fractions. In Number Theory, they were introduced ultimately as a means to an end - a technique for solving Pell's equation, and a prime factorisation algorithm. The TinA approach was vastly more contemplative, presenting continued fractions as a (slightly feeble) attempt to understand some aspects of the nature of real numbers.

The rational numbers are dense in the real numbers. Yet there are lots more real numbers than rational numbers. In order to make any sense at all out of this, we need to be less careless. One possible approach is to quantify the cost or effectiveness of a given rational approximation, by measuring the size of the denominator you have to resort to in order to make it happen (compare this to the degree of the polynomials used to uniformly approximate continuous functions).

The results of this idea are ultimately quite surprising. We obtain a hierarchy of types of real number, based on how costly they are to approximate. The real numbers that are hardest to approximate are what we intuitively consider to be the most regular class of reals - the rational numbers themselves. What?

Class of real numbers Measure (per unit) Cardinality Category Cost of approximating
Algebraic numbers Rationals 0 $\aleph_0$ Meagre Disastrous
Irrational algebraics 0 $\aleph_0$ Meagre Meh
Transcendental numbers Diophantine transcendentals 1 $\mathfrak{c}$ Meagre
Non-Diophantine transcendentals 0 $\mathfrak{c}$ Residual Good


The aim of this post is to establish the above results, as well as providing a more general panorama of continued fractions, their properties, and their applications. I will include results and concepts from both courses, so hopefully anybody reading this will enjoy and benefit from a slightly expanded perspective.

Section 0: A word on decimal expansions


We will show later that continued fractions correspond bijectively with real numbers. So, in some sense, they are an alternative way of representing reals, as opposed to the more familiar decimal expansion. The analogy pretty much stops there - as far as I can tell, there is little to gain in comparing the two systems. Still, I want to seize this opportunity to consider formally a couple of properties of decimals that are usually taken for granted, as we will briefly explore some analagous results for continued fractions.

Proposition 0.1 A real number with a repeating decimal expansion is a rational number.

Proof: Let $\alpha = d.d_1d_2d_3\dots d_m\overline{d_{m+1}\dots d_{m+k}}$. Then $10^m\alpha = N + 0.\overline{d_{m+1}\dots d_{m+k}}$, and $10^{m+k}\alpha=M + 0.\overline{d_{m+1}\dots d_{m+k}}$ for some integers $N$, $M$. Thus, $10^{m+k}\alpha - 10^m\alpha = K$ for some integer $K$, giving $\alpha = \frac{K}{10^{m+k}-10^k}$, and so $\alpha \in \mathbb{Q}$. $\square$ (proof found here).

Proposition 0.2 A rational number has a repeating decimal expansion.

Proof: Consider performing the calculation $\frac{p}{q}$ by long division. At each step, the remainder will be in $\{0,1,2,\ldots,q-1\}$. By the pigeonhole principle, a remainder must repeat after $q+1$ steps. But a repeated remainder forces the calculation into a loop. $\square$


Thus, real numbers with repeating decimal expansions correspond to a particular subclass of numbers - the rationals. We will later remark that real numbers with repeating continued fractions also correspond to a interesting subclass - quadratic algebraic numbers.

Section 1: Diophantine approximation


In this section, we consider the problem of finding a rational number close to an arbitrary real number $\alpha$.

As explained in the introduction, we are mainly interested in obtaining lower bounds for the cost. But no lower bound is complete without an upper bound. The following theorem is the much-needed reality check that I'm very surprised didn't come up in TinA - for any real number, the cost is never going to get pathologically bad.

Theorem 1.1 (Dirichlet's theorem) Let $\alpha$ be a real number. For every $N \in \mathbb{N}$ there exists a rational number $\frac{a}{q}$ with $1 \leq q < N$ and $\left| \alpha - \frac{a}{q}\right| \leq \frac{1}{qN}$.

Proof: Consider the fractional part of the $N+1$ distinct numbers $k\theta$, for $0 \leq k \leq N$. Split the unit interval into the $N$ disjoint intervals of length $\frac{1}{N}$ given by $[\frac{j}{N},\frac{j+1}{N}]$, for $0 \leq j < N$.

By the pigeonhole principle, there must be an inteval containing two of the $k\theta$, say $k_1\theta$ and $k_2\theta$, $k_2>k_1$. But then for $q=k_2-k_1$, $|q \theta - a | \leq \frac{1}{N}$ for some integer $a$. By construction, $1\leq q \leq N$. $\square$

Still, let's not get too optimistic about this. With the obvious caveat that any fraction "approximates" its own equivalence class perfectly, rational numbers are awful at approximating other rationals.

Proposition 1.2 If $\alpha = \frac{a}{b} \neq \frac{p}{q}$, then $\left|\alpha - \frac{p}{q}\right| \geq \frac{1}{bq}$.

Proof: $\left|\frac{a}{b}-\frac{p}{q}\right| = \left|\frac{aq-bp}{bq}\right| \geq \frac{1}{bq}$ $\square$

Algebraic numbers are slightly better, but still fairly poor.

Theorem 1.3 (Liouville's theorem) Let $\alpha$ be algebraic. Then $\exists c>0$ (depending on $\alpha$) such that $\left|\alpha - \frac{p}{q}\right| \geq \frac{c}{q^n}$, for every rational $\frac{p}{q}$, where $n$ is the degree of the minimal polynomial over $\mathbb{Z}$.

Proof: In lectures we only assumed that $f$ is an arbitrary polynomial with $\alpha$ as a root. However, using the minimal polynomial gives a better lower bound. Pick $R$ sufficiently large that all of the (real) roots of $f$ are contained inside $[-R+1,R-1]$. $f^\prime$ is continuous on $[-R,R]$, and so bounded, by $M$, say. For every rational $\frac{p}{q} \in [-R,R]$, if $f(\frac{p}{q}) \neq 0$, then $\frac{1}{q^n} \leq \left|f(\frac{p}{q})\right|= \left|f(\frac{p}{q})-f(\alpha)\right| \leq M \left| \frac{p}{q} -\alpha\right|$, by the mean value theorem.

There are only many finitely roots, so $\exists c_0$ such that $|\frac{p}{q}-\alpha| \geq c_0 \geq \frac{c_0}{q^n}$ for all rational roots $\frac{p}{q}$ of $f$ (which we secretly know have to be integers anyway).

Finally, if $\left|\frac{p}{q}\right| \geq R$, then $\left|\frac{p}{q} - \alpha \right| \geq 1 \geq \frac{1}{q^n}$.

So, by taking $c = \min\{\frac{1}{M},1,c_0\}$, we have found the required constant. $\square$

Consider now a new class of real numbers, the diophantine numbers, which we didn't discuss directly in lectures. In order to motivate the definition, recall Roth's theorem, which was stated (incorrectly!) in lectures as an improvement on Liouville's theorem.

Theorem 1.4 (Thue-Siegel-Roth theorem) Let $\alpha$ be algebraic and irrational. Then for all $\varepsilon > 0$, $\exists c > 0$ such that $\left| \alpha - \frac{p}{q}\right| \geq \frac{c}{q^{2+\varepsilon}}$ for any rational number $\frac{p}{q}$.


(In lectures, this was stated as $\forall \varepsilon$, $\exists c$ such that $\forall \alpha \ldots$, which is incorrect - the constant $c$ depends on $\alpha$.)

Definition. Say that $\alpha \in \mathbb{R}$ is Diophantine if $\forall \varepsilon > 0$, $\exists c >0$ such that $\left| \alpha - \frac{p}{q}\right| \geq \frac{c}{q^{2+\varepsilon}}$ for any rational number $\frac{p}{q}$.


In other words, Theorem 1.4 says that every algebraic number is Diophantine. (Note: there may be other names for this class of numbers - they seem to be discussed very little on wiki. My source for this name is Terry Tao's blog. Not to be confused with Diophantine sets, which are discussed on wiki, and are related to recursively enumerable sets.)

Theorem 1.5 The set of Diophantine numbers has full measure, but is meagre.

Proof: This is actually an excellent exercice in both measure theory and linear analysis, so I won't reproduce the proof here, only link to a good exposition of it. $\square$


These results already suffice to establish the table presented in the introduction. Interestingly, we have shown that the set of numbers that you can approximate very well with rationals is small in the sense of measure, but large in the sense of the Baire Category Theorem.

We have also set the stage for a technique for proving transcendentality - if you can prove that a given number violates the Liouville lower bound, or the stronger Roth lower bound, you can deduce that it's transcendental. An example of this is the Liouville number given in lectures, or more generally the uncountable set of Liouville numbers constructed in Example Sheet 4:

Proposition 1.6 For any sequence $a_n$ taking values in $\{1,2\}$, $\sigma = \sum \frac{a_n}{10^{n!}}$ is transcendental.


Open question (that I want to come back to, but haven't had time to think about yet) - are the above Liouville numbers Diophantine?

Section 2: Continued fractions


We will define continued fractions using the continued fraction algorithm, define the convergent sequences $p_n$ and $q_n$ independently, and then proceed to derive the link between the two. This presentation is backwards compared to the one given in TinA, but it avoids all of the tautologies that everyone was complaining about.

Given a real number $\theta$, we obtain a sequence $a_0,a_1,\ldots$ recursively by applying the following algorithm.

Set $a_0=\lfloor \theta \rfloor$.
Define $\alpha_1 = \frac{1}{\theta - a_0}$ (so $\theta = a_0 + \frac{1}{\alpha_1}$).
Define $a_n=\lfloor \alpha_n \rfloor$, and set $\alpha_{n+1}=\frac{1}{\alpha_n - a_n}$.
If we ever encounter $a_n=\alpha_n$, we terminate the process.

We write $[a_0;a_1,a_2,\ldots]$ for the continued fraction with value $\alpha$. The partial continued fractions $[a_0]$, $[a_0;a_1]$, $[a_0;a_1,a_2]$, $\ldots$ are called the convergents of $\alpha$.

By construction, $[a_0;a_1] = a_0 + \frac{1}{a_1}$, $[a_0;a_1,a_2]= \displaystyle a_0 \frac{1}{a_1 + \frac{1}{a_2}}$, etc, so we recover the characteristic tower form of the continued fraction.

Note that $a_n \in \mathbb{N}$, except for $a_0$, which is in $\mathbb{Z}$.

Proposition 2.1 The continued fraction algorithm terminates $\Leftrightarrow$ $\theta$ is rational.

Proof: If the algorithm terminates, then $\theta=[a_0;a_1,\ldots,a_n]$ for some $a_n$. By multiplying out the denominators in the tower form, it is obvious that $\theta$ is rational.

If $\theta$ is rational, then the $\alpha_i$ constructed in the algorithm are also rational. But by construction, their denominators are strictly decreasing, so the process must terminate. $\square$


Obviously, the first thing we want to show is that the convergents converge to $\alpha$ under the Euclidean metric. However, as things currently stand, the partial continued fractions are unwieldy objects. We introduce two intermediate sequences which are easier to manipulate, and which we will show contain all of the information given by the convergents.


Define the sequences:
$p_0 = a_0$ $p_1=a_0a_1+1$ $p_n=a_np_{n-1}+p_{n-2}$
$q_0 = 1$ $q_1=a_1$ $q_n=a_nq_{n-1}+q_{n-2}$

[Note to self: investigate link with Euclidean division algorithm.]

Here are some elementary properties of this sequence, which we will use later on.

Proposition 2.2

  1. For each $n \geq 1$, $p_nq_{n-1}-p_{n-1}q_n=(-1)^{n+1}$.
  2. $p_n$ and $q_n$ are coprime.

ProofFor the first statement, we induct on $n$.

When $n=1$, $p_1q_0-p_0q_1=a_0a_1+1-a_0a_1=1$.

Suppose the relation holds for $n$, where $n \geq 1$.

Then $p_{n+1}q_n-p_nq_{n+1}=(a_{n+1}p_n+p_{n-1}q_n)q_n - p_n(a_{n+1}q_n+q_{n-1}) = p_{n-1}q_n-p_nq_{n-1}=(-1)^{n+2}$.

The second statement follows directly from the first (think of Bezout). $\square$


Proposition 2.3 $\frac{p_n}{q_n}=[a_0;a_1,\ldots,a_n]$ in its lowest form.

Proof: Let us relax our notation slightly to allow $\alpha_n$, the last term of $[a_0;a_1,\ldots,\alpha_n]$, to be an arbitrary (positive) real number. Under this notation, we can write $[a_0;a_1,\ldots,a_n,\alpha_{n+1}]=[a_0;a_1,\ldots,a_n+\frac{1}{\alpha_{n+1}}]$. This will hopefully make the calculations slightly clearer.

Observe that the claim is true for $n=0,1,2$ (this involves multiplying out the tower form of $[a_0;a_1,\alpha_2]$, and is absolutely trivial).

Assume that the claim is true for $n$.
$$\displaystyle \frac{p_{n+1}}{q_{n+1}} = \frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}} = \frac{p_n + \frac{1}{\alpha_{n+1}}p_{n-1}}{q_n+\frac{1}{\alpha_{n+1}}q_{n-1}}= \frac{(a_np_{n-1}+p_{n-2})+\frac{p_{n-1}}{\alpha_{n+1}}}{(a_nq_{n-1}+q_{n-2})+\frac{q_{n-1}}{\alpha_{n+1}}} = \frac{\left(a_n+\frac{1}{\alpha_{n+1}}\right)p_{n-1}+p_{n-2}}{\left(a_n+\frac{1}{\alpha_{n+1}}\right)q_{n-1}+q_{n-2}}=[a_0;a_1,\ldots,a_n+\frac{1}{\alpha_{n+1}}]=[a_0;a_1,\ldots,a_n,\alpha_{n+1}]$$
We used the induction hypothesis in the second-last equality.
We have already shown above that $p_n$, $q_n$ are coprime, so the fraction is necessarily in its lowest form. $\square$

We now know enough to go ahead and prove that continued fractions converge to their value.

Proposition 2.4 If $[a_0;a_1,a_2,\ldots]$ is the continued fraction obtained from $\theta$ using the division algorithm, then $[a_0;a_1,\ldots,a_n] \to \theta$ as $n \to \infty$.

Proof: Using the relaxed notation from the above proposition, we can write $\theta = [a_0;a_1,\ldots,a_n,\alpha_{n+1}]$, which we know is equal to $\frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}}$. So:

$$\left| \theta - \frac{p_n}{q_n}\right| = \left|\frac{\alpha_{n+1}p_n+p_{n-1}}{\alpha_{n+1}q_n+q_{n-1}}-\frac{p_n}{q_n}\right| = \frac{|p_nq_{n-1}-p_{n-1}q_n|}{q_n(\alpha_{n+1}q_n+q_{n-1})}=\frac{1}{q_n(\alpha_{n+1}q_n+q_{n-1})}$$

But by considering the recurrence relation, we see that $q_n$ is a strictly increasing sequence of positive integers. The result follows. $\square$

We have now laid the groundwork for our study of Diophantine approximation via continued fractions. We are not just interested in the fact that continued fractions converge to their value, but also in some intriguing features of their convergence. We will show that continued fractions provide the most cost-effective way to converge to any real, and, remarkably, that the continued fraction algorithm necessarily produces every good approximation - in other words, if a given approximation is particularly cheap, it will naturally arise as one of the convergents defined above.

Firstly, we show that, considered absolutely, the convergence is monotone - every convergent gives a better approximation than the last.

Lemma 2.5 For each $n$, $\frac{1}{q_{n+2}} \leq | q_n \theta - p_n | \leq \frac{1}{q_{n+1}}$. This immediately implies that $|q_n\theta - p_n | \leq |q_{n-1} \theta - p_{n-1}|$, which further implies that $|\theta - \frac{p_n}{q_n}| \leq |\theta - \frac{p_{n-1}}{q_{n-1}}|$.

Proof: The key observation is that $a_{n+1} \leq \alpha_{n+1} \leq a_{n+1} + 1$. This gives $q_{n+1} = a_{n+1}q_n + q_{n-1} \leq \alpha_{n+1}q_n + q_{n-1} \leq (a_{n+1}+1)q_n+q_{n-1} = q_{n+1}+q_n \leq a_{n+2}q_{n+1}+q_n = q_{n+2}$.

Recalling that $|q_n \theta - p_n | = \frac{1}{\alpha_{n+1}q_n+q_{n-1}}$, from the proof of the previous proposition, we are done. $\square$

However, the partial continued fractions oscillate around their limit $\theta$, via the following observation.

Lemma 2.6 For each $n$, $\theta - \frac{p_n}{q_n}$ and $\theta - \frac{p_{n-1}}{q_{n-1}}$ have opposite sign.

Proof: Go back to the proof of the above lemma, and observe that without the mod signs, we obtain $\theta - \frac{p_n}{q_n} = \frac{(-1)^n}{q_n(\alpha_{n+1}q_n+q_{n-1})}$. $\square$

As promised, we now prove that the convergents are the best approximants for a fixed cost.


Theorem 2.7 For any integers $p$ and $q$ such that $0 < q < q_n$, $|q\theta - p| \geq |q_n\theta - p_n|$.

Proof: Let $\begin{pmatrix} u \\ v \end{pmatrix} = \begin{pmatrix} q_{n-1} & p_{n-1} \\ q_n & p_n \end{pmatrix}\begin{pmatrix} p \\ q \end{pmatrix}$. Note that this matrix has determinant $\pm 1$, so is invertible (its inverse has integer coefficients). Therefore, we can write $p = up_{n-1}+vp_n$ and $q = uq_{n-1}+vq_n$ for some integers $u,v$.

So, $|q\theta - p| = |uq_{n-1} \theta + v q_n \theta - u p_{n-1} - vp_n | = |u (q_{n-1} \theta - p_{n-1}) + v(q_n \theta - p_n)|$.

However, we assumed that $0 < q < q_n$, so $0 < uq_{n-1} + vq_n < q_n$. This forces $u$ and $v$ to have opposite signs (if $v=0$ then $\frac{p}{q}=\frac{p_n}{q_n}$). Since $q_{n-1}\theta - p_{n-1}$ and $q_n\theta - p_n$ also have opposite signs by the above lemma, we can write:

$|q\theta - p| = |u(q_{n-1}\theta-p_{n-1})|+|v(q_n\theta - p_n)| \geq |q_{n-1}\theta - p_{n-1}|$. $\square$

And finally, if any approximation is particularly cost-efficient, it must in fact be a convergent.

Corollary 2.8 For any integer $p$ and positive integer $q$, if $|\theta - \frac{p}{q}| < \frac{1}{2q^2}$, then $\frac{p}{q}$ is one of the convergents of $\theta$.

Proof: Assume WLOG that $p$ and $q$ are coprime. Pick $n$ such that $q_n \leq q < q_{n+1}$ (recall the $q_n$ are a strictly increasing sequence of positive integers, and $q_0=1$, so we can do this).

Then $\left| \frac{p}{q} -\frac{p_n}{q_n}\right| \leq \left| \theta - \frac{p}{q}\right| + \left| \theta - \frac{p_n}{q_n}\right| < \frac{1}{2q^2} + \frac{1}{q_n}|q\theta - p| < \frac{1}{2qq_n}+\frac{1}{q_n}\frac{1}{2q} \leq \frac{1}{qq_n}$.

So, $\left|\frac{p}{q}-\frac{p_n}{q_n}\right| = \frac{|pq_n-p_nq|}{qq_n} < \frac{1}{qq_n}$, so $|pq_n-p_nq| < 1$.

We conclude that $pq_n-p_nq=0$, so $\frac{p}{q}=\frac{p_n}{q_n}$. $\square$

The whole point of this section was to show that continued fractions do indeed provide insight about Diophantine approximation. The above result achieves this pretty spectacularly.

Section 3: Examples and applications

In general, computing the continued fraction of an arbitrary real is not very interesting - it is simply a matter of transforming the information we possess about that real (usually in the form of a decimal expansion) into the language of continued fractions.

If our chosen real number is rational, the algorithm terminates - which is hardly interestingly. Conversely, we have shown that if our real number is irrational it doesn't. We might therefore naively expect to have to do an infinite amount of work in order to achieve the full expansion. Tripos questions can be nasty, but hopefully not that nasty (even when PTJ is writing the questions).

However, in lectures, we saw the tantalising example of $\sqrt{2}$. Very quickly, the algorithm developped a loop, and so we could observe that the continued fraction expansion must necessarily repeat. On Example Sheet 4, we are asked to calculate the full expansion of $\sqrt{3}$, and so we might hope that something similiar is going to occur.

This generalises. There is a full characterisation of numbers with repeating continued fractions (much like section 0 gives a full characterisation of numbers with repeating decimal expansion), given by the following theorem.

A quadratic irrational is a root of a quadratic equation with integer coefficients. In the Number Fields course, we show that any such number must be of the form $a+b\sqrt{N}$ for some (square-free) integer $N$, and rationals $a$ and $b$.

Theorem 3.1 (Lagrange) For any real number $\theta$, $\theta$ is a quadratic irrational $\Leftrightarrow$ its continued fraction is (eventually) periodic.

For square roots, we can say more. The following theorem gives an excellent error-detecting failsafe for those awkward times when you find yourself needing to calculate the continued fraction of a square root in real time at the supermarket check-out.

Theorem 3.2 Let $N$ be a natural number that is not a square. Then its continued fraction is of the form $[a_0;\overline{a_1,a_2,\ldots,a_2,a_1,2a_0}]$. Thus, it is both periodic and symmetric, and each period ends on twice the integer part.

The proofs of these theorems were not given in my course. However, I have an idea which might suffice to prove a watered-down version of 3.2. Watch this space.

This checks out with the excruciatingly simple example given in TinA: $\sqrt{2} = [1;\overline{2}]$. For the eager beavers out there, you can check that this is coherent with your calculations for $\sqrt{3}$. (On a Number Theory example sheet, we had to calculate $\sqrt{46}$, which is 12-periodic, and thus somewhat sadistic. Side-note: to find a higher period you have to go up to $\sqrt{94}$. Coincidence? I think not.)

One excellent application for which continued fractions are crucial is Pell's equation.

Let $N$ be a natural number that is not a square. Pell's equation is given by:
$$x^2-Ny^2=1$$


Pell's equation is useful for example in Number Fields, where solutions correspond to units of a given real quadratic field extension (the norm of $x+y \sqrt N$ is defined to be $x^2-Ny^2$).

We can use continued fractions to prove that Pell's equation can always be solved. We won't go through the method fully here, as it largely irrelevant to the TinA course, but let us at least make the following observation.

Proposition 3.3 Any solution to Pell's equation must be a convergent of $\sqrt N$.

Proof: $(x - y \sqrt N )(x + y \sqrt N) = 1$. WLOG, we can assume $x$ and $y$ are both positive.
This gives $x-y\sqrt N=\frac{1}{x+y\sqrt N} < \frac{1}{2y\sqrt N}<\frac{1}{2y}$, so $\left |\frac{x}{y} - \sqrt N\right | < \frac{1}{2y^2}$. $\square$

By considering the period of the continued fraction of $\sqrt N$, you can pick out the convergents that give solutions. Dirichlet's Unit Theorem merits a quick shout-out at this point - by pushing this line of reasoning further, you can show that the solutions to Pell's equation (corresponding to non-trivial units in $\mathbb{Q}(\displaystyle \sqrt N)$) form a (cyclic!) group, which is mind-blowingly cool.

A second application of continued fractions also exploits the nice properties of square roots - prime factorisation. This application is sort of technical, so I won't go into it right now, although I am planning a future post dedicated to prime factorisation algorithms.

There is a generalisation of continued fractions called (rather obscurely) "generalised continued fractions". I unfortunately do not have much to say about these at this point in time, other than the $\tan x$ example given in lectures was one of them.

This post has turned out to be massively long, as it was far too ambitious in scope. There are still a few details that I want to investigate in full depth at some point, but it's probably a better idea to do so in another post. I hope that anybody reading enjoyed it, and maybe learned something new.

Footnote: a significant number of these proofs are largely due to my Number Theory lecturer, Dr. Vicky Neale. A couple of them are due to my Topics in Analysis lecturer, Dr. Kovalev.

Saturday, 15 March 2014

Number Theory - Deriving Quadratic Reciprocity

There are allegedly dozens of proofs of quadratic reciprocity. Amazingly, not a single one of them is nice. (The undergraduate's definition of nice is something that can be easily remembered/regurgitated without thinking. All of the proofs of quadratic reciprocity that I have read involve pretty mind-bending arguments.)

This section involves extremely fiddly, annoying (or, put another way, marvellously creative, beautiful) mathematics. We need to be ready to get ours hands dirty, constructing bijections/partitions/relations between residue classes based on obscure/unwieldy properties. Good practice for tripos questions, I guess.

Lemma 1.1 There are exactly $\frac{p-1}{2}$ quadratic residues $\mod p$.

Proof: Take a primitive element, $g$. The quadratic residues are exactly the even powers of $g$.
Proof 2: $(-a)^2=a^2 \mod p$, so there are at most $\frac{p-1}{2}$ quadratic residues. $x^2 \equiv y^2 \mod p$ implies that $x \equiv \pm y \mod p$ (integral domain), so there are at least $\frac{p-1}{2}$ quadratic residues. $\square$

Theorem 1.2 (Euler's criterion) $a^{\frac{p-1}{2}} \equiv \left(\frac{a}{p}\right) \mod p$.

Proof: We will combine both proofs of the above lemma. We can assume $a$ coprime to $p$, as otherwise the claim is obvious (both sides are $0$).
First, note that $a^{p-1}\equiv 1 \mod p$ by Fermat's little theorem. Since we are working in an integral domain, $a^{\frac{p-1}{2}} \equiv \pm 1 \mod p$.
Let $g$ be a primitive element $\mod p$. Then $a=g^i$ for some $i$. If $i=2j$, then $a^{\frac{p-1}{2}}\equiv g^{(p-1)j} \equiv 1 \mod p$. Similarly, if $i$ is odd, then $a^{\frac{p-1}{2}} \equiv -1 \mod p$. Thus we have established that:
$$a^{\frac{p-1}{2}} = \left\{\begin{array}{ll} +1 & \text{if } a \text{ is an even power of } g \\ -1 & \text{if } a \text{ is an odd power of } g \end{array}\right.$$
From proof 1 above, this corresponds exactly to the value of the Legendre symbol $\left(\frac{a}{p}\right)$. $\square$

Proposition 1.3 (Gauss's lemma)

We introduce least absolute value notation for a residue class mod $p$: write $\langle a \rangle$ for the unique integer congruent to $a \mod p$ lying in $[-p/2,p/2]$.

If $p$ is an odd prime, and $a$ is coprime to $p$, then $\left(\frac{a}{p}\right) = (-1)^\gamma$, where $\gamma = \#\{k \in \{1,2,\ldots,\frac{p-1}{2}\} \mid \langle ka \rangle < 0 \}$.

Proof: Observe that $\langle a \rangle, \langle 2a \rangle, \ldots, \langle \frac{p-1}{2}a\rangle$ are $\pm 1, \pm 2, \ldots, \pm \frac{p-1}{2}$ in some order, for some (fixed) choices of sign.

Hence, $(\frac{p-1}{2})!a^{\frac{p-1}{2}} \equiv \langle a \rangle \langle 2a \rangle \dots \langle (\frac{p-1}{2})a \rangle \equiv (-1)^\gamma (\frac{p-1}{2})! \mod p$.

By cancelling, $a^{\frac{p-1}{2}} \equiv (-1)^\gamma \mod p$. By Euler's criterion, $\left(\frac{a}{p}\right) \equiv (-1)^\gamma \mod p$. Since $(-1)^\gamma$ and $\left(\frac{a}{p}\right)$ are in $\{-1,1\}$, congruence implies equality. $\square$

Theorem 1.4 (Quadratic reciprocity)

Let $p$ and $q$ be odd primes. Then $\left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}\left(\frac{p}{q}\right)$.

If we assume $p$ and $q$ to be distinct, we can reformulate as $\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}$.

Proof: To prove quadratic reciprocity, we use Gauss's lemma to rephrase the above assertion as the geometric problem of counting lattice points (points with integer coefficients in the plane) within specific regions of interest.

By the lemma, $\left(\frac{q}{p}\right)=(-1)^\gamma$, where $\gamma$ is as defined above.
We need to count how many of $\langle q \rangle, \langle 2q \rangle, \ldots, \langle \frac{p-1}{2} q \rangle$ are negative. (Note: we are working mod $p$.)

Observe that setting $\langle kq \rangle = kq - lp$ determines $l$ uniquely.
Thus, $\gamma$ corresponds to the number of lattice points $(k,l)$ within the region $-\frac{p}{2} < kq - lp < 0$. [more justification required]

Analagously, writing $\left(\frac{p}{q}\right)=(-1)^\beta$, Gauss's lemma says that $\beta$ corresponds to the number of lattice points $(k,l)$ within the region $-\frac{q}{2}< lp-kq < 0$.

Drawing the three (parallel) lines $kq-lp=-\frac{p}{2}$, $kq-lp=0$, and $kq-lp=\frac{q}{2}$ splits up the rectangle $0<k<p/2$, $0<l<q/2$ into four regions. From top left to bottom right, let's call them $A$ (a triangle), $B$ and $C$ (rhombuses), and $D$ (a triangle).


The total number of lattice points in the rectangle is $\frac{p-1}{2}\frac{q-1}{2}$ (recall that we are using strict inequality, so excluding those that fall on the axes). To prove quadratic reciprocity, we need the number of lattice points in $B$ and $C$ combined to have the same parity as $\frac{p-1}{2}\frac{q-1}{2}$.

We need two observations to see that this is true.
  1. There is a bijection between $A$ and $D$. 
  2. None of the lattice points are on any of the three lines that we drew. The lecturer totally failed to discuss this point, but it is crucial - given that we are considering strict inequality, if a lattice point was on one of the lines it wouldn't be counted in any of the regions (and if we were considering non-strict inequality, it would be counted twice). Luckily, this can't happen as $p$ and $q$ are prime. 
Combining these two observations means that $|B|+|C|+2|A| = \frac{p-1}{2}\frac{q-1}{2}$, and so $|B| + |C|$ and $\frac{p-1}{2}\frac{q-1}{2}$ have the same parity. From Gauss's lemma, above, we showed that $\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^\gamma(-1)^\beta=(-1)^{|B|+|C|}$, and the statement of quadratic reciprocity is $\left(\frac{q}{p}\right)\left(\frac{p}{q}\right) = (-1)^{\frac{p-1}{2}\frac{q-1}{2}}$, so we are done.

Let's prove 2 first. Suppose a point $(k,l)$ is on the line $kq-lp=0$, and that $k$ is an integer. Then $l=k\frac{q}{p}$, so $l$ isn't an integer. Similarly, if $k$ is an integer, then $l=k\frac{q}{p}+\frac{1}{2}$ isn't, and neither is $l=\frac{q}{p}(k-\frac{1}{2})$. Phew.

For 1, the required bijection is given by $(k,l) \leftrightarrow \left(\frac{p+1}{2}-k, \frac{q+1}{2}-l\right)$ (the same formula works both ways, from $A$ to $D$ and from $D$ to $A$). Visually, you pick up the bottom right triangle, rotate it 180 degrees, and slap it back onto the top right triangle. It doesn't fit together exactly, but instead overlaps egregiously, so it's really quite remarkable that this works. The lecturer left it as an exercice to prove that this bijection indeed does the job (I'm getting the impression that this proof is unlikely to come up in the exam).

The map is clearly injective from the above observation that the formula is self-inverse, i.e. applying it twice recovers the same point. So we only need to check that this map is well-defined, i.e. maps $C$ into $D$, and is surjective. Well-definedness is easy to obtain from definition-chasing. There is no extra work in obtaining surjectivity - it follows from well-definedness in the other direction. So we are done. $\square$





Number Theory - The structure of multiplicative groups of integers

This post aims to establish the basics about multiplicative groups as clearly and as efficiently as possible. I will include the proofs given in lectures to start off with, but I plan to trawl around for better alternatives.

0. Elementary stuff you should know how to prove by now
  • Division algorithm for integers/polynomials in an integral domain
  • Bezout
  • Existence and uniqueness of prime factorisation
1. For $p$ prime, $\left(\mathbb{Z}/p\mathbb{Z}\right)^\times$ is cyclic

In other words, there always exists a primitive element $\mod p$ for $p$ prime. This theorem is very well-known, and feels trivial. It's not - there is a lot of groundwork that simply can't be avoided.


Intermediate results:

1.1 Lagrange's theorem for polynomials


In an integral domain, a polynomial of degree $d$ can have at most $d$ roots.


Proof: 

We induct on the degree of the polynomial.
For $n=1$, write $p(x)=ax+b$, $a \neq 0$.
Suppose there are two distinct solutions, $x$ and $y$. Then, by subtracting, $a(x-y)=0$. In an integral domain, this implies $x=y$.

For the inductive step, the division algorithm for polynomials (which is valid in an integral domain) implies that for any root $x_0$ of a polynomial $p$, $p$ can be factored as $(x-x_0)g(x)$. In an integral domain, the degree of a product of polynomials is the sum of the degrees of the factors. Thus, $p$ has at most one more root than $g$, and so at most $n$ roots in total, by the induction hypothesis. $\square$


1.2 For $p$ prime, $F = \mathbb{Z}/p\mathbb{Z}$ is a field

Proof:

Pick any non-zero element $a$. We need to find an inverse for $a$.
Define a map $\phi_a : F \to F$ by $\phi_a(z)=az$. This is an injection, as $F$ is an integral domain (immediate from the definition of a prime number). An injection from a finite set to itself must necessarily be a surjection. Thus, $ab=1$ for some $b \in F$, proving that $a$ has a multiplicative inverse. $\square$

1.3 The Euler totient function $\phi$ is multiplicative

Note: $\phi$ is not totally multiplicative. E.g. $\phi(2)=1$, $\phi(4)=2$.

We define $\phi(n)$ as the number of positive integers less than $n$ that are coprime to $n$, and observe that Bezout implies that $\phi(n)= \left| (\mathbb{Z}/n\mathbb{Z})^\times \right|$.


Proof:
The Chinese Remainder Theorem gives a bijection (more precisely, an isomorphism) between $\left(\mathbb{Z}/m_1m_2\mathbb{Z}\right)^\times$ and $\left(\mathbb{Z}/m_1\mathbb{Z}\right)^\times\times\left(\mathbb{Z}/m_2\mathbb{Z}\right)^\times$ whenever $m_1$ and $m_2$ are coprime, giving $\phi(m_1m_2)=\phi(m_1)\phi(m_2)$. $\square$

1.4 The Eulier totient function has the property: $\displaystyle \sum_{d \mid n} \phi(d) = n$.

Proof:
We establish the fact that for any multiplicative function $\varphi$, $F(n) = \displaystyle \sum_{d \mid n} \varphi(d)$ is also multiplicative.1
Let $m$ and $n$ be coprime. Then:
$$F(mn)=\displaystyle \sum_{d \mid m_1} \varphi(d) = \sum_{d_1 \mid m, d_2 \mid n} \varphi(d_1d_2)$$$$ = \sum_{d_1 \mid m, d_2 \mid n} \varphi(d_1)\varphi(d_2) = \left(\sum_{d_1 \mid m} \varphi(d_1)\right)\left(\sum_{d_2 \mid n} \varphi(d_2)\right)$$$$=F(m)F(n)$$
Observe that $\phi(p^k)=p^{k-1}(p-1)$, so $F_\phi(p^k)=1 + (p-1) + \dots + (p^k-p^{k-1}) = p^k$. Thus, $F_\phi(n)=F_\phi(p_1^{n_1}\dots p_k^{n_k})=p_1^{n_1}\dots p_k^{n_k}=n$. $\square$

Proof of 1.

For each divisor $d$ of $p-1$, we show that if there is an element of order $d$, there must in fact be (exactly) $\phi(d)$ elements of order $d$. Thus, if $S_d$ is the set of elements of order $d$, then $|S_d|=\phi(d)$, or $|S_d|=0$. Since $\displaystyle \sum_{d \mid p-1} (\phi(d) - |S_d|) = p-1 - (p-1) = 0$, it follows that $|S_d|=\phi(d)$ for each $d$ dividing $p-1$ (if any of the $|S_d|$ are empty, the above formula can't hold), and so in particular $S_{p-1}$ is non-empty.


Let $a$ be an element of order $d$. By Lagrange's theorem $x^{d} \equiv 1 \mod p$ has at most $d$ solutions. Any element of order $d$ is a solution to this equation. But we can list $d$ distinct solutions: $1,a,a^2,\ldots,a^{d-1}$. Observe that $a^k$ has order $d$ exactly when $(d,k)=1$. Thus, there are exactly $\phi(d)$ elements of order $d$. $\square$

We call the elements of order $p-1$ the primitive roots $\mod p$. From the above, there are $\phi(p-1)$ of them. 

2. For $p$ an odd prime, $\left(\mathbb{Z}/p^k\mathbb{Z})\right)^\times$ is cyclic for any $k$.

We prove this by exhibiting an element of order $p^{k-1}(p-1)$. 
Let $g_1$ be a primitive element of  $\left(\mathbb{Z}/p\mathbb{Z})\right)^\times$.
Compute $g_1^{p-1}-1 \mod p^2$. If it's zero, set $g=g_1+p$. If it's non-zero, set $g=g_1$.
Then $g$ generates $\left(\mathbb{Z}/p^k\mathbb{Z})\right)^\times$.

We have chosen $g$ such that $g^{p-1}=1+bp$, where $b$ is coprime to $p$.
Thus, $g^{p^{k-1}(p-1)}=(1+bp)^{p^{k-1}}=1+bp{p^{k-1} \choose 1}+ \dots \equiv 1 \mod p^{k}$. So the order of $g$ (mod $p^k$) divides $p^{k-1}(p-1)$, and doesn't divide $p-1$.
Noting that $g^{p^{k-2}(p-1)}=(1+bp)^{p^{k-2}}=1+bp{p^{k-2} \choose 1} + (bp)^2{p^{k-2} \choose 2} + \dots \not\equiv 1 \mod p^{k}$, we can conclude that $g$ has order exactly $p^{k-1}(p-1)$, and so $\left(\mathbb{Z}/p^k\mathbb{Z})\right)^\times$ is cyclic. $\square$

3. Reserved space for other relevant results/properties proved on example sheets


1 Note to self: it feels like there is something more to be said here. Link to lattices?