A Blueprint for Fermat’s Last Theorem

5 An example of an automorphic form

5.1 Introduction

The key ingredient in Wiles’ proof of Fermat’s Last Theorem is a modularity lifting theorem, sometimes called an \(R=T\) theorem. For Wiles, the \(R\) came from elliptic curves, the \(T\) came from classical modular forms, and the fact that they’re equal is basically the Shimura–Taniyama–Weil conjecture, now known as the Breuil–Conrad–Diamond–Taylor modularity theorem: any elliptic curve over the rationals is modular.

At the heart of the proof we shall formalise is also an \(R=T\) theorem; however, the \(T\) which we shall use will be associated not to classical modular forms, but to spaces of more general automorphic forms called quaternionic modular forms. Those of you who know something about classical modular forms might well know that the groups \(\operatorname{SL}_2(\mathbb {R})\) and \(\operatorname{SL}_2(\mathbb {Z})\) are intimately involved; these are the norm 1 (that is, determinant 1) units in the matrix rings \(M_2(\mathbb {R})\) and \(M_2(\mathbb {Z})\). In the theory of quaternionic modular forms, the analogous groups are the norm 1 units in rings such as Hamilton’s quaternions \(\mathbb {R}\oplus \mathbb {R}i\oplus \mathbb {R}j\oplus \mathbb {R}k\), and subrings such as \(\mathbb {Z}\oplus \mathbb {Z}i\oplus \mathbb {Z}j\oplus \mathbb {Z}k\).

At the time of writing this sentence, one of the main goals of the FLT project is formalising the statement of the modularity lifting theorem which we shall use. So we are going to need to develop the theory of quaternionic modular forms, which is rather different to the theory of classical modular forms (for example, in the cases we need, the definition is completely algebraic; there are no holomorphic functions in sight, and the analogue of the upper half plane in the quaternionic theory is a finite set of points).

We could just launch into the general theory over totally real fields, which will be the generality which we’ll need. But when I was a PhD student, I learnt about these objects by playing with explicit examples. So, whilst not logically necessary for the proof, I thought it would be fun, and perhaps also instructional, to compute a concrete example of a space of quaternionic modular forms. The process of constructing the example might even inform what kind of machinery we should be developing in general. Let’s begin by discussing the quaternion algebra we shall use.

5.2 A quaternion algebra

Let’s define \(D\) to be the quaternion algebra \(\mathbb {Q}\oplus \mathbb {Q}i\oplus \mathbb {Q}j\oplus \mathbb {Q}k\). As a vector space, \(D\) is 4-dimensional over \(\mathbb {Q}\) with \([1,i,j,k]\) giving a basis. It has a (non-commutative) ring structure, with multiplication satisfying the usual quaternion algebra relations \(i^2=j^2=k^2=ijk=-1\). You can think of \(D\) as an analogue of \(2\times 2\) matrices with rational coefficients, hence its units \(D^\times \) are an analogue of the group \(\operatorname{GL}_2(\mathbb {Q})\).

We will also need an analogue of the group \(\operatorname{GL}_2(\mathbb {Z})\), which will come from an integral structure on \(D\). We choose the Hurwitz order, namely the subring \(\mathcal{O}:=\mathbb {Z}\oplus \mathbb {Z}i\oplus \mathbb {Z}j\oplus \mathbb {Z}\omega \), where \(\omega =\frac{-1+(i+j+k)}{2}\), a cube root of unity, as \((i+j+k)^2=-3\). The simplest way to understand \(\mathcal{O}\) is that it’s quaternions \(a+bi+cj+dk\) where either \(a,b,c,d\) are all integers or are all in \(\frac{1}{2}+\mathbb {Z}\).
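
The two claims about \(\omega \) are easy to check by machine. The following Python sketch is only an illustration and not part of the Lean development; it assumes we store a quaternion \(a+bi+cj+dk\) as a tuple `(a, b, c, d)` of exact rationals, and the helper name `qmul` is made up for this example.

```python
from fractions import Fraction

def qmul(p, q):
    """Multiply quaternions stored as (a, b, c, d), meaning a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1*a2 - b1*b2 - c1*c2 - d1*d2,   # real part
        a1*b2 + b1*a2 + c1*d2 - d1*c2,   # coefficient of i
        a1*c2 - b1*d2 + c1*a2 + d1*b2,   # coefficient of j
        a1*d2 + b1*c2 - c1*b2 + d1*a2,   # coefficient of k
    )

half = Fraction(1, 2)
omega = (-half, half, half, half)        # (-1 + i + j + k) / 2
s = (0, 1, 1, 1)                         # i + j + k

s_squared = qmul(s, s)                   # expected to be -3
omega_cubed = qmul(qmul(omega, omega), omega)  # expected to be 1
```

Using `Fraction` rather than floating point keeps the half-integer coordinates exact, so the equalities can be tested with `==`.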

Note that \(\mathcal{O}\) is a maximal order and a Euclidean domain, which is why we prefer it over the more obvious sublattice \(\mathbb {Z}\oplus \mathbb {Z}i\oplus \mathbb {Z}j\oplus \mathbb {Z}k.\)

In this chapter, we are going to compute a complex vector space which could be called something like the “weight 2 level 2 modular forms for \(D^\times \)”. The main result will be that this space is 1-dimensional.

Note that mathlib has modular forms, but it doesn’t have enough complex analysis to deduce that the space of modular forms of a given weight and level is finite-dimensional. If all the ‘sorry’s in this chapter are completed before mathlib gets the necessary complex analysis, then the first nonzero space of modular forms to be proved finite-dimensional in Lean will be a space of quaternionic modular forms.

We will use a modern "adelic" definition of our modular forms, so the first thing we need to do is to talk about profinite completions.

5.3 \(\widehat{\mathbb {Z}}\)

Classically automorphic forms were defined as functions on symmetric spaces (like the upper half plane) which transformed well under the action of certain discrete groups (for example \(\operatorname{SL}_2(\mathbb {Z})\)). However such definitions became combinatorially problematic when generalised to number fields with nontrivial class group, because the classical theory needed a number \(p\) to define the Hecke operator \(T_p\), and in the case where \(p\) was a non-principal prime ideal in a number field, there was no appropriate number. One fix is to take disjoint unions of symmetric spaces indexed by the ideal class group of the field in question, but it is easier to work adelically, which is morally what we shall do. However we are able to avoid introducing the adeles explicitly; we can work instead with the conceptually simpler object \(\widehat{\mathbb {Z}}\), the profinite completion of \(\mathbb {Z}\). So what is \(\widehat{\mathbb {Z}}\)? We offer a low-level definition of this object.

Given an integer \(z\), we can reduce it mod \(N\) for every positive natural number \(N\) and get elements \(z_N=\overline{z}\in \mathbb {Z}/N\mathbb {Z}\). These elements are not completely arbitrary though; they must satisfy some compatibility conditions. For example there can be no integer \(z\) such that \(z_{10}=6\) and \(z_2=1\), because \(z_{10}=6\) tells us that \(z\) is congruent to 6 mod 10, and in particular it’s even, so \(z_2\) must be 0. The general rule: if \(D\mid N\) then \(z_D\) must be equal to the image of \(z_N\) under the natural ring homomorphism from \(\mathbb {Z}/N\mathbb {Z}\) to \(\mathbb {Z}/D\mathbb {Z}\). We say that a collection of elements \(z_N\in \mathbb {Z}/N\mathbb {Z}\) is compatible if it satisfies this rule.

Definition 5.1

The profinite completion \(\widehat{\mathbb {Z}}\) of \(\mathbb {Z}\) is the set of all compatible collections \(c=(c_N)_N\) of elements of \(\mathbb {Z}/N\mathbb {Z}\) indexed by \(\mathbb {N}^+:=\{ 1,2,3,\ldots \} \). A collection is said to be compatible if for all positive integers \(D\mid N\), we have \(c_N\) mod \(D\) equals \(c_D\).
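
As a concrete illustration of the compatibility condition, here is a small Python sketch. It is only a finite approximation: it assumes we truncate a collection to the levels \(1,\ldots ,\) `bound` and store it as a dictionary, and the names `is_compatible` and `from_integer` are hypothetical helpers, not anything from the Lean code.

```python
def is_compatible(c):
    """Check c[n] mod d == c[d] for all levels d | n present in the dictionary c."""
    return all(
        c[n] % d == c[d]
        for n in c
        for d in c
        if n % d == 0
    )

def from_integer(z, bound):
    """The truncated collection (z mod N) for N = 1, ..., bound."""
    return {n: z % n for n in range(1, bound + 1)}

good = from_integer(-7, 30)      # comes from an honest integer
bad = from_integer(6, 30)
bad[2] = 1                       # now z_10 = 6 but z_2 = 1
```

Here `is_compatible(good)` holds, while `is_compatible(bad)` fails at the pair \(D=2\), \(N=10\), which is the example discussed before the definition.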

Lemma 5.2

\(\widehat{\mathbb {Z}}\) is a subring of \(\prod _{N\geq 1}(\mathbb {Z}/N\mathbb {Z})\) and in particular is a ring.

Proof

Follow your nose.

Examples of elements of \(\widehat{\mathbb {Z}}\) are given by integers, where we define \(z_N\) to be \(z\) mod \(N\) for all \(N\). This gives us a natural map from \(\mathbb {Z}\) to \(\widehat{\mathbb {Z}}\). In particular we can talk about \(0\in \widehat{\mathbb {Z}}\) and \(1\in \widehat{\mathbb {Z}}\).

Lemma 5.3

\(0\not=1\) in \(\widehat{\mathbb {Z}}\).

Proof

Recall that you can evaluate an element of \(\widehat{\mathbb {Z}}\) at a positive integer. Evaluating \(0\) at 2 gives \(0\), and evaluating \(1\) at \(2\) gives \(1\), and these are distinct elements of \(\mathbb {Z}/2\mathbb {Z}\), so \(0\not=1\) in \(\widehat{\mathbb {Z}}\).

Lemma 5.4

The map from the naturals into \(\widehat{\mathbb {Z}}\) sending \(n\) to \(n\) is injective.

Proof

Generalise the above idea. Feel free to write up a LaTeX proof and PR it.

Note that it follows easily that the map from the integers to \(\widehat{\mathbb {Z}}\) is injective.

But \(\widehat{\mathbb {Z}}\) is much larger than \(\mathbb {Z}\); it has the same cardinality as the reals in fact. Let’s write down an explicit example of an element of \(\widehat{\mathbb {Z}}\) which isn’t obviously in \(\mathbb {Z}\).

Definition 5.5

The infinite sum \(0!+1!+2!+3!+4!+5!+\cdots \) looks like it makes no sense at all; it is the sum of an infinite series of larger and larger positive numbers. However, the sum is finite modulo \(N\) for every positive integer \(N\), because all the terms from \(N!\) onwards are multiples of \(N\) and thus are zero in \(\mathbb {Z}/N\mathbb {Z}\). Thus it makes sense to define \(e_N\) to be the value of the finite sum modulo \(N\). Explicitly, \(e_N=0!+1!+\cdots +(N-1)!\) modulo \(N\).

Lemma 5.6

The collection \((e_N)_N\) is an element of \(\widehat{\mathbb {Z}}\).

Proof

This boils down to checking that \(D!+(D+1)!+\cdots +(N-1)!\) is a multiple of \(D\).
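
For readers who like to experiment, the element \(e\) is easy to compute level by level. The Python sketch below is an illustration only; the function names `e` and `compatible_up_to` are invented here, and the compatibility check is of course only carried out up to a finite bound.

```python
from math import factorial

def e(n):
    """e_n = 0! + 1! + ... + (n-1)! reduced modulo n."""
    return sum(factorial(k) for k in range(n)) % n

def compatible_up_to(bound):
    """Check e_N mod D == e_D for all D | N with N <= bound."""
    return all(
        e(n) % d == e(d)
        for n in range(1, bound + 1)
        for d in range(1, n + 1)
        if n % d == 0
    )
```

For instance \(0!+1!+\cdots +9!=409114\), so `e(10)` is 4, and `compatible_up_to(60)` confirms the lemma for all levels up to 60.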

Lemma 5.7

The element \((e_N)_N\) of \(\widehat{\mathbb {Z}}\) is not in \(\mathbb {Z}\).

Proof

First imagine that \(e=n\) with \(n\in \mathbb {Z}\) and \(0\leq n\). In this case, choose \(j\geq 2\) such that \(0!+1!+2!+\cdots +j!{\gt}n\), and check also that the sum is less than \((j+1)!\) (this is where \(j\geq 2\) is needed). Now set \(N=(j+1)!\) and let’s compare \(e_N\) and \(n_N=n\). The trick is that \(e_N\) must be \(0!+1!+\cdots +j!\) mod \(N\), because all the terms beyond this are multiples not just of \((j+1)\) but of \((j+1)!=N\). Thus mod \(N\) we have \(0\leq n{\lt}e_N{\lt}N\) so \(n\not=e\).

Now we deal with \(n=-t{\lt}0\); choose \(j\) large such that \((j+1)!-(0!+1!+\cdots +j!){\gt}t\) (possible because the sum is at most \(2\times j!\)) and then set \(N=(j+1)!\) and we have \(0 {\lt} e_N{\lt}N-t{\lt}N\) so we cannot have \(e_N=-t\) in \(\mathbb {Z}/N\mathbb {Z}\), so again \(e\not=n\).
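
The proof is constructive: for each integer \(n\) it produces a level \(N=(j+1)!\) at which \(e\) and \(n\) differ. Here is a Python sketch of that recipe, assuming the choice \(j\geq 2\) so that the partial sum is smaller than \((j+1)!\); the names `e`, `partial_sum` and `witness_level` are hypothetical.

```python
from math import factorial

def e(N):
    """e_N = 0! + 1! + ... + (N-1)! modulo N, computed incrementally."""
    total, term = 0, 1               # term holds k! modulo N
    for k in range(N):
        total = (total + term) % N
        term = (term * (k + 1)) % N
    return total

def partial_sum(j):
    """The integer 0! + 1! + ... + j!."""
    return sum(factorial(k) for k in range(j + 1))

def witness_level(n):
    """Return N = (j+1)! such that e_N differs from n modulo N."""
    j = 2
    if n >= 0:
        # need 0! + ... + j! > n
        while partial_sum(j) <= n:
            j += 1
    else:
        # need (j+1)! - (0! + ... + j!) > t where n = -t
        while factorial(j + 1) - partial_sum(j) <= -n:
            j += 1
    return factorial(j + 1)
```

For example `witness_level(0)` is 6, and indeed \(e_6=4\not=0\).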

Let’s prove some more basic facts about \(\widehat{\mathbb {Z}}\).

Lemma 5.8

If \(0{\lt}N\) is an integer then multiplication by \(N\) is injective on \(\widehat{\mathbb {Z}}\).

Proof

Suppose that \((z_i)_i\in \widehat{\mathbb {Z}}\) and \(Nz=0\). This means that \(Nz_i=0\in \mathbb {Z}/i\mathbb {Z}\) for all \(i\). Let us fix an arbitrary positive integer \(j\); we need to prove that \(z_j=0\in \mathbb {Z}/j\mathbb {Z}\). Consider the element \(z_{Nj}\in \mathbb {Z}/Nj\mathbb {Z}\). By assumption, we have \(Nz_{Nj}=0\), meaning that if we lift \(z_{Nj}\) to an integer, we have \(Nj\mid Nz_{Nj}\), and thus \(j\mid z_{Nj}\). Thus by the compatibility assumption on the \(z_i\) we have that \(z_j\in \mathbb {Z}/j\mathbb {Z}\) is the mod \(j\) reduction of \(z_{Nj}\) and hence is zero.

We will also need to understand exactly which elements of \(\widehat{\mathbb {Z}}\) are multiples of \(N\).

Lemma 5.9

If \(N\) is a positive integer, then the multiples of \(N\) in \(\widehat{\mathbb {Z}}\) are precisely the compatible collections \((z_i)_i\in \widehat{\mathbb {Z}}\) with \(z_N=0\).

Proof

Clearly \(z_N=0\) is a necessary condition to be a multiple of \(N\). To see it is sufficient, take a general \((z_i)\in \widehat{\mathbb {Z}}\) such that \(z_N=0\), and now define a new element \((y_j)_j\) of \(\widehat{\mathbb {Z}}\) by \(y_j=z_{Nj}/N\). Just to clarify what this means: \(z_{Nj}\in \mathbb {Z}/Nj\mathbb {Z}\) reduces mod \(N\) to \(z_N=0\) by the compatibility assumption, so it is in the subgroup \(N\mathbb {Z}/Nj\mathbb {Z}\) of \(\mathbb {Z}/Nj\mathbb {Z}\), which is isomorphic (via "division by \(N\)") to the group \(\mathbb {Z}/j\mathbb {Z}\); this is how we construct \(y_j\). It is easily checked that the \(y_j\) are compatible and that \(Ny=z\).
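
The "division by \(N\)" construction can be tried out directly. In the Python sketch below we assume, purely for illustration, that an element of \(\widehat{\mathbb {Z}}\) is modelled as a function sending a positive integer \(m\) to a representative in `range(m)`; the names `e`, `times` and `divide` are made up for this example.

```python
def e(m):
    """The sum-of-factorials element: 0! + ... + (m-1)! modulo m."""
    total, term = 0, 1
    for k in range(m):
        total = (total + term) % m
        term = (term * (k + 1)) % m
    return total

def times(N, z):
    """The element N*z, computed level by level."""
    return lambda m: (N * z(m)) % m

def divide(z, N):
    """Given z with z_N = 0, return y with N*y = z, using y_j = z_{Nj} / N."""
    if z(N) != 0:
        raise ValueError("z is not a multiple of N")
    # z(N*j) is a multiple of N lying in range(N*j), so the quotient lies in range(j)
    return lambda j: (z(N * j) // N) % j

z = times(12, e)
y = divide(z, 12)       # should recover e, by injectivity of multiplication by 12
```

Comparing `y(j)` with `e(j)` for small `j` gives a quick sanity check of both this lemma and Lemma 5.8.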

5.4 More advanced remarks on \(\widehat{\mathbb {Z}}\) versus \(\mathbb {Q}\)

This section can be skipped on first reading.

People who have seen some more advanced algebra might recognise this construction of \(\widehat{\mathbb {Z}}\) as being the profinite completion of the additive abelian group \(\mathbb {Z}\), so it is a fundamental object of mathematics in some sense. But usually, when building mathematics, after \(\mathbb {Z}\) we go to \(\mathbb {Q}\), a multiplicative localisation of \(\mathbb {Z}\), and only complete after that (to get \(\mathbb {R}\)). The process of “completing before localising” gives us a far more arithmetic completion of \(\mathbb {Z}\).

Even though \(\mathbb {Q}\) is a divisible abelian group and hence its profinite completion vanishes, we can still attempt to "locally profinitely complete it" by defining \(\widehat{\mathbb {Q}}:=\mathbb {Q}\otimes _{\mathbb {Z}}\widehat{\mathbb {Z}}\). This object is more commonly known as the finite adeles of \(\mathbb {Q}\). More generally if \(F\) is any number field then \(F\otimes _{\mathbb {Z}}\widehat{\mathbb {Z}}\) is the ring of finite adeles of \(F\). To get to the full ring of adeles of a number field \(F\) you need to take the product with the ring of infinite adeles of \(F\), which is \(F\otimes _{\mathbb {Q}}\mathbb {R}\): some kind of universal archimedean completion of \(F\). I don’t know a reference which develops the theory of adeles in this way, so this is what we shall do here.

5.5 \(\widehat{\mathbb {Q}}\) and tensor products.

The definition of \(\widehat{\mathbb {Q}}\) is easy if you know about tensor products of additive abelian groups.

Definition 5.10

The profinite completion \(\widehat{\mathbb {Q}}\) of \(\mathbb {Q}\) is the tensor product \(\mathbb {Q}\otimes _{\mathbb {Z}}\widehat{\mathbb {Z}}\), or \(\widehat{\mathbb {Q}}=\mathbb {Q}\otimes \widehat{\mathbb {Z}}\) for short.

5.6 A crash course in tensor products

We’ve defined \(\widehat{\mathbb {Q}}\) to be \(\mathbb {Q}\otimes \widehat{\mathbb {Z}}\). Whatever does this mean? Well just to orient yourself, if \(A\) and \(B\) are additive abelian groups, then \(A\otimes B\) is also an abelian group. And if \(A\) and \(B\) are commutative rings (as they are in our case), then \(A\otimes B\) is also a commutative ring.

Even if \(A\) and \(B\) are completely concrete commutative rings, their tensor product \(A\otimes B\) might be incomprehensible. For example \(\mathbb {C}\otimes \mathbb {C}\) is completely incomprehensible (note that we are tensoring over the integers). It is not like the product of groups or the disjoint union of two sets, where you have a completely explicit unambiguous formula for each element.

In this sense, the theory of tensor products is a bit like the theory of continuous functions. Humanity started off studying concrete polynomials such as \(x^2+1\) and then moved on to concrete analytic functions such as \(\log (x)\) and \(\sin (x)\), but eventually the abstract concept of a continuous function from the reals to the reals was born. There is no “formula” for a general continuous function, and continuous functions such as \(e^{-1/x^2}\) or \(|x|\) have no power series. Even if there were a formula for a specific continuous function of interest, it is not clear in general how to make sense of the claim that it’s the “best” formula. In other words, there is no "canonical form" for a general continuous function, and yet we prove things about them anyway. We shall adopt the same attitude for elements of \(A\otimes B\).

The first thing to know about the tensor product \(A\otimes B\) of two abelian groups \(A\) and \(B\) is a “constructor” for the type. In other words, how can we make elements of \(A\otimes B\)? Well, it turns out that given elements \(a\in A\) and \(b\in B\), we can form the element \(a\otimes _t b\in A\otimes B\).

Example 5.11

Recall that the sum of all the factorials is an element \(e\in \widehat{\mathbb {Z}}\), and \(22/7\) is certainly a rational number, so we can make the element \(\frac{22}{7}\otimes _te\in \widehat{\mathbb {Q}}\).

This example is in the Lean code.

Elements of the form \(a\otimes _t b\in A\otimes B\) are known as pure tensors. In the literature, pure tensors are often written \(a\otimes b\), but we shall follow mathlib’s convention in reserving the \(\otimes \) symbol for groups like \(A \otimes B\), and adorning it with a \(t\) when using it on elements of the groups (or, as Lean calls them, terms, which explains the notation).

Addition of pure tensors obeys the “distributivity” rules \(a\otimes _t b_1+a\otimes _t b_2=a\otimes _t(b_1+b_2)\) and \(a_1\otimes _t b+a_2\otimes _t b=(a_1+a_2)\otimes _t b\), but there is no rule which simplifies a general sum \(a\otimes _t b + c\otimes _t d\) into a pure tensor. Indeed, in general it is not the case that every element of a tensor product \(A\otimes B\) is of the form \(a\otimes _t b\); there can be tensors which aren’t pure. However every element of \(A\otimes B\) is a finite sum of pure tensors, with the result that one can attempt to define additive maps from \(A\otimes B\) by saying what they do on pure tensors, and then extending linearly.

Another thing worth understanding is that just like how rational numbers can be written as quotients of integers in several ways (for example \(1/2=2/4=3/6=\cdots \)), a general pure tensor in \(A\otimes B\) can be represented as \(a\otimes _t b\) in many ways. For example, in \(\widehat{\mathbb {Q}}\) we have \(1\otimes _t 2=2\otimes _t 1\). A general rule for equality of pure tensors is that if \(a\in A\) and \(b\in B\) and \(z\in \mathbb {Z}\), then \(za\otimes _tb=a\otimes _tzb\); integers can move over the tensor symbol. But equality is hard: in general there may not be an algorithm to decide whether two pure tensors \(a\otimes _t b\) and \(c\otimes _t d\) are equal in \(A\otimes B\).

Remark 5.12

A summary of the situation: if \(A\) and \(B\) are abelian groups, then every element of \(A\otimes B\) can be written in the form \(\sum _{i=1}^Na_i\otimes _tb_i\). It’s just that this representation is highly nonunique, and furthermore given explicit elements \(a_1,a_2\in A\) and \(b_1,b_2\in B\) it might be a hard problem to figure out if \(a_1\otimes _t b_1=a_2\otimes _t b_2\).

For example, it turns out that \((\mathbb {Z}/2\mathbb {Z})\otimes (\mathbb {Z}/3\mathbb {Z})=0\) and so in this tensor product all the \(a\otimes _t b\) are equal to each other and to \(0\otimes _t 0\).
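
To see why this tensor product vanishes, one can write \(1=3-2\) and move integers over the tensor symbol. For \(a\in \mathbb {Z}/2\mathbb {Z}\) and \(b\in \mathbb {Z}/3\mathbb {Z}\) we have \(3b=0\) and \(2a=0\), so

\[ a\otimes _t b=3(a\otimes _t b)-2(a\otimes _t b)=a\otimes _t 3b-2a\otimes _t b=a\otimes _t 0-0\otimes _t b=0. \]

The last step uses that \(a\otimes _t 0=0=0\otimes _t b\), which follows from the distributivity rules. Since every element is a finite sum of pure tensors, the whole group is zero.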

Having said all of that, one nice property of \(\widehat{\mathbb {Q}}\) is that every tensor is pure; let’s prove this now.

Lemma 5.13

Every element of \(\widehat{\mathbb {Q}}:=\mathbb {Q}\otimes \widehat{\mathbb {Z}}\) can be written as \(q\otimes _t z\) with \(q\in \mathbb {Q}\) and \(z\in \widehat{\mathbb {Z}}\). Furthermore one can even assume that \(q=\frac{1}{N}\) for some positive integer \(N\).

Proof

A proof I would write on the board would look like the following. Take a general element of \(\widehat{\mathbb {Q}}\); we know it can be expressed as a finite sum \(\sum _i q_i\otimes _t z_i\) with \(q_i\in \mathbb {Q}\) and \(z_i\in \widehat{\mathbb {Z}}\). Now choose a large positive integer \(N\), the lowest common multiple of all the denominators showing up in the \(q_i\), and then rewrite \(\sum _i q_i\otimes _t z_i\) as \(\sum _i \frac{n_i}{N}\otimes _t z_i\) with \(n_i\in \mathbb {Z}\). Now using the fundamental fact that \(na\otimes _t b=a\otimes _t nb\) for \(n\in \mathbb {Z}\), we can rewrite the sum as \(\sum _i \frac{1}{N}\otimes _t n_i z_i\) which is equal to the pure tensor \(\frac{1}{N}\otimes _t(\sum _i n_i z_i)\).

In Lean I would prove this using TensorProduct.induction_on, which quickly reduces us to the claim that the sum of two pure tensors is pure, which we can prove using the above technique whilst avoiding the general theory of finite sums.
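
The board proof is really an algorithm, and it can be sketched in Python. This is an illustration under stated assumptions rather than anything from the Lean code: an element of \(\widehat{\mathbb {Z}}\) is modelled as a function from a positive integer \(m\) to a residue in `range(m)`, a finite sum of pure tensors is a list of pairs, and the names `from_int` and `combine` are hypothetical.

```python
from fractions import Fraction
from math import lcm

def from_int(k):
    """The image of the integer k in Z-hat: m -> k mod m."""
    return lambda m: k % m

def combine(terms):
    """Rewrite the sum of the pure tensors q_i (x) z_i as (1/N) (x) z; return (N, z).

    terms is a list of pairs (q_i, z_i) with q_i a Fraction and
    z_i a function m -> residue mod m.
    """
    N = lcm(*(q.denominator for q, _ in terms))   # common denominator
    coeffs = [int(q * N) for q, _ in terms]       # q_i = n_i / N with n_i an integer
    def z(m):
        # the element sum_i n_i * z_i, computed at level m
        return sum(n * zi(m) for n, (_, zi) in zip(coeffs, terms)) % m
    return N, z

N, z = combine([(Fraction(1, 2), from_int(3)), (Fraction(1, 3), from_int(5))])
```

In this toy case both \(z_i\) are honest integers, so the answer can be checked by hand: \(\frac{1}{2}\cdot 3+\frac{1}{3}\cdot 5=\frac{19}{6}\), and the sketch returns \(N=6\) with \(z\) the image of 19.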

Be careful though: even though every element of \(\widehat{\mathbb {Q}}\) can be written as \(q\otimes _t z\), this representation may not be unique. For example \(2\otimes _t 1=1\otimes _t 2\). However, writing \(\frac{1}{N}\otimes _t z\) as \(z/N\) does tempt us into the following definition.

Definition 5.14

If \(N\in \mathbb {N}^+\) and \(z\in \widehat{\mathbb {Z}}\) then we say that \(N\) and \(z\) are coprime if \(z_N\in (\mathbb {Z}/N\mathbb {Z})^\times \). We write \(z/N\) as notation for the element \(\frac{1}{N}\otimes _tz\).

Lemma 5.15

Every element of \(\widehat{\mathbb {Q}}\) can be uniquely written as \(z/N\) with \(z\in \widehat{\mathbb {Z}}\), \(N\in \mathbb {N}^+\), and with \(N\) and \(z\) coprime.

Proof

Existence: by the previous lemma, an arbitrary element can be written as \(z/N\); let \(D\) be the greatest common divisor of \(N\) and \(z_N\) (lifted to a natural). If \(D=1\) then the fraction is by definition in lowest terms. However if \(1{\lt}D\mid N\) then \(z_D\) is the reduction of \(z_N\) and is hence 0. By Lemma 5.9 we deduce that \(z=Dy\) is a multiple of \(D\), and hence \(z/N=\frac{1}{N}\otimes _tDy=\frac{1}{E}\otimes _ty\), where \(E=N/D\). Now if a natural greater than 1 divided both \(y_E\) and \(E\) then this natural would divide both \(z_N/D\) and \(N/D\), contradicting the fact that \(D\) is the greatest common divisor.

Uniqueness: if \(z/N=w/M\), we deduce \(1\otimes _t Mz=1\otimes _t Nw\), and by injectivity of \(\widehat{\mathbb {Z}}\to \widehat{\mathbb {Q}}\) we deduce that \(Mz=Nw=y\). In particular, if \(L\) is the lowest common multiple of \(M\) and \(N\) then \(y_L\) is a multiple of both \(M\) and \(N\) and is hence zero, so \(y=Lx\) is a multiple of \(L\) by Lemma 5.9, and we deduce from torsionfreeness that \(z=(L/M)x\) and \(w=(L/N)x\). If some prime divided \(L/M\) then it would have to divide \(N\), which means that \(z/N\) is not in lowest terms; similarly if some prime divided \(L/N\) then \(w/M\) would not be in lowest terms. We deduce that \(L=M=N\) and hence \(z=w\) by torsionfreeness.
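
The existence half of the proof is again an algorithm. The following Python sketch illustrates it under the same modelling assumption as before (an element of \(\widehat{\mathbb {Z}}\) is a function from a positive integer \(m\) to a residue in `range(m)`); the names `e` and `lowest_terms` are hypothetical.

```python
from math import gcd

def e(m):
    """The sum-of-factorials element: 0! + ... + (m-1)! modulo m."""
    total, term = 0, 1
    for k in range(m):
        total = (total + term) % m
        term = (term * (k + 1)) % m
    return total

def lowest_terms(z, N):
    """Return (y, E) with z/N = y/E and with E and y coprime."""
    D = gcd(N, z(N))                 # z(N) is the lift of z_N to range(N)
    E = N // D
    # z_D = 0, so z = D*y with y_j = z_{Dj} / D (the construction of Lemma 5.9)
    y = lambda j: (z(D * j) // D) % j
    return y, E

z = lambda m: (12 * e(m)) % m        # the element 12*e
y, E = lowest_terms(z, 18)           # reduce the fraction 12e/18
```

Here \(z_{18}=12\), so \(D=6\), and the sketch returns \(E=3\) together with \(y=2e\); since \(y_3=2\) is a unit mod 3, the fraction \(2e/3\) is in lowest terms.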

If \(A\) and \(B\) are additive abelian groups then \(A\otimes B\) is also an additive abelian group. However if \(A\) and \(B\) are commutative rings, then \(A\otimes B\) also inherits the structure of a commutative ring, with \(0=0\otimes _t 0\) and \(1=1\otimes _t 1\). Pure tensors multiply in the obvious way: the product of \(a_1\otimes _t b_1\) and \(a_2\otimes _t b_2\) is \(a_1a_2\otimes _t b_1b_2.\) There are ring homomorphisms \(A\to A\otimes B\) and \(B\to A\otimes B\) sending \(a\) to \(a\otimes _t 1\) and \(b\) to \(1\otimes _t b\). In general such maps are not injective, but in the case of \(\widehat{\mathbb {Q}}=\mathbb {Q}\otimes \widehat{\mathbb {Z}}\) both maps from \(\mathbb {Q}\) and \(\widehat{\mathbb {Z}}\) are inclusions.

Lemma 5.16

The ring homomorphism \(\mathbb {Q}\to \widehat{\mathbb {Q}}\) sending \(q\) to \(q\otimes _t 1\) is injective.

Proof

We have seen that the map from \(\mathbb {Z}\) to \(\widehat{\mathbb {Z}}\) is injective. Now \(\mathbb {Q}\) is a flat \(\mathbb {Z}\)-module, because it’s torsion-free, so tensoring up we deduce that the map from \(\mathbb {Q}=\mathbb {Q}\otimes \mathbb {Z}\) to \(\widehat{\mathbb {Q}}=\mathbb {Q}\otimes \widehat{\mathbb {Z}}\) is also injective. There is no doubt a more elementary proof of this fact.

Lemma 5.17

The ring homomorphism \(\widehat{\mathbb {Z}}\to \widehat{\mathbb {Q}}\) sending \(z\) to \(1\otimes _t z\) is injective.

Proof

The map from \(\mathbb {Z}\) to \(\mathbb {Q}\) is injective, and we have seen that \(\widehat{\mathbb {Z}}\) is a torsion-free and thus flat \(\mathbb {Z}\)-module, so the map from \(\widehat{\mathbb {Z}}\) to \(\widehat{\mathbb {Q}}\) is also injective.

We can thus identify \(\mathbb {Q}=\mathbb {Q}\otimes \mathbb {Z}\) and \(\widehat{\mathbb {Z}}=\mathbb {Z}\otimes \widehat{\mathbb {Z}}\) with subrings of \(\widehat{\mathbb {Q}}=\mathbb {Q}\otimes \widehat{\mathbb {Z}}\). Note that, being commutative rings, \(\mathbb {Q}\) and \(\widehat{\mathbb {Z}}\) both contain a copy of \(\mathbb {Z}\) as a subring, and the corresponding copies of \(\mathbb {Z}\) in \(\widehat{\mathbb {Q}}\) are equal; this is because \(1\otimes _t a=a\otimes _t 1\) for all \(a\in \mathbb {Z}\).

5.7 Additive structure of \(\widehat{\mathbb {Q}}\).

Here we forget the ring structure on everything, and analyse \(\widehat{\mathbb {Q}}\) as an additive abelian group, and in particular how the subgroups \(\mathbb {Z}\), \(\mathbb {Q}\) and \(\widehat{\mathbb {Z}}\) sit within it.

The two results we prove in this section are that \(\mathbb {Q}\cap \widehat{\mathbb {Z}}=\mathbb {Z}\) and that \(\mathbb {Q}+\widehat{\mathbb {Z}}=\widehat{\mathbb {Q}}\). Using lattice-theoretic notation we could write these results as \(\mathbb {Q}\sqcap \widehat{\mathbb {Z}}=\mathbb {Z}\) and \(\mathbb {Q}\sqcup \widehat{\mathbb {Z}}=\widehat{\mathbb {Q}}\).

Lemma 5.18

The intersection of \(\mathbb {Q}\) and \(\widehat{\mathbb {Z}}\) in \(\widehat{\mathbb {Q}}\) is \(\mathbb {Z}\).

Proof

Clearly \(\mathbb {Z}\subseteq \mathbb {Q}\cap \widehat{\mathbb {Z}}\). Now suppose that \(x\in \mathbb {Q}\cap \widehat{\mathbb {Z}}\). Because \(x\) is rational we can write it as \(\frac{A}{B}\otimes _t1\) for some fraction \(A/B\) in lowest terms, and hence \(x=A/B\) where now we regard \(A\in \widehat{\mathbb {Z}}\) and note that \(A/B\) is still in lowest terms. However \(x\in \widehat{\mathbb {Z}}\) implies that \(x=x/1\) is in lowest terms, so we deduce that \(B=1\) and thus \(x=A\in \mathbb {Z}\).

Lemma 5.19

The sum of \(\mathbb {Q}\) and \(\widehat{\mathbb {Z}}\) in \(\widehat{\mathbb {Q}}\) is \(\widehat{\mathbb {Q}}\). More precisely, every element of \(\widehat{\mathbb {Q}}\) can be written as \(q+z\) with \(q\in \mathbb {Q}\) and \(z\in \widehat{\mathbb {Z}}\), or, in tensor notation, as \(q\otimes _t 1+1\otimes _t z\).

Proof

Write \(x\in \widehat{\mathbb {Q}}\) as \(x=z/N\) in lowest terms. Lift \(z_N\) to an integer \(t\) and observe that \((z-t)_N=0\), hence \(z-t=Ny\) for some \(y\in \widehat{\mathbb {Z}}\). Now \(x=t/N+y\in \mathbb {Q}+\widehat{\mathbb {Z}}\).
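
This decomposition can also be computed. The Python sketch below is an illustration only, again assuming that an element of \(\widehat{\mathbb {Z}}\) is modelled as a function from a positive integer \(m\) to a residue in `range(m)`; the names `e` and `split` are made up for this example.

```python
from fractions import Fraction

def e(m):
    """The sum-of-factorials element: 0! + ... + (m-1)! modulo m."""
    total, term = 0, 1
    for k in range(m):
        total = (total + term) % m
        term = (term * (k + 1)) % m
    return total

def split(z, N):
    """Return (q, y) with z/N = q + y, where q is rational and y is in Z-hat."""
    t = z(N)                                   # integer lift of z_N
    q = Fraction(t, N)
    # (z - t)_N = 0, so z - t = N*y with y_j = (z - t)_{Nj} / N
    y = lambda j: (((z(N * j) - t) % (N * j)) // N) % j
    return q, y

q, y = split(e, 10)                            # decompose e/10
```

Since \(e_{10}=4\), the rational part is \(4/10=2/5\), and one can check level by level that \(10y+4=e\).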

5.8 Multiplicative structure of the units of \(\widehat{\mathbb {Q}}\).

We now forget the additive structure on the commutative ring \(\widehat{\mathbb {Q}}\) and consider the multiplicative structure of its group of units \(\widehat{\mathbb {Q}}^\times \) (which I couldn’t get into the section title). We have the obvious subgroups \(\mathbb {Q}^\times \), \(\mathbb {Z}^\times \) and \(\widehat{\mathbb {Z}}^\times \).

Lemma 5.20

The intersection of \(\mathbb {Q}^\times \) and \(\widehat{\mathbb {Z}}^\times \) in \(\widehat{\mathbb {Q}}^\times \) is \(\mathbb {Z}^\times \).

Proof

Clearly the intersection is contained within \(\mathbb {Q}\cap \widehat{\mathbb {Z}}=\mathbb {Z}\). If \(n\in \mathbb {Z}\) is in \(\widehat{\mathbb {Z}}^\times \) then \(n\not=0\) and its inverse \(1/n=\pm 1/|n|\) is in lowest terms but also in \(\widehat{\mathbb {Z}}\), and hence \(|n|=1\) by uniqueness of lowest term representation.

Lemma 5.21

The product of \(\mathbb {Q}^\times \) and \(\widehat{\mathbb {Z}}^\times \) in \(\widehat{\mathbb {Q}}^\times \) is all of \(\widehat{\mathbb {Q}}^\times \). More precisely, every element of \(\widehat{\mathbb {Q}}^\times \) can be written as \(qz\) with \(q\in \mathbb {Q}^\times \) and \(z\in \widehat{\mathbb {Z}}^\times \).

Note that by the previous lemma, this representation will be unique up to sign.

Proof

We already know that a general element of \(\widehat{\mathbb {Q}}^\times \) can be written as \(x/N\) with \(N\) positive, so this reduces us to proving that a general element \(x\in \widehat{\mathbb {Z}}\) which is invertible in \(\widehat{\mathbb {Q}}\) can be written as \(qz\) with \(q\in \mathbb {Q}^\times \) and \(z\in \widehat{\mathbb {Z}}^\times \).

We know \(1/x\) can be written in lowest terms as \(y/M\), and multiplying up we deduce that \(xy=M\), and hence \(x\) divides a positive integer. If \(i:\mathbb {Z}\to \widehat{\mathbb {Z}}\) denotes the inclusion, then we’ve just seen that the preimage of the principal ideal \((x)\), namely, \(J:=i^{-1}(x\widehat{\mathbb {Z}})\) is nonzero, as it contains \(M\). Let \(g\in J\) be the smallest positive integer; it’s well-known that \(J=(g)\).

I claim that it suffices to show that \(x\widehat{\mathbb {Z}}=g\widehat{\mathbb {Z}}\). Indeed, knowing \(g=yx\) and \(x=gz\) for some \(y,z\in \widehat{\mathbb {Z}}\) tells us that \(g(1-yz)=0\), and we know that multiplication by \(g\) is injective, hence \(yz=1\), so \(z\) is a unit and we have written \(x=gz\) with \(g\in \mathbb {Q}^\times \) and \(z\in \widehat{\mathbb {Z}}^\times \).

It remains to prove the claim. By definition \(g\in J\subseteq x\widehat{\mathbb {Z}}\) so this is one inclusion. For the other, by Lemma 5.9 it suffices to prove that \(x_g=0\). If not, let \(t\) be the lift of \(x_g\) to a natural with \(0{\lt}t{\lt}g\). Then I claim that \(t\in J\), for \(t-x\) is a multiple of \(g\) (by Lemma 5.9 again) and hence of \(x\), so \(t\) is a multiple of \(x\); this contradicts minimality of \(g\).

We are nearly ready to embark upon the multiplicative adelic theory for quaternion algebras. However before we do this, we need to develop the theory of the Hurwitz quaternions a bit more formally.

5.9 The Hurwitz quaternions

Definition 5.22

The Hurwitz quaternions are the set \(\mathcal{O}:= \mathbb {Z}\oplus \mathbb {Z}\omega \oplus \mathbb {Z}i\oplus \mathbb {Z}i\omega \) (as an abstract abelian group or as a subgroup of the usual quaternions). Here \(\omega =\frac{-1+(i+j+k)}{2}\) and note that \((i+j+k)^2=-3\). We have \(\overline{\omega }=\omega ^2=-(\omega +1)\). A general quaternion \(a+bi+cj+dk\) is a Hurwitz quaternion if either \(a,b,c,d\in \mathbb {Z}\) or \(a,b,c,d\in \mathbb {Z}+\frac{1}{2}\).

Lemma 5.23

The Hurwitz quaternions form a ring.

Proof

Follow your nose.

This ring is isomorphic to \(\mathbb {Z}^4\) as an additive group, and \(\mathcal{O}\otimes _{\mathbb {Z}}\mathbb {R}=\mathbb {R}\oplus \mathbb {R}i\oplus \mathbb {R}j\oplus \mathbb {R}\omega \) is the usual Hamilton quaternions.

Definition 5.24

There’s a conjugation map (which we’ll call "star") from the Hurwitz quaternions to themselves, sending integers to themselves and purely imaginary elements like \(2\omega +1\) to minus themselves. It satisfies \((x^*)^*=x\), \((xy)^*=y^*x^*\) and \((x+y)^*=x^*+y^*\). In particular, the Hurwitz quaternions are a "star ring" in the sense of mathlib.

Definition 5.25

The Hurwitz quaternions come equipped with an integer-valued norm, which is \(a^2+b^2+c^2+d^2\) on \(a+bi+cj+dk\) but needs to be modified a bit to deal with \(\omega \).

Lemma 5.26

We have \(N(x)=x\overline{x}\).

Proof

Easy calculation.

Lemma 5.27

The norm of \(0\) is \(0\).

Proof

A calculation.

Lemma 5.28

The norm of \(1\) is \(1\).

Proof

A calculation.

Lemma 5.29

The norm of a product is the product of the norms.

Proof

A calculation.

Lemma 5.30

The norm of an element is nonnegative.

Proof

It’s a sum of rational squares.

Lemma 5.31

The norm of an element is zero if and only if the element is zero.

Proof

It’s a sum of rational squares.

Lemma 5.32

Given a “usual” quaternion \(a=x+yi+zj+wk\) with \(x,y,z,w\in \mathbb {R}\), there exists a Hurwitz quaternion \(q\) such that \(N(a-q){\lt}1\).

Proof

If \([r]\) denotes the nearest integer to the real number \(r\), then \(|r-[r]|\leq \frac{1}{2}\). Hence if \(q=[x]+[y]i+[z]j+[w]k\) then \(N(a-q)=|x-[x]|^2+\cdots \leq \frac{1}{4}+\frac{1}{4}+\frac{1}{4}+\frac{1}{4}=1\), with strict inequality unless \(|x-[x]|=|y-[y]|=|z-[z]|=|w-[w]|=\frac{1}{2}\), in which case \(a\in \mathcal{O}\) because \(a-\omega \) has integer coordinates, and we can take \(q=a\) instead.

Lemma 5.33

Given two Hurwitz quaternions \(a\) and \(b\) with \(b\) nonzero, there exist Hurwitz quaternions \(q\) and \(r\) such that \(a=qb+r\) and \(N(r){\lt}N(b)\).

Proof

Let \(q\) be the Hurwitz quaternion obtained by applying Lemma 5.32 to \(a/b := ab^{-1}\); then \(N(a/b-q){\lt}1\). Now set \(r:=a-qb=(a/b-q)b\); multiplicativity of the norm gives \(N(r)=N(a/b-q)N(b){\lt}N(b)\).
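
Lemmas 5.32 and 5.33 together give a division algorithm, which can be sketched in Python. This is an illustration and not the Lean implementation: it assumes quaternions are stored as tuples `(a, b, c, d)` of exact rationals in the basis \(1,i,j,k\), and all function names are hypothetical.

```python
from fractions import Fraction
from math import floor

def qmul(p, q):
    """Multiply quaternions stored as (a, b, c, d), meaning a + b*i + c*j + d*k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def norm(p):
    """The norm a^2 + b^2 + c^2 + d^2."""
    return sum(x * x for x in p)

def is_hurwitz(p):
    """All coordinates integers, or all coordinates in 1/2 + Z."""
    doubled = [2 * Fraction(x) for x in p]
    if any(x.denominator != 1 for x in doubled):
        return False
    return len({int(x) % 2 for x in doubled}) == 1

def nearest_hurwitz(a):
    """A Hurwitz quaternion q with norm(a - q) < 1, as in Lemma 5.32."""
    q = tuple(Fraction(floor(x + Fraction(1, 2))) for x in a)  # round each coordinate
    if norm(tuple(x - y for x, y in zip(a, q))) < 1:
        return q
    # otherwise every coordinate is at distance 1/2, so a itself is Hurwitz
    return tuple(Fraction(x) for x in a)

def divmod_hurwitz(a, b):
    """Return (q, r) with a = q*b + r and norm(r) < norm(b), for b nonzero."""
    nb = norm(b)
    a0, b0, c0, d0 = b
    b_inv = tuple(Fraction(x) / nb for x in (a0, -b0, -c0, -d0))  # conjugate / norm
    q = nearest_hurwitz(qmul(a, b_inv))
    r = tuple(x - y for x, y in zip(a, qmul(q, b)))
    return q, r

q, r = divmod_hurwitz((7, 3, -2, 5), (1, 2, 0, -1))
```

The edge case of Lemma 5.32 shows up for \(a=1+i+j+k\) and \(b=2\): rounding \(a/b\) coordinatewise gives distance exactly 1, so the sketch falls back to \(q=a/b\), which is Hurwitz, and the remainder is zero.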

Corollary 5.34

All left ideals of \(\mathcal{O}\) are principal.

Proof

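If the ideal is 0, use 0. Otherwise, choose a nonzero element \(b\) of smallest norm in the ideal. If \(a\) is any element of the ideal, write \(a=qb+r\) as in Lemma 5.33; then \(r=a-qb\) is in the ideal and \(N(r){\lt}N(b)\), so \(r=0\) and \(a=qb\).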

Remark 5.35

All right ideals are principal too, because there’s another version of Euclid saying \(a=bq+r\).

5.10 Profinite completion of the Hurwitz quaternions

We define \(\widehat{\mathcal{O}}\) to be \(\mathcal{O}\otimes \widehat{\mathbb {Z}}\), so it’s elements \(a+bi+cj+d\omega \) with \(a,b,c,d\in \widehat{\mathbb {Z}}\). The basic thing we need is this:

Theorem 5.36

If \(N\) is a positive natural then the obvious map \(\mathcal{O}\to \widehat{\mathcal{O}}/N\widehat{\mathcal{O}}\) is surjective.

Proof

This is just four copies of the surjection \(\mathbb {Z}\to \widehat{\mathbb {Z}}/N\widehat{\mathbb {Z}}\). Note that this latter map is surjective because \(\mathbb {Z}\to \mathbb {Z}/N\mathbb {Z}\) is surjective: given \(z\in \widehat{\mathbb {Z}}\) you can subtract an integer \(w\) such that the component \((z-w)_N\) of \(z-w\) in \(\mathbb {Z}/N\mathbb {Z}\) is \(0\), and then \(z-w\) is a multiple of \(N\).
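
To see the one-dimensional argument in action, here is a small Python sketch. It assumes we model an element of \(\widehat{\mathbb {Z}}\) as a compatible family of residues \(z_n\in \mathbb {Z}/n\mathbb {Z}\), and it uses the purely hypothetical example element \(z=1!+2!+3!+\cdots \), which makes sense in \(\widehat{\mathbb {Z}}\) because \(k!\) is zero modulo \(n\) as soon as \(k\geq n\). The names `z` and `split_mod` are ours.

```python
from math import factorial

def z(n):
    """Component in Z/nZ of z = 1! + 2! + 3! + ... (terms with k >= n vanish mod n)."""
    return sum(factorial(k) for k in range(1, n)) % n

def split_mod(N):
    """Return (w, t): an integer w and a family n -> t_n with z = w + N*t."""
    w = z(N)  # an integer lifting the component z_N

    def t(n):
        # z - w has component 0 in Z/NZ, so its component in Z/(nN)Z
        # is represented by a multiple of N, which we can divide by N.
        c = (z(n * N) - w) % (n * N)
        assert c % N == 0
        return (c // N) % n

    return w, t

N = 6
w, t = split_mod(N)
# Check z = w + N*t component by component, for the first few moduli.
decomposition_ok = all((w + N * t(n)) % n == z(n) for n in range(1, 40))
```

The integer `w` is the image of an element of \(\mathbb {Z}\), and the family `t` plays the role of the element of \(\widehat{\mathbb {Z}}\) witnessing that \(z-w\) is a multiple of \(N\).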

We define \(D:=\mathbb {Q}\otimes \mathcal{O}=\mathbb {Q}\oplus \mathbb {Q}i\oplus \mathbb {Q}j\oplus \mathbb {Q}\omega =\mathbb {Q}\oplus \mathbb {Q}i\oplus \mathbb {Q}j\oplus \mathbb {Q}k\). Finally, we define \(\widehat{D}:=D\otimes \widehat{\mathbb {Z}}\). Just as with \(\widehat{\mathbb {Q}}\) we have

Lemma 5.37

Every element of \(\widehat{D}\) can be written as \(z/N\) with \(z\in \widehat{\mathcal{O}}\) and \(N\in \mathbb {N}^+\).

Proof

Same as the proof for \(\widehat{\mathbb {Q}}\).

It is not hard to check that \(\widehat{D}\) contains \(\widehat{\mathcal{O}}\) and \(D\) as subrings, and that as additive abelian groups we have \(\widehat{\mathcal{O}}\cap D=\mathcal{O}\) and \(\widehat{\mathcal{O}}+D=\widehat{D}\). This is because \(\mathcal{O}\) is just four copies of \(\mathbb {Z}\) and we’ve proved the analogous result for \(\mathbb {Z}\).

However, the multiplicative structure is more interesting, especially as \(D\) is not commutative. For a general quaternion algebra it is not true that \((\widehat{D})^\times =D^\times (\widehat{\mathcal{O}})^\times \), because there are “class group obstructions”: the double coset space \(D^\times \backslash \widehat{D}^\times /\widehat{\mathcal{O}}^\times \) is some kind of non-commutative analogue of a class group. However, for our particular choice of \(D\) and \(\mathcal{O}\) the result is true.

Theorem 5.38

The group of units of \(\widehat{D}\) is \(D^\times \widehat{\mathcal{O}}^\times \). More precisely, every element of \(\widehat{D}^\times \) can be written as a product \(\delta u\) with \(\delta \in D^\times \) and \(u\in \widehat{\mathcal{O}}^\times \).

Proof

Given an element \(x\) of \(\widehat{D}^\times \), we can use Lemma 5.37 to write it as \(z/N\) with \(N\) a positive integer and \(z\in \widehat{\mathcal{O}}\). Note that \(N\) is central and in \(D^\times \). Similarly, we can write \(x^{-1}\) as \(y/M\) with \(M\) a positive integer and \(y\in \widehat{\mathcal{O}}\). Then \(1=xx^{-1}=zy/NM\) and so \(zy=NM=MN\), and \(1=x^{-1}x=yz/MN\) so \(yz=MN\) too. In particular \(y\) both left and right divides a positive integer.

Now consider the left ideal \(\widehat{\mathcal{O}}y\) generated by \(y\). We’ve just seen that this ideal has nontrivial intersection with \(\mathcal{O}\), because it contains \(MN=zy{\gt}0\). Hence its intersection with \(\mathcal{O}\) is a nonzero left ideal of \(\mathcal{O}\), which is hence principal by Corollary 5.34. Write it as \(\mathcal{O}\alpha \) with \(0\not=\alpha \in \mathcal{O}\).

It suffices to show that \(\widehat{\mathcal{O}}\alpha =\widehat{\mathcal{O}}y\). Indeed, this would imply that \(u\alpha =y\) and \(vy=\alpha \) for some \(u,v\in \widehat{\mathcal{O}}\), and thus \((vu-1)\alpha =0\) and \((uv-1)y=0\). Both \(\alpha \) and \(y\) are left divisors of positive integers (the norm \(\alpha \overline{\alpha }\) of \(\alpha \), and \(yz=MN\), respectively), so \(vu-1\) and \(uv-1\) are each killed by a positive integer. Now \(\widehat{\mathcal{O}}\) is \(\mathbb {Z}\)-torsion-free (a tensor product of torsion-free abelian groups is torsion-free, because torsion-free abelian groups are flat, and that would be a cheap way of doing it; otherwise use \(\mathcal{O}=\mathbb {Z}^4\)), so we deduce that \(vu=uv=1\), that is, \(u\) and \(v\) are units. Thus \(x^{-1}=\frac{1}{M}u\alpha \), so \(x=(M\alpha ^{-1})v\in D^\times \widehat{\mathcal{O}}^\times \).

What remains is this. We have \(y\in \widehat{\mathcal{O}}\) which left and right divides some positive integer. We’ve defined \(0\not=\alpha \in \mathcal{O}\) such that \(\mathcal{O}\alpha \) is the pullback of the abelian group \(\widehat{\mathcal{O}}y\) along the map \(\mathcal{O}\to \widehat{\mathcal{O}}\). We need to show that when we push this ideal \(\mathcal{O}\alpha \) forwards to \(\widehat{\mathcal{O}}\) we get \(\widehat{\mathcal{O}}y\) again. The fact that \(\widehat{\mathcal{O}}\alpha \subseteq \widehat{\mathcal{O}}y\) is easy, because \(\alpha \in \widehat{\mathcal{O}}y\) by definition. So it remains to show that \(y\in \widehat{\mathcal{O}}\alpha \).

Let’s define \(T\) to be a positive integer which is both a left and right multiple of both \(y\) and \(\alpha \) (for example \(T=MN\alpha \overline{\alpha }\) will do). Now note that we have an isomorphism \(\mathcal{O}/T\mathcal{O}=\widehat{\mathcal{O}}/T\widehat{\mathcal{O}}\) (the surjectivity is Theorem 5.36), so we can choose some \(\beta \in \mathcal{O}\) such that \(\beta -y\in T\widehat{\mathcal{O}}=\widehat{\mathcal{O}}T\). Since \(T\in \widehat{\mathcal{O}}y\) we have \(\widehat{\mathcal{O}}T\subseteq \widehat{\mathcal{O}}y\), so \(\beta \in y+\widehat{\mathcal{O}}T\subseteq \widehat{\mathcal{O}}y\). Hence \(\beta \in \widehat{\mathcal{O}}y\cap \mathcal{O}=\mathcal{O}\alpha \), meaning \(\beta =\gamma \alpha \) for some \(\gamma \in \mathcal{O}\). Finally \(T\in \widehat{\mathcal{O}}\alpha \) as well, so \(\widehat{\mathcal{O}}T\subseteq \widehat{\mathcal{O}}\alpha \) and hence \(y\in \beta +\widehat{\mathcal{O}}T\subseteq \widehat{\mathcal{O}}\alpha \).