A Blueprint for Fermat’s Last Theorem

2 First reductions of the problem

2.1 Overview

The proof of Fermat’s Last Theorem is by contradiction. We assume that we have a counterexample \(a^n+b^n=c^n\), and manipulate it until it satisfies the axioms of a “Frey package”. From the Frey package we build a Frey curve – an elliptic curve defined over the rationals. We then look at a certain representation of a Galois group coming from this elliptic curve, and finally using two very deep and independent theorems (one due to Mazur, the other due to Wiles) we show that this representation is neither reducible or irreducible, a contradiction.

2.2 Reduction to \(n\geq 5\) and prime

Lemma 2.1
#

If there is a counterexample to Fermat’s Last Theorem, then there is a counterexample \(a^p+b^p=c^p\) with \(p\) an odd prime.

Proof

Note: this proof is in mathlib already; we run through it for completeness’ sake.

Say \(a^n + b^n = c^n\) is a counterexample to Fermat’s Last Theorem. Every positive integer is either a power of 2 or has an odd prime factor. If \(n=kp\) has an odd prime factor \(p\) then \((a^k)^p+(b^k)^p=(c^k)^p\) is the counterexample we seek. It remains to deal with the case where \(n\) is a power of 2, so let’s assume this. We have \(3\leq n\) by assumption, so \(n=4k\) must be a multiple of 4, and thus \((a^k)^4=(b^k)^4=(c^k)^4\), giving us a counterexample to Fermat’s Last Theorem for \(n=4\). However an old result of Fermat himself (proved as fermatLastTheoremFour in mathlib) says that \(x^4+y^4=z^4\) has no nontrivial solutions.

Euler proved Fermat’s Last Theorem for \(p=3\); at the time of writing this is not in mathlib.

Lemma 2.2
# Discussion

There are no solutions in positive integers to \(a^3+b^3=c^3\).

Proof

A proof has been formalised in Lean in the FLT-regular project here. Another proof has been formalised in Lean in the FLT3 project here by a team from the Lean For the Curious Mathematician conference held in Luminy in March 2024 (its dependency graph can be visualised here). To get this node green, this latter proof needs to be upstreamed to mathlib. This is currently work in progress by the same team.

Corollary 2.3
#

If there is a counterexample to Fermat’s Last Theorem, then there is a counterexample \(a^p+b^p=c^p\) with \(p\) prime and \(p\geq 5\).

Proof

Follows from the previous two lemmas.

2.3 Frey packages

For convenience we make the following definition.

Definition 2.4
#

A Frey package \((a,b,c,p)\) is three pairwise-coprime nonzero integers \(a\), \(b\), \(c\), with \(a\equiv 3\pmod4\) and \(b\equiv 0\pmod2\), and a prime \(p\geq 5\), such that \(a^p+b^p=c^p\).

Our next reduction is as follows:

Lemma 2.5

If Fermat’s Last Theorem is false for \(p \ge 5\) and prime, then there exists a Frey package.

Proof

Suppose we have a counterexample \(a^p+b^p=c^p\) for the given \(p\); we now build a Frey package from this data.

If the greatest common divisor of \(a,b,c\) is \(d\) then \(a^p+b^p=c^p\) implies \((a/d)^p+(b/d)^p=(c/d)^p\). Dividing through, we can thus assume that no prime divides all of \(a,b,c\). Under this assumption we must have that \(a,b,c\) are pairwise coprime, as if some prime divides two of the integers \(a,b,c\) then by \(a^p+b^p=c^p\) and unique factorization it must divide all three of them. In particular we may assume that not all of \(a,b,c\) are even, and now reducing modulo 2 shows that precisely one of them must be even.

Next we show that we can find a counterexample with \(b\) even. If \(a\) is the even one then we can just switch \(a\) and \(b\). If \(c\) is the even one then we can replace \(c\) by \(-b\) and \(b\) by \(-c\) (using that \(p\) is odd).

The last thing to ensure is that \(a\) is 3 mod 4. Because \(b\) is even, we know that \(a\) is odd, so it is either 1 or 3 mod 4. If \(a\) is 3 mod 4 then we are home; if however \(a\) is 1 mod 4 we replace \(a,b,c\) by their negatives and this is the Frey package we seek.

2.4 Galois representations and elliptic curves

To continue, we need some of the theory of elliptic curves over \(\mathbb {Q}\). So let \(f(X)\) denote any monic cubic polynomial with rational coefficients and whose three complex roots are distinct, and let us consider the equation \(E:Y^2=f(X)\), which defines a curve in the \((X,Y)\) plane. This curve (or strictly speaking its projectivisation) is a so-called elliptic curve (or an elliptic curve over \(\mathbb {Q}\) if we want to keep track of the field where the coefficients of \(f(X)\) lie). More generally if \(k\) is any field then there is a concept of an elliptic curve over \(k\), again defined by a (slightly more general) plane cubic curve \(F(X,Y)=0\).

If \(E\) is an elliptic curve over a field \(k\), and if \(K\) is any field which is a \(k\)-algebra, then we write \(E(K)\) for the set of solutions to \(y^2=f(x)\) with \(x,y\in K\), together with an additional “point at infinity”. It is an extraordinary fact, and not at all obvious, that \(E(K)\) naturally has the structure of an additive abelian group, with the point at infinity being the zero element (the identity). Fortunately this fact is already in mathlib. We shall use \(+\) to denote the group law. This group structure has the property that three distinct points \(P,Q,R\in K^2\) which are in \(E(K)\) will sum to zero if and only if they are collinear.

The group structure behaves well under change of field.

Lemma 2.6
#

If \(E\) is an elliptic curve over a field \(k\), and if \(K\) and \(L\) are two fields which are \(k\)-algebras, and if \(f:K\to L\) is a \(k\)-algebra homomorphism, the map from \(E(K)\) to \(E(L)\) induced by \(f\) is an additive group homomorphism.

Proof

The equations defining the group law are ratios of polynomials with coefficients in \(k\), and such things behave well under \(k\)-algebra homomorphisms.

This construction is functorial (it sends the identity to the identity, and compositions to compositions).

Lemma 2.7
#

The group homomorphism \(E(K)\to E(K)\) induced by the identity map \(K\to K\) is the identity group homomorphism.

Proof

An easy calculation.

Lemma 2.8
#

If \(K\to L\to M\) are \(k\)-algebra homomorphisms then the group homomorphism \(E(K)\to E(M)\) induced by the map \(K\to M\) is the composite of the map \(E(K)\to E(L)\) induced by \(K\to L\) and the map \(E(L)\to E(M)\) induced by the map \(L\to M\).

Proof

Another easy calculation.

Thus if \(f:K\to L\) is an isomorphism of fields, the induced map \(E(K)\to E(L)\) is an isomorphism of groups, with the inverse isomorphism being the map \(E(L)\to E(K)\) induced by \(f^{-1}\). This construction thus gives us an action of the multiplicative group \(\operatorname{Aut}_k(K)\) of automorphisms of the field \(K\) on the additive abelian group \(E(K)\).

Definition 2.9
#

If \(E\) is an elliptic curve over a field \(k\) and \(K\) is a field and a \(k\)-algebra, then the group of \(k\)-automorphisms of \(K\) acts on the additive abelian group \(E(K)\).

In particular, if \(\overline{\mathbb {Q}}\) denotes an algebraic closure of the rationals (for example, the algebraic numbers in \(\mathbb {C}\)) and if \(\operatorname{Gal}(\overline{\mathbb {Q}}/\mathbb {Q})\) denotes the group of field isomorphisms \(\overline{\mathbb {Q}}\to \overline{\mathbb {Q}}\), then for any elliptic curve \(E\) over \(\mathbb {Q}\) we have an action of \(\operatorname{Gal}(\overline{\mathbb {Q}}/\mathbb {Q})\) on the additive abelian group \(E(\overline{\mathbb {Q}})\).

We need a variant of this construction where we only consider the \(n\)-torsion of the curve, for \(n\) a positive integer. Recall that if \(A\) is any additive abelian group, and if \(n\) is a positive integer, then we can consider the subgroup \(A[n]\) of elements \(a\) such that \(na=0\). If a group \(G\) acts on \(A\) via additive group isomorphisms, then there will be an induced action of \(G\) on \(A[n]\).

Definition 2.10

If \(E\) is an elliptic curve over a field \(k\) and \(K\) is a field and a \(k\)-algebra, and if \(n\) is a natural number, then the group of \(k\)-automorphisms of \(K\) acts on the additive abelian group \(E(K)[n]\) of \(n\)-torsion points on the curve.

If furthermore \(n=p\) is prime, then \(A[p]\) is naturally a vector space over the field \(\mathbb {Z}/p\mathbb {Z}\), and thus it inherits the structure of a mod \(p\) representation of \(G\). Applying this to the above situation, we deduce that if \(E\) is an elliptic curve over \(\mathbb {Q}\) then \(\operatorname{Gal}(\overline{\mathbb {Q}}/\mathbb {Q})\) acts on \(E(\overline{\mathbb {Q}})[p]\) and this is the mod \(p\) Galois representation attached to the curve \(E\).

In the next section we apply this theory to an elliptic curve coming from a counterexample to Fermat’s Last theorem.

2.5 The Frey curve

Definition 2.11 Frey
# Discussion

Given a Frey package \((a,b,c,p)\), the corresponding Frey curve (considered by Frey and, before him, Hellegouarch) is the elliptic curve \(E\) defined by the equation \(Y^2=X(X-a^p)(X+b^p).\)

Note that the roots of the cubic \(X(X-a^p)(X+b^p)\) are distinct because \(a,b,c\) are nonzero and \(a^p+b^p=c^p\).

Given a Frey package \((a,b,c,p)\) with corresponding Frey curve \(E\), the mod \(p\) Galois representation associated to this package is the representation of \(\operatorname{Gal}(\overline{\mathbb {Q}}/\mathbb {Q})\) on \(E(\overline{\mathbb {Q}})[p].\) Frey’s observation is that this mod \(p\) Galois representation has some very surprising properties. We will make this remark more explicit in the next chapter. Here we shall show how these properties can be used to finish the job.

2.6 Reduction to two big theorems.

Recall that a representation of a group \(G\) on a vector space \(W\) is said to be irreducible if there are precisely two \(G\)-stable subspaces of \(W\), namely \(0\) and \(W\). The representation is said to be reducible otherwise.

Now say Fermat’s Last Theorem is false, and hence by Lemma 2.5 a Frey package \((a,b,c,p)\) exists. Consider the mod \(p\) representation of \(\operatorname{Gal}(\overline{\mathbb {Q}}/\mathbb {Q})\) coming from the \(p\)-torsion in the Frey curve \(Y^2=X(x-a^p)(X+b^p)\) associated to the package. Let’s call this representation \(\rho \). Is it reducible or irreducible?

Theorem 2.12 Mazur
#

\(\rho \) cannot be reducible.

Proof

This follows from a profound result of Mazur [ 9 ] from 1979, namely the fact that the torsion subgroup of an elliptic curve over \(\mathbb {Q}\) can have size at most 16. In fact a fair amount of work still needs to be done to deduce the theorem from Mazur’s result. We will have more to say about this result later.

Theorem 2.13 Wiles,Taylor–Wiles, Ribet,…

\(\rho \) cannot be irreducible either.

Proof

This is the main content of Wiles’ magnum opus. We omit the argument for now, although later on in this project we will have a lot to say about a proof of this.

Corollary 2.14
#

There is no Frey package.

Proof

Follows immediately from the previous two theorems 2.12 and 2.13.

We deduce

Corollary 2.15
#

Fermat’s Last Theorem is true.

Proof

Assume there is a there is a counterexample \(a^n+b^n=c^n\). By Corollary 2.3 we may assume that there is also a counterexample \(a^p+b^p=c^p\) with \(p\geq 5\) and prime. Then there is a Frey package \((a,b,c,p)\) by 2.5, contradicting Corollary 2.14.

The structure of the rest of this (highly incomplete, for now) document is as follows. In Chapter 3 we develop some of the basic theory of elliptic curves and the Galois representations attached to their \(p\)-torsion subgroups. We then apply this theory to the Frey curve, deducing in particular how Mazur’s result on torsion subgroups of elliptic curves implies Theorem 2.12, the assertion that \(\rho \) cannot be reducible. In Chapter 4 we give a high-level overview of our strategy to prove that \(\rho \) cannot be irreducible, which diverges from the original approach taken by Wiles; one key difference is that we work with the \(p\)-torsion directly rather than switching to the 3-torsion. We also give a precise statement of the modularity lifting theorem which we will use. Finally, in Chapter 8 we give a collection of theorem statements which we shall need in order to push our strategy through. All of these results were known in the 1980s or before. This chapter is incoherent in the sense that it is just a big list of apparently unrelated results. As our exposition of the proof expands, the results of this chapter will slowly move to more appropriate places. The chapter is merely there to give some kind of idea of the magnitude of the project.