8 Miniproject: Frobenius elements
8.1 Status
This miniproject has been a success: the main results are sorry-free and merged into mathlib (see this PR). As a result there will be no more work on this miniproject in the FLT repo. Below is a fairly detailed sketch of the argument used.
8.2 Introduction and goal
When this project started, I had thought that the existence of Frobenius elements was specific to the theory of local and global fields, and a slightly more general result held for Dedekind domains. Then Joel Riou pointed out on the Lean Zulip an extremely general result from from Bourbaki’s Commutative Algebra (Chapter V, Section 2, number 2, theorem 2, part (ii)). This beautiful result is surely what we want to see in mathlib. Before we state Bourbaki’s theorem, let us set the scene.
8.3 Statement of the theorem
The set-up throughout this project: \(G\) is a finite group acting (via ring isomorphisms) on a commutative ring \(B\), and \(A\) is the subring of \(G\)-invariants.
Say \(Q\) is a prime ideal of \(B\), whose pullback to \(A\) is the prime ideal \(P\). Note that \(G\) naturally acts on the ideals of \(B\). Let’s define the decomposition group \(D_Q\) of \(Q\) to be the subgroup of \(G\) which stabilizes \(Q\) (just to be clear: \(g\in D_Q\) means \(\{ g\cdot q\, :\, q \in Q\} =Q\), not \(\forall q\in Q, g\cdot q=q\)).
Let \(L\) be the field of fractions of the integral domain \(B/Q\), and let \(K\) be the field of fractions of the subring \(A/P\). Then \(L\) is naturally a \(K\)-algebra. In this generality \(L/K\) may not even be finite or Galois, but we can still talk about \(\operatorname{Aut}(L/K)\).
In the next definition we write down a group homomorphism \(\phi \) from \(D_Q\) to \(\operatorname{Aut}(L/K)\).
Choose \(g\in D_Q\). Then the action of \(g\) on \(B\) gives us an induced \(A/P\)-algebra automorphism of \(B/Q\) which extends to a \(K\)-algebra automorphism \(\phi (g)\) of \(L\). This construction \(g\mapsto \phi (g)\) defines a group homomorphism from \(D_Q\) to \(\operatorname{Aut}(L/K)\) (all the proofs implicit in the definition here are straightforward).
The theorem we want in this mini-project is
The map \(g\mapsto \phi _g\) from \(D_Q\) to \(\operatorname{Aut}(L/K)\) defined above is surjective.
The goal of this mini-project is to get this theorem formalised and ideally into mathlib.
In particular, \(\operatorname{Aut}(L/K)\) is finite as a corollary. What is so striking about this theorem to me is that the only finiteness hypothesis is on the group \(G\) which acts; there are no finiteness or Noetherian hypotheses on the rings at all.
As a trivial consequence we get Frobenius elements for finite Galois extensions in both the local and global field setting, as \(\operatorname{Aut}(L/K)\) is just a Galois group of finite fields in these cases, so by surjectivity we can lift a Frobenius element.
Even though \(G\) is finite, it is possible in characteristic \(p{\gt}0\) for the extension \(L/K\) to be infinite (and mostly inseparable). The theorem implies that \(\operatorname{Aut}(L/K)\) is always finite; what is actually happening is that \(L/K\) is algebraic and normal, and its maximal separable subextension is finite of degree at most \(|G|\). However, we can prove surjectivity directly without reference to this maximal separable subextension.
8.3.1 Examples
These do not need to be formalised.
The basic example is when \(B\) is the ring of integers of a finite Galois extension of \(\mathbb {Q}\) (or equivalently the field obtained by adjoining all of the roots of a monic polynomial with rational coefficients). Then the Galois group \(G\) acts on \(B\), and the invariants \(A\) are just \(B\cap \mathbb {Q}=\mathbb {Z}\). If \(Q\) is a nonzero prime ideal of \(B\) then a standard result is that \(B/Q\) is a finite field. Let’s call this field \(k\). We deduce that \(P=Q\cap \mathbb {Z}\) is a nonzero ideal (as \(\mathbb {Z}/P\) injects into \(B/Q\)) and hence must be the principal ideal generated by a prime number \(p\). This number \(p\) is “the prime below \(Q\)”, and also the characteristic of \(k\).
If \(D\) is the subgroup of \(G\) stabilizing \(Q\) as a set, then \(D\) acts on \(k\), so we get a map from \(D\) to \(\operatorname{Gal}(k/\mathbb {F}_p)\). The theorem states that this map is surjective. Its kernel is the inertia subgroup of \(Q\), which for all but finitely many primes of \(B\) is the trivial group. So in these cases we get an isomorphism from \(D\) to \(\operatorname{Gal}(k/\mathbb {F}_p)\) meaning that \(D\) is cyclic, and furthermore has two canonical generators, one called \(\operatorname{Frob}_Q\) (by Artin) and the other one unfortunately also called \(\operatorname{Frob}_Q\) (by Deligne), which are inverses to each other. Some people think that both Frobenii are canonical, some people (for example the Shimura variety crowd) think that Deligne’s choice is more canonical, and others (for example the Heegner point crowd) think that Artin’s choice is more canonical. It was this example which convinced me that the word “canonical” should be banished from mathematics if not accompanied by a clear explanation of what the author means by the word. I know several examples of papers in the literature where \(\operatorname{Frob}_Q\) is used with no explanation as to which one it is. And sometimes it doesn’t matter. For example, with either choice of normalization the following is true. If \(Q\) and \(Q'\) are two primes above \(p\) then there’s some \(g\in G\) such that \(gQ=Q'\) and one can deduce from this that \(\operatorname{Frob}_Q\) and \(\operatorname{Frob}_{Q'}\) are conjugate. In particular if \(G\) is abelian then \(\operatorname{Frob}_Q\) and \(\operatorname{Frob}_{Q'}\) are equal, so we can call them both \(\operatorname{Frob}_p\).
An example which demonstrates that things can get a bit stranger in characteristic \(p\) is the following (we restrict to \(p=2\) but a generalisation of this pathology exists for all \(p{\gt}0\)). We let \(B=\mathbb {F}_2[X,Y]\) and \(G=\{ 1,\tau \} \) with the involution \(\tau \) fixing \(Y\) and switching \(X\), \(X+Y\), i.e., \((\tau f)(X,Y)=f(X+Y, Y)\). Hence \(\tau \) fixes the sum and product of these two polynomials, and thus \(A\supseteq \mathbb {F}_2[X^2+XY,Y]\). One can check (and this needs checking) that this is in fact an equality, and I believe that \(B\) is free of rank 2 over \(A\) with basis \(\{ 1,X\} \). Now set \(Q=(Y)\) so \(P=YA\) and the quotient \((B/Q)/(A/P)\) is \(\mathbb {F}_2[X]/\mathbb {F}_2[X^2]\), and the corresponding extension of the fields of fractions of these integral domains is a radical extension \(\mathbb {F}_2(X)/\mathbb {F}_2(X^2)\) of degree 2. In other words, it is finite and normal, but not separable.
It appears that this construction behaves well with respect to tensor products over \(\mathbb {F}_2\) and filtered colimits (I have not thought about how to make this rigorous), and assuming this is true, it will explain Bourbaki’s counterexample to the hope that \(L/K\) is always finite. We now describe the counterexample (exercise 9 of number 2)
The example is from the exercises in Bourbaki (exercise 9 of section 2 above, found at the end of Chapter V). We let \(B=\mathbb {F}_2[X_0,Y_0,X_1,Y_1,X_2,\ldots ]\) be a polynomial ring in infinitely many variables \(X_i\) and \(Y_i\) indexed by the naturals (or indeed by any infinite set), and \(G\) is cyclic of order 2 with the generator acting on \(B\) via \(X_n\mapsto X_n+Y_n\) and \(Y_n\mapsto Y_n\). If \(Q\) is the ideal generated by the \(Y_n\) then now \(L/K\) is a radical extension of infinite degree; the generator swaps \(X_n\) and \(X_n+Y_n\), so it fixes their product \(a_n\in A\), which becomes \(X_n^2\) modulo \(Q\), so all of the \(X_{i}\) will be algebraically independent in \(L/K\) and \(X_{i}^2\in K\).
8.4 The extension \(B/A\).
The precise set-up we’ll work in is the following. We fix \(G\) a finite group acting on \(B\) a commutative ring, and we have another commutative ring \(A\) such that \(B\) is an \(A\)-algebra and the image of \(A\) in \(B\) is precisely the \(G\)-invariant elements of \(B\). We don’t ever need the map \(A\to B\) to be injective so we don’t assume this.
We start with a construction which is fundamental to everything, and which explains why we need \(G\) to be finite.
If \(b\in B\) then define the characteristic polynomial \(F_b(X) \in B[X]\) of \(b\) to be \(\prod _{g\in G}(X-g\cdot b)\).
Clearly \(F_b\) is a monic polynomial.
\(F_b\) is monic.
Obvious.
It’s also clear that \(F_b\) has degree \(|G|\) and has \(b\) as a root. Also \(F_b\) is \(G\)-invariant, because acting by some \(\gamma \in G\) just permutes the order of the factors. Hence we can descend \(F_b\) to a monic polynomial \(M_b(X)\) of degree \(|G|\) in \(A[X]\). We will also refer to \(M_b\) as the characteristic polynomial of \(b\), even though it’s not even well-defined if the map \(A\to B\) isn’t injective.
\(F_b\) is the lift of some monic polynomial \(M_b\) in \(A[X]\).
The coefficients of \(F_b\) are \(G\)-invariant, and thus lie in the image of \(A\).
\(B/A\) is integral.
Use \(M_b\).
8.5 The extension \((B/Q)/(A/P)\).
Note that \(P\) and \(Q\) are primes, so the quotients \(A/P\) and \(B/Q\) are integral domains.
The following technical lemma constructs an element of \(B\) with nice characteristic polynomial modulo \(Q\).
There exist elements \(a,b \in B\), with \(a \notin Q\) and \(a\) in the image of \(A\) such that for all \(g\in G\),
If \(g \cdot Q = Q\), then \(g \cdot b \equiv a \pmod{Q}\).
If \(g \cdot Q \neq Q\), then \(g \cdot b \equiv 0 \pmod{Q}\).
The ideals \(g \cdot Q \neq Q\) are not contained in \(Q\). Since \(Q\) is a prime ideal, this implies that the intersection of all \(g \cdot Q \neq Q\) is still not contained in \(Q\). Then we can find an element \(c \notin Q\) with \(c \in g \cdot Q\) for all \(g \cdot Q \neq Q\). Let \(F_c\) be the characteristic polynomial of \(c\), and write \(F_c(X) \equiv X^j R(X) \pmod{Q}\). Let \(a\) be the coefficient of \(X^j\) in \(F_c(X)\), and choose \(R(X)\) so that \(R(0) = a\). Let \(b = R(0) - R(c)\). Note that \(F_c(c) = 0\) and \(c \not\equiv 0 \pmod{Q}\), so \(R(c) \equiv 0 \pmod{Q}\). Then \(b \equiv a \pmod{Q}\), so \(g \cdot b \equiv a \pmod{Q}\) whenever \(g \cdot Q = Q\). But if \(g \cdot Q \neq Q\), then \(c \equiv 0 \pmod{g \cdot Q}\). Then \(b \equiv 0 \pmod{g \cdot Q}\), so \(g \cdot b \equiv 0 \pmod{Q}\) whenever \(g \cdot Q \neq Q\).
A slight modification allows us to take an element of \(B\) fixed by \(D_Q\) as input.
Let \(b_0 \in B\). Suppose that the image of \(b_0\) in the quotient \(B/Q\) is fixed by the stabilizer subgroup \(D_Q\). Then there exist elements \(a,b \in B\), with \(a\notin Q\) and \(a\) in the image of \(A\) such that for all \(g\in G\),
If \(g \cdot Q = Q\), then \(g \cdot b \equiv ab_0 \pmod{Q}\).
If \(g \cdot Q \neq Q\), then \(g \cdot b \equiv 0 \pmod{Q}\).
Multiply the \(b\) from 8.7 by \(b_0\).
8.6 The extension \(L/K\).
Let \(L^{D_Q}\) denote the fixed field of the action \(D_Q\) on \(L\). Our strategy for proving surjectivity of \(D_Q \to \operatorname{Aut}(L/K)\) will be to write this map as the composition \(D_Q \to \operatorname{Aut}(L/L^{D_Q}) \to \operatorname{Aut}(L/K)\).
The surjectivity of the first map is a general fact of Galois theory.
Let \(H\) be a finite group acting on a field \(F\) by field automorphisms. Then the map \(H \to \operatorname{Aut}(F/F^H)\) is surjective.
This is a general fact of Galois theory and was already in mathlib.
For surjectivity of the second map, we need to know that every element of \(L^{D_Q}\) is fixed by \(\operatorname{Aut}(L/K)\). We first show this for elements of \(B/Q\) fixed by \(D_Q\).
Let \(b_0 \in B/Q\). Suppose that \(b_0\) is fixed by the stabilizer subgroup \(D_Q\). Then \(b_0\) is fixed by \(\operatorname{Aut}(L/K)\).
Let \(a,b\in B\) be elements from 8.8. Let \(M_b \in A[X]\) be the characteristic polynomial of \(b\). We can map \(M_b\) to \(L[X]\) in two different ways: via \(B[X]\) and via \(K[X]\). Going via \(B[X]\) tells us that the image of \(M_b(X)\) in \(L[X]\) is exactly
But going via \(K[X]\) tells us that this image lies in \(K[X]\), so we must have \(ab_0\in K\). Then \(ab_0\) is fixed by \(\operatorname{Aut}(L/K)\), and \(a\) is nonzero in \(L\) (since \(a\notin Q\)), so \(b_0\) is fixed by \(\operatorname{Aut}(L/K)\).
Now we upgrade this to elements of \(L\) fixed by \(D_Q\). The following lemma will allow us to lift the denominator from \(B/Q\) to \(A/P\).
If \(R/S\) is an algebraic extension of integral domains, then any fraction \(a/b\) with \(a,b\in R\) can be written as \(c/d\) with \(c\in R\) and \(d\in S\).
If \(f\in S[X]\) satisfies \(f(b)=0\), then \(f(0)\in S\) is a multiple of \(b\). If \(f(0)=bx\in S\), then \(a/b=(ax)/(bx)\) as desired.
Let \(x \in L\). Suppose that \(x\) is fixed by the stabilizer subgroup \(D_Q\). Then \(x\) is fixed by the automorphism group \(\operatorname{Aut}(L/K)\).
Combining this with 8.9 finishes the proof.
The map \(D_Q \to \operatorname{Aut}(L/L^{D_Q})\) is surjective by 8.9. For surjectivity of \(\operatorname{Aut}(L/L^{D_Q}) \to \operatorname{Aut}(L/K)\), let \(\sigma \) be a field automorphism of \(L\) fixing \(K\) pointwise. We must show that \(\sigma \) automatically fixes \(L^{D_Q}\) pointwise. But this is exactly 8.12. Thus, the composition \(D_Q \to \operatorname{Aut}(L/L^{D_Q}) \to \operatorname{Aut}(L/K)\) is surjective.