7 Miniproject: Frobenius elements
7.1 Status
This miniproject has been a success: the main results are sorry-free and in the process of being PRed to mathlib (see here). As a result there will be no more work on this project in the FLT repo. Below is a fairly detailed sketch of the argument used.
7.2 Introduction and goal
When this project started, I had thought that the existence of Frobenius elements was specific to the theory of local and global fields, and a slightly more general result held for Dedekind domains. Then Joel Riou pointed out on the Lean Zulip an extremely general result from from Bourbaki’s Commutative Algebra (Chapter V, Section 2, number 2, theorem 2, part (ii)). This beautiful result is surely what we want to see in mathlib. Before we state Bourbaki’s theorem, let us set the scene.
7.3 Statement of the theorem
The set-up throughout this project: \(G\) is a finite group acting (via ring isomorphisms) on a commutative ring \(B\), and \(A\) is the subring of \(G\)-invariants.
Say \(Q\) is a prime ideal of \(B\), whose pullback to \(A\) is the prime ideal \(P\). Note that \(G\) naturally acts on the ideals of \(B\). Let’s define the decomposition group \(D_Q\) of \(Q\) to be the subgroup of \(G\) which stabilizes \(Q\) (just to be clear: \(g\in D_Q\) means \(\{ g\cdot q\, :\, q \in Q\} =Q\), not \(\forall q\in Q, g\cdot q=q\)).
Let \(L\) be the field of fractions of the integral domain \(B/Q\), and let \(K\) be the field of fractions of the subring \(A/P\). Then \(L\) is naturally a \(K\)-algebra. In this generality \(L/K\) may not even be finite or Galois, but we can still talk about \(\operatorname{Aut}_K(L)\).
In the next definition we write down a group homomorphism \(\phi \) from \(D_Q\) to \(\operatorname{Aut}_K(L)\).
Choose \(g\in D_Q\). Then the action of \(g\) on \(B\) gives us an induced \(A/P\)-algebra automorphism of \(B/Q\) which extends to a \(K\)-algebra automorphism \(\phi (g)\) of \(L\). This construction \(g\mapsto \phi (g)\) defines a group homomorphism from \(D_Q\) to \(\operatorname{Aut}_K(L)\) (all the proofs implicit in the definition here are straightforward).
The theorem we want in this mini-project is
The map \(g\mapsto \phi _g\) from \(D_Q\) to \(\operatorname{Aut}_K(L)\) defined above is surjective.
The goal of this mini-project is to get this theorem formalised and ideally into mathlib.
In particular, \(\operatorname{Aut}_K(L)\) is finite, although we prove this beforehand and not as a corollary. What is so striking about this theorem to me is that the only finiteness hypothesis is on the group \(G\) which acts; there are no finiteness or Noetherian hypotheses on the rings at all.
As a trivial consequence we get Frobenius elements for finite Galois extensions in both the local and global field setting, as \(\operatorname{Aut}_K(L)\) is just a Galois group of finite fields in these cases, so by surjectivity we can lift a Frobenius element.
Even though \(G\) is finite, it is possible in characteristic p for the extension \(L/K\) to be infinite (and mostly inseparable). The theorem implies that \(\operatorname{Aut}_K(L)\) is always finite; what is actually happening is that \(L/K\) is algebraic and normal, and its maximal separable subextension is finite of degree at most \(|G|\). We prove these results along the way to the proof of the main theorem.
7.3.1 Examples
These do not need to be formalised.
The basic example is when \(B\) is the ring of integers of a finite Galois extension of \(\mathbb {Q}\) (or equivalently the field obtained by adjoining all of the roots of a monic polynomial with rational coefficients). Then the Galois group \(G\) acts on \(B\), and the invariants \(A\) are just \(B\cap \mathbb {Q}=\mathbb {Z}\). If \(Q\) is a nonzero prime ideal of \(B\) then a standard result is that \(B/Q\) is a finite field. Let’s call this field \(k\). We deduce that \(P=Q\cap \mathbb {Z}\) is a nonzero ideal (as \(\mathbb {Z}/P\) injects into \(B/Q\)) and hence must be the principal ideal generated by a prime number \(p\). This number \(p\) is “the prime below \(Q\)”, and also the characteristic of \(k\).
If \(D\) is the subgroup of \(G\) stabilizing \(Q\) as a set, then \(D\) acts on \(k\), so we get a map from \(D\) to \(\operatorname{Gal}(k/\mathbb {F}_p)\). The theorem states that this map is surjective. Its kernel is the inertia subgroup of \(Q\), which for all but finitely many primes of \(B\) is the trivial group. So in these cases we get an isomorphism from \(D\) to \(\operatorname{Gal}(k/\mathbb {F}_p)\) meaning that \(D\) is cyclic, and furthermore has two canonical generators, one called \(\operatorname{Frob}_Q\) (by Artin) and the other one unfortunately also called \(\operatorname{Frob}_Q\) (by Deligne), which are inverses to each other. Some people think that both Frobenii are canonical, some people (for example the Shimura variety crowd) think that Deligne’s choice is more canonical, and others (for example the Heegner point crowd) think that Artin’s choice is more canonical. It was this example which convinced me that the word “canonical” should be banished from mathematics if not accompanied by a clear explanation of what the author means by the word. I know several examples of papers in the literature where \(\operatorname{Frob}_Q\) is used with no explanation as to which one it is. And sometimes it doesn’t matter. For example, with either choice of normalization the following is true. If \(Q\) and \(Q'\) are two primes above \(p\) then there’s some \(g\in G\) such that \(gQ=Q'\) and one can deduce from this that \(\operatorname{Frob}_Q\) and \(\operatorname{Frob}_{Q'}\) are conjugate. In particular if \(G\) is abelian then \(\operatorname{Frob}_Q\) and \(\operatorname{Frob}_{Q'}\) are equal, so we can call them both \(\operatorname{Frob}_p\).
An example which demonstrates that things can get a bit stranger in characteristic \(p\) is the following (we restrict to \(p=2\) but a generalisation of this pathology exists for all \(p{\gt}0\)). We let \(B=\mathbb {F}_2[X,Y]\) and \(G=\{ 1,\tau \} \) with the involution \(\tau \) fixing \(Y\) and switching \(X\), \(X+Y\), i.e., \((\tau f)(X,Y)=f(X+Y, Y)\). Hence \(\tau \) fixes the sum and product of these two polynomials, and thus \(A\supset \mathbb {F}_2[X^2+XY,Y]\). One can check (and this needs checking) that this is in fact an equality, and I believe that \(B\) is free of rank 2 over \(A\) with basis \(\{ 1,X\} \). Now set \(Q=(Y)\) so \(P=YA\) and the quotient \((B/Q)/(A/P)\) is \(\mathbb {F}_2[X]/\mathbb {F}_2[X^2]\), and the corresponding extension of the fields of fractions of these integral domains is a radical extension \(\mathbb {F}_2(X)/\mathbb {F}_2(X^2)\) of degree 2. In other words, it is finite and normal, but not separable.
It appears that this construction behaves well with respect to tensor products over \(\mathbb {F}_2\) and filtered colimits (I have not thought about how to make this rigorous), and assuming this is true, it will explain Bourbaki’s counterexample to the hope that \(L/K\) is always finite. We now describe the counterexample (exercise 9 of number 2)
The example is from the exercises in Bourbaki (exercise 9 of section 2 above, found at the end of Chapter V). We let \(B=\mathbb {F}_2[X_0,Y_0,X_1,Y_1,X_2,\ldots ]\) be a polynomial ring in infinitely many variables \(X_i\) and \(Y_i\) indexed by the naturals (or indeed by any infinite set), and \(G\) is cyclic of order 2 with the generator acting on \(B\) via \(X_n\mapsto X_n+Y_n\) and \(Y_n\mapsto Y_n\). If \(Q\) is the ideal generated by the \(Y_n\) then now \(L/K\) is a radical extension of infinite degree; the generator swaps \(X_n\) and \(X_n+Y_n\), so it fixes their product \(a_n\in A\), which becomes \(X_n^2\) modulo \(Q\), so all of the \(X_{i}\) will be algebraically independent in \(L/K\) and \(X_{i}^2\in K\).
7.4 The extension \(B/A\).
The precise set-up we’ll work in is the following. We fix \(G\) a finite group acting on \(B\) a commutative ring, and we have another commutative ring \(A\) such that \(B\) is an \(A\)-algebra and the image of \(A\) in \(B\) is precisely the \(G\)-invariant elements of \(B\). We don’t ever need the map \(A\to B\) to be injective so we don’t assume this.
We start with a construction which is fundamental to everything, and which explains why we need \(G\) to be finite.
If \(b\in B\) then define the characteristic polynomial \(F_b(X) \in B[X]\) of \(b\) to be \(\prod _{g\in G}(X-g\cdot b)\).
Clearly \(F_b\) is a monic polynomial of degree \(|G|\). In fact this isn’t clear: if \(B\) is the zero ring then \(F_b=0\) has degree less than \(|G|\).
If \(B\) is nontrivial then \(F_b\) is monic.
Obvious.
It’s also clear that \(F_b\) has degree \(|G|\) and has \(b\) as a root. Also \(F_b\) is \(G\)-invariant, because acting by some \(\gamma \in G\) just permutes the order of the factors. Hence we can descend \(F_b\) to a monic polynomial \(M_b(X)\) of degree \(|G|\) in \(A[X]\). We will also refer to \(M_b\) as the characteristic polynomial of \(b\), even though it’s not even well-defined if the map \(A\to B\) isn’t injective.
\(M_b\) is any monic degree \(|G|\) polynomial in \(A[X]\) whose image in \(B[X]\) is \(F_b\).
\(M_b\) has \(b\) as a root.
Follows from the fact that \(F_b\) has \(b\) as a root.
\(M_b\) has degree \(n\).
Exercise.
\(M_b\) is monic.
Exercise.
\(B/A\) is integral.
Use \(M_b\).
7.5 The extension \((B/Q)/(A/P)\).
Note that \(Q\) is prime, so \(B/Q\) is an integral domain and hence nontrivial. Furthermore, all our polynomials are monic and hence nonzero (indeed they all have degree \(|G|\)>0), so both Lean notions of degree coincide.
If \(\overline{b}\in B/Q\) then we define \(\overline{M}_{\overline{b}}\in A/P[X]\) to be the reduction of \(M_b\) where \(b\) is any lift of \(\overline{b}\) to \(B\).
Say \(\overline{b}\in B/Q\).
\(\overline{M}_{\overline{b}}\) has degree \(|G|\).
Exercise.
\(\overline{M}_{\overline{b}}\) is monic.
Exercise (note that integral domains are nontrivial).
\(\overline{M}_{\overline{b}}\) has \(\overline{b}\) as a root.
Exercise.
\((B/Q)/(A/P)\) is an integral extension.
Use \(\overline{M}_{\overline{b}}\).
Here is a corollary of this result: every nonzero element of \(B/Q\) divides a nonzero element of \(A/P\).
If \(\beta \in B/Q\) is nonzero then there’s some nonzero \(\alpha \in A/P\) such that \(\beta \) divides the image of \(\alpha \) in \(B/Q\).
Note: Is this in mathlib already? This proof works for any integral extension if the top ring has no zero divisors.
Let \(\beta \) be nonzero, and consider the monic polynomial \(\overline{M}_\beta (X)\), which is monic of degree \(|G|\) and has \(\beta \) as a root. Write \(n=|G|\), so \(f\) is monic of degree \(n\). We cannot have \(f(X)=X^n\) as this would imply that \(\beta \) is nilpotent and hence zero, as \(B/Q\) is an integral domain. Hence \(f(X)=X^n+\cdots +\alpha X^d\) for some \(d{\lt}n\) and nonzero \(\alpha \in A\). Again using the fact that \(B/Q\) is an integral domain, we deduce that \(\beta \) must a be root of \(f/X^d=X^{n-d}+\cdots +\alpha =X(\cdots )+\alpha \), and setting \(X=\beta \) shows that \(\beta \) divides \(\alpha \).
7.6 The extension \(L/K\).
If \(\lambda \in L\) then there’s a monic polynomial \(P_\lambda \in K[X]\) of degree \(|G|\) with \(\lambda \) as a root, and which splits completely in \(L[X]\).
A general \(\lambda \in L\) can be written as \(\beta _1/\beta _2\) where \(\beta _1,\beta _2\in B/Q\). The previous corollary showed that there’s some nonzero \(\alpha \in A/P\) such that \(\beta _2\) divides \(\alpha \), and hence \(\alpha \lambda \in B/Q\) (we just cleared denominators). Thus \(\alpha \lambda \) is a root of some monic polynomial \(f(x)\in (A/P)[X]\), by 7.13. The polynomial \(f(\alpha x)\in (A/P)[X]\) thus has \(\lambda \) as a root, but it is not monic; its leading term is \(\alpha ^n\). Dividing through in \(K[X]\) gives us the polynomial we seek.
The extension \(L/K\) is algebraic.
Exercise using 7.16.
The extension \(L/K\) is normal.
Exercise using 7.16.
Note that \(L/K\) might not be separable and might have infinite degree. However
Any subextension of \(L/K\) which is finite and separable over \(K\) has degree at most \(|G|\).
Finite and separable implies simple, and we’ve already seen that any element of \(L\) has degree at most \(|G|\) over \(K\).
The maximal separable subextension \(M\) of \(L/K\) has degree at most \(|G|\).
If it has dimension greater than \(|G|\) over \(K\), then it has a finitely-generated subfeld of \(K\)-dimension greater than \(|G|\), and is finite and separable, contradicting the previous result.
\(\operatorname{Aut}_K(L)\) is finite.
Any \(K\)-automorphism of \(L\) is determined by where it sends \(M\).
\(\operatorname{Aut}_{A/P}(B/Q)\) is finite.
Two elements of \(\operatorname{Aut}_{A/P}(B/Q)\) which agree once extended to automorphisms of \(L\) must have already been equal, as \(B/Q\subseteq L\). Hence the canonical map from \(\operatorname{Aut}_{A/P}(B/Q)\) to \(\operatorname{Aut}_K(L)\) is injective.
7.7 Proof of surjectivity.
We fix once and for all some nonzero \(y\in B/Q\) such that \(M=K(y)\), with \(M\) the maximal separable subextension of \(L/K\).
NB we do use nonzeroness at some point, and \(y\) can be zero in the case \(L=K\) (this seems to have been missed by Bourbaki).
Note that the existence of some \(\lambda \in L\) with this property just comes from finite and separable implies simple; we can use the “clear denominator” technique introduced earlier to scale this by some nonzero \(\alpha \in A\) into \(B/Q\), as \(K(\lambda )=K(\alpha \lambda )\).
Our next goal is the following result:
There exists some \(x\in B\) and \(\alpha \in A\) with the following properties.
\(x=\alpha y\) mod \(Q\) and \(\alpha \) isn’t zero mod \(P\).
\(x\in Q'\) for all \(Q'\not=Q\) in the \(G\)-orbit of \(Q\).
Bourbaki only prove this in the case that \(P\) is maximal (and implicitly use \(\alpha =1\)), but this generalisation seems to be fine. To prove it, we need to talk a little about localization.
Let \(S\) be the complement of \(P\), so \(S\) is a multiplicative subset of \(A\). Write \(A[1/S]\) for the localisation of \(A\) at \(S\), and write \(B[1/S]\) for the localisation of \(B\) at (the image of) \(S\). I suspect that this is the same as \(B\otimes _AA[1/S]\).
The prime ideals of \(B[1/S]\) over \(P[1/S] \subseteq A[1/S]\) biject naturally with the prime ideals of \(B\) over \(P\). More precisely, the \(B\)-algebra \(B[1/S]\) gives a canonical map \(B\to B[1/S]\) and hence a canonical map from prime ideals of \(B[1/S]\) to prime ideals of \(B\). The claim is that this map induces a bijection between the primes of \(B[1/S]\) above \(P[1/S]\) and the primes of \(B\) above \(P\).
Hopefully this is in mathlib in some form already. In general \(\operatorname{Spec}(B[1/S])\) is just the subset of \(\operatorname{Spec}(B)\) consisting of primes of \(B\) which miss \(S\) (i.e., whose intersection with \(A\) is a subset of \(P\)).
The primes of \(B[1/S]\) above \(P[1/S]\) are all maximal.
This follows from Algebra.IsIntegral.isField_iff_isField and the fact that an ideal is maximal iff the quotient by it is a field.
(of theorem): Because all these ideals of \(B[1/S]\) are maximal, they’re pairwise coprime. So by the Chinese Remainder Theorem we can find an element of \(B[1/S]\) which is equal to \(y\) modulo \(Q[1/S]\) and equal to \(0\) modulo all the other primes. This element is of the form \(x/\alpha \) for some \(x\in B\) and \(\alpha \in S\), and one now checks that everything works.
The rest of the surjectivity argument was done by Jou Glasheen in the special case of number fields, in her final project for my Formalising Mathematics 2024 course.
A summary of the argument: we consider the monic degree \(|G|\) polynomial \(f_x\) in \(K[X]\) or \(\overline{M}_x\) in \((A/P)[X]\), which has \(x\) and its G-conjugates as roots. If \(\sigma \in \operatorname{Aut}_K(L)\) then \(\sigma (\overline{x})\) is a root of \(f\) as \(\sigma \) fixes \(K\) pointwise. Hence \(\sigma (\overline{x})=\overline{g(x)}\) for some \(g\in G\) (because we’re over an integral domain), and because \(\sigma (\overline{x})\not=0\) we have \(\overline{g(x)}\not=0\) and hence \(g(x)\notin Q[1/S]\). Hence \(x\notin g^{-1} Q[1/S]\) and thus \(g^{-1}Q=Q\) and \(g\in SD_Q\). Finally we have \(\phi _g=\sigma \) on \(K\) and on \(y\), so they are equal on \(M\) and hence on \(L\) as \(L/M\) is purely inseparable.
This argument is formalised and at the time of writing is being PRed to mathlib by Thomas Browning.