Extra Facts

Statements and proofs of some important facts not in the book, and some alternate proofs of results from the book.

Ring Theory

Gauss’s Lemma

This proof is a small variation on the one in the book. It works over any UFD, but I’ll just prove it for the integers – the main fact one needs to generalize it to a UFD is that irreducible elements generate prime ideals in UFDs. The rest is identical.

Lemma 1 Let \(f(x) \in \mathbb Z[x]\) and suppose that it factors in \(\mathbb Q[x]\) \[f(x) = g(x) h(x).\]

Then, in fact, there is a factorization \[f(x) = \hat g(x) \hat h(x)\] where \(\hat g\) and \(\hat h\) are in \(\mathbb Z[x]\) and have the same degrees as \(g\) and \(h\), respectively.

The basic idea is that denominators in \(g\) have to be canceled by extra factors in \(h\) for the product to end up in \(\mathbb Z[x]\). The difficulty is that things like \[\frac 1 2 = \frac 5 {10}\] make it hard to pick out “strictly necessary denominators”.

Instead, we just clear all denominators arbitrarily, then “realize” that the denominators didn’t really have to be cleared, and it’s just that it gave us an excuse to move factors around.

Proof. Let \(c\) be an integer such that multiplying by \(c\) clears the denominators. In other words, we can write \(c=ab\) in such a way that \(\hat g = ag\) and \(\hat h = bh\) have integer coefficients, so that \[c f = \hat g \hat h\] is a factorization in \(\mathbb Z[x]\). Note that \(\deg \hat g = \deg g\) and \(\deg \hat h = \deg h\).

We now describe an inductive process for removing \(c\) from such a factorization. If \(c = \pm 1\) there is nothing to do. Otherwise, let \(p\) be a prime divisor of \(c\).

Since the equation above is in \(\mathbb Z[x]\), we can reduce it mod \(p\) to obtain an equation in \(\mathbb F_p[x]\). The entire left hand side is zero: \[0 = \hat g \hat h \mod p.\] Since \(\mathbb F_p[x]\) is an integral domain, this implies either \(\hat g\) or \(\hat h\) is zero mod \(p\), which is to say that \(p\) divides all of its coefficients. In either case, we can cancel the factor of \(p\) from \(c\) and the factor of \(p\) from the appropriate polynomial on the right hand side. This leaves us with \[\frac c p f = \left(\frac {\hat g} p\right) \hat h \] or \[\frac c p f = \hat g \left(\frac{\hat h } p\right),\] where all the polynomials are still in \(\mathbb Z[x]\) and have the same degrees as the original. Now, however, \(c\) has one fewer factor. Repeating the process on the (finitely many) remaining factors of \(c\) yields the claim.
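The inductive process in the proof is concrete enough to run. Here is a minimal Python sketch, assuming polynomials are stored as coefficient lists (lowest degree first) and that the product \(gh\) really does lie in \(\mathbb Z[x]\). The names `gauss_factor`, `clear_denominators` and so on are mine, chosen for illustration.

```python
from fractions import Fraction
from math import lcm

def poly_mul(a, b):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def clear_denominators(g):
    """Return (a, a*g), where a is the lcm of the denominators of g's coefficients."""
    a = lcm(*(Fraction(c).denominator for c in g))
    return a, [int(Fraction(c) * a) for c in g]

def smallest_prime_factor(c):
    """Smallest prime dividing |c|, assuming |c| > 1."""
    c = abs(c)
    p = 2
    while p * p <= c:
        if c % p == 0:
            return p
        p += 1
    return c

def gauss_factor(g, h):
    """Given g, h in Q[x] with g*h in Z[x], return integer polynomials
    (g_hat, h_hat) of the same degrees with g_hat*h_hat == g*h."""
    a, g_hat = clear_denominators(g)
    b, h_hat = clear_denominators(h)
    c = a * b  # now c*f = g_hat*h_hat holds in Z[x]
    while abs(c) != 1:
        p = smallest_prime_factor(c)
        if all(x % p == 0 for x in g_hat):
            g_hat = [x // p for x in g_hat]
        else:
            # F_p[x] is a domain, so p must divide every coefficient of h_hat
            assert all(x % p == 0 for x in h_hat)
            h_hat = [x // p for x in h_hat]
        c //= p
    return g_hat, h_hat
```

For instance, with \(g = 3x + \frac 3 2\) and \(h = 2x + \frac 2 3\), whose product is \(6x^2 + 5x + 1\), the sketch clears denominators with \(c = 6\) and then strips the primes \(2\) and \(3\), returning \(2x+1\) and \(3x+1\).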

The book’s argument essentially reproves that \(\mathbb F_p[x]\) is a domain.

Field Theory

Normality

We introduce the following definition(s).

Definition 1 Let \(K/k\) be a finite extension. We say it is normal if it satisfies any of the following conditions (which we will soon prove equivalent):

  • (N1) \(K\) is the splitting field of some polynomial in \(k[x]\).
  • (N2) if \(\sigma,\tau\) are embeddings of \(K\) into some field \(L\) over \(k\), then \(\sigma(K) = \tau(K)\), or equivalently, if \(L\) is any field containing \(K\), then every embedding of \(K\) into \(L\) over \(k\) takes \(K\) to itself.
  • (N2’) if \(L\) is an algebraic closure of \(k\) containing \(K\), then every embedding of \(K\) into \(L\) over \(k\) takes \(K\) to itself.
  • (N3) given any \(\alpha \in K\) the minimal polynomial \(g\) of \(\alpha\) over \(k\) factors completely into a product of linear polynomials in \(K[x]\) (i.e. \(K\) contains all the roots of \(g\)).

The two versions of (N2) are related by replacing \(K\) with \(\sigma(K)\) and comparing \(\sigma\inv\tau\) with the identity.

Note that one usually proves that (N1) implies (N2) implies (N2’) implies (N3) implies (N1) which you’re likely to see in other texts, but since we haven’t rigorously constructed the algebraic closure, we will skip (N2’) and point out how it can be included. It’s easy to see (N2) implies (N2’) by letting \(L\) be an algebraic closure, and in the argument that (N2) implies (N3), all one needs is an overfield with a lot of roots, and the algebraic closure is certainly up to this task.

To start, we need the following handy improvement of the lifting lemma from the book. It doesn’t have a name, but I call it the “normal lifting lemma”, or often just “[the] lifting lemma”. For now, the adjective “normal” just means “splitting field”. Sometimes I’ll abbreviate it as the (N)LL. In essence, it improves the usual lifting lemma by observing that the codomain can always be some sufficiently large splitting field.

Lemma 2 Let \(K/k\) be a finite extension, generated by \(\alpha_1,...,\alpha_n\) with minimal polynomials \(g_1,...,g_n\) over \(k\). Let \(\sigma: k \to k'\) be some field isomorphism and \(g_i' = g_i^\sigma\) the polynomial obtained by applying \(\sigma\) to the coefficients of \(g_i\). Let \(L'/k'\) be any extension of \(k'\) containing a splitting field of the \(g_i'\).

Then there is an extension \(\tilde \sigma\) of \(\sigma\) to \(K\) whose image is in \(L'\).

Proof. We’ll induct on the number of generators, using the book’s lifting lemma. The point is that \(L'\) has “every possible copy” of the roots of the \(g_i\), so the lift at each step can be induced from an evaluation map landing in \(L'\).

To wit, we can carry out one step of the lift, from \(k\) to \(k(\alpha_1)\), with the standard lifting lemma, obtaining a map \(\sigma_1\) extending \(\sigma\) and taking \(\alpha_1\) to some root \(\alpha_1' \in L'\) of \(g_1^\sigma\). This brings us to a finite extension \(K/k(\alpha_1)\) with an isomorphism \(\sigma_1\) from \(k(\alpha_1)\) to \(k'(\alpha_1')\). Note that the minimal polynomials of the remaining \(\alpha_i\) over \(k(\alpha_1)\) need not be the original \(g_i\), but they do still divide the \(g_i\), so their images under \(\sigma_1\) divide the \(g_i'\), and hence \(L'\) still contains a splitting field for them. By induction, finish the extension to \(K = k(\alpha_1)(\alpha_2,...,\alpha_n)\). At each step, we produce a \(\sigma_i\) extending \(\sigma_{i-1}\), hence extending \(\sigma\), so the final map \(\sigma_n\) extends \(\sigma\) and takes \(K\) into \(L'\).

It’s worth noting that this lemma extends to \(K/k\) algebraic, not just finite, as long as \(L'/k'\) also has “enough roots”: the image under \(\sigma\) of the minimal polynomial over \(k\) of any \(\alpha \in K\) splits completely in \(L'\). Keep this in mind as we go to the next section, and define normal extensions!

Also, although we haven’t rigorously constructed the algebraic closure, it’s worth noting that, because the algebraic closure contains every splitting field, the lemma says that any finite (or algebraic) extension of \(k\) has an embedding into the algebraic closure. In other words, every extension of \(k\) can be “placed” into the algebraic closure and compared within it, rather than leaving them floating around independently.

With the normal lifting lemma in hand, we can verify that our definition of normality is a true definition.

Lemma 3 The conditions (N1), (N2), and (N3) are equivalent.

Proof. Assume (N1), so \(K\) is the splitting field of some polynomial \(f(x) \in k[x]\). If \(L\) is a field containing \(K\), then it contains all the roots \(\alpha_1,...,\alpha_n\) of \(f\), and the splitting field is \(K = k(\alpha_1,...,\alpha_n)\). The image of any other embedding of \(K\) is still a splitting field for \(f\), and hence coincides with \(K\). The point is that a splitting field is determined entirely by information in \(k\) (the polynomial \(f\)) which is fixed by an embedding.

Next, assume (N2). Let \(\alpha \in K\), with \(g\) its minimal polynomial over \(k\). Let \(\alpha_i\) be generators of \(K\) over \(k\), with minimal polynomials \(g_i\) over \(k\). Let \(E\) be the splitting field of the product of \(g\) and all the \(g_i\). Take \(\alpha'\) some root of \(g\) in \(E\).

We will construct an embedding of \(K\) into \(E\) which sends \(\alpha\) to \(\alpha'\), so the image of that embedding will contain \(\alpha'\). Meanwhile, (N2) tells us that the image is still \(K\), so that \(\alpha'\) must have been in \(K\). Since \(\alpha'\) was arbitrary, all roots of \(g\) are in \(K\), and hence it factors completely.

Since \(g\) is irreducible and \(\alpha,\alpha'\) are roots of it, there is an isomorphism \(\sigma: k(\alpha) \to k(\alpha')\) over \(k\) which takes \(\alpha\) to \(\alpha'\). Since \(K\) is still generated by the \(\alpha_i\) over the larger field \(k(\alpha)\) and \(E\) contains all the roots of their minimal polynomials, we can apply the normal lifting lemma to \(K/k(\alpha)\) and \(\sigma\) to produce an embedding of \(K\) into \(E\).

Now assume (N3). Let \(\alpha_1,...,\alpha_n\) be generators for \(K\) over \(k\), meaning \(K=k(\alpha_1,...,\alpha_n)\). By (N3), the minimal polynomial \(g_i\) of \(\alpha_i\) over \(k\) splits completely in \(K\). It follows that \(K\) contains a splitting field of the product of the \(g_i\); the reverse containment is immediate because each \(\alpha_i\) is among the roots of the \(g_i\).

Note that in (N2) and (N2’), any map which takes \(K\) to itself must be an isomorphism. Field homomorphisms are injective, and a field homomorphism over \(k\) is also a homomorphism of \(k\)-vector spaces. Every injective map between \(k\)-vector spaces of the same finite dimension is an isomorphism.

Separability

As before, we define a few conditions that we will prove are equivalent:

Definition 2 Let \(\alpha\) be algebraic over \(k\). We say that \(\alpha\) is separable if its minimal polynomial over \(k\) has no repeated roots.

Let \(K/k\) be a finite extension. We say that it is separable if it satisfies any of the following:

  • (S1) \(K=k(\alpha_1,...,\alpha_n)\) where the \(\alpha_i\) are separable over \(k\).
  • (S2) \(K\) has \([K:k]\) distinct embeddings into any normal extension containing it.
  • (S2’) \(K\) has \([K:k]\) distinct embeddings into any algebraic closure of \(K\).
  • (S3) every element of \(K\) is separable over \(k\)

To ease exposition, we make the following definition:

Definition 3 Let \(K/k\) be a finite extension and \(L/k\) some normal extension containing \(K\). The separability degree of \(K/k\) is the number of embeddings of \(K\) into \(L\) over \(k\). It is denoted \([K:k]_s\). Exercise: prove that it is independent of \(L\); take two normal extensions \(L\) and \(M\), which can both be embedded into a larger normal extension \(N\) using the NLL, then verify that the number of embeddings of \(K\) into \(N\) is the same as the number into \(L\) and \(M\).

We recall a remark from the book:

If \(k(\alpha)/k\) is a primitive extension, with minimal polynomial \(g\), and \(L/k\) any extension containing a splitting field for \(g\), then the number of embeddings of \(k(\alpha)/k\) into \(L\) is precisely the number of distinct roots of \(g\). In other words, \([k(\alpha):k]_s\) is the number of distinct roots of \(g\).

In fact, the book remarked more, which we will now prove:

Lemma 4 Consider a tower \(L/K/k\) of finite extensions. Then \([L:k]_s = [L:K]_s[K:k]_s\). In other words, separable degree multiplies in towers.

Moreover, \([L:k]_s \leq [L:k]\).

Proof. We induct on the (usual) degree. There’s nothing to do if any of the degrees are \(1\), so we assume otherwise. Since \(L/K\) is finite, it can be written as \(K(\alpha_1,...,\alpha_n)\). Let \(n\) be minimal. Then \(E = K(\alpha_1,...,\alpha_{n-1}) \neq L\). This gives us a tower \(L=E(\alpha_n)/E/K/k\).

By induction, we have both \[[E:k]_s = [E:K]_s[K:k]_s\] and \[[L:K]_s = [E(\alpha_n):E]_s[E:K]_s.\]

From the lifting lemma/remark, we know that each embedding of \(E\) into a sufficiently large normal extension has \([E(\alpha_n):E]_s\) extensions to \(E(\alpha_n) = L\). To see this, let \(f\) be the minimal polynomial of \(\alpha_n\) over \(E\). Lifts come from choosing a root of \(f^\sigma\), and it’s clear that \(f^\sigma\) has as many distinct roots as \(f\), which is precisely \([E(\alpha_n):E]_s\) by the initial remark. Therefore, \[[L:k]_s = [E(\alpha_n):E]_s[E:k]_s.\]

Simplifying with the two inductive expressions eliminates \(E\) and \(E(\alpha_n)\), leaving the desired equality.

As for the final claim, we know it is true for primitive extensions by the remark. The finite extension \(L\) can be obtained as a sequence of primitive extensions \(k(\alpha_1)/k\), \(k(\alpha_1,\alpha_2)/k(\alpha_1)\), …, so multiplicativity extends the inequality to the whole tower.

Note that multiplicativity implies that in a tower \(L/K/k\), the whole extension \(L/k\) is separable if and only if both \(L/K\) and \(K/k\) are separable.

Theorem 1 The three definitions (S1), (S2), (S3) are equivalent.

Proof. Clearly (S3) implies (S1) when the extension is finite.

Next, (S1) implies (S2). Note that if \(\alpha\) is separable over \(k\), meaning its minimal polynomial has no repeated roots, then it is separable over any extension \(E\) of \(k\), because the minimal polynomial of \(\alpha\) over \(E\) divides the minimal polynomial over \(k\). Therefore, \(\alpha_i\) is separable over \(k(\alpha_1,...,\alpha_{i-1})\), hence the degree coincides with separable degree at each step. Then multiplicativity of separable degree in towers tells us that (S1) implies (S2).

Next, rather than (S2) implies (S3), we verify the contrapositive. Suppose (S3) does not hold, so some \(\alpha \in K\) is not separable over \(k\), meaning \([k(\alpha):k]_s < [k(\alpha):k]\). Applying our degree formula to the tower \(K/k(\alpha)/k\), this would lead to \[[K:k]_s = [K:k(\alpha)]_s [k(\alpha):k]_s \leq [K:k(\alpha)] [k(\alpha):k]_s < [K:k(\alpha)] [k(\alpha):k] = [K:k].\]

Thus the number of embeddings is smaller than the degree, so (S2) does not hold.

Lastly, we give a direct proof that (S3) implies (S2). We will induct on degree over \(k\). Using finiteness, write \(K = k(\alpha_1,...,\alpha_n)\) with \(n\) minimal, and let \(L=k(\alpha_1,...,\alpha_{n-1})\) so that \(K = L(\alpha_n)\). By minimality, \(L\neq K\) so \([L:k] < [K:k]\). Note that if \(K\) satisfies (S3) then so does \(L\), hence by induction \([L:k]_s = [L:k]\). To conclude, we just need to verify \([K:L] = [K:L]_s\), as then multiplicativity of regular and separable degree will yield \[[K:k] = [K:L][L:k] = [K:L]_s [L:k]_s = [K:k]_s.\]

We’ve reduced to \(K = L(\alpha_n)\) so that \([K:L]_s = [L(\alpha_n):L]_s\) is the number of distinct roots of the minimal polynomial of \(\alpha_n\) over \(L\), which will be the degree of the minimal polynomial as long as it is separable. This is certainly the case – \(\alpha_n\) is separable over \(k\), meaning its minimal polynomial over \(k\) has no repeated roots. The minimal polynomial over \(L\) divides the minimal polynomial over \(k\) (in \(L[x]\)) and so it cannot have repeated roots. Therefore, \([L(\alpha_n):L]_s = [L(\alpha_n):L]\) and we are done.

As in the case of normality, one can incorporate (S2’), the algebraic closure, by observing that any algebraic closure contains a sufficiently large finite normal extension to carry out the necessary arguments.

Finally, we prove an interesting upgrade to (S1), known as the primitive element theorem.

Theorem 2 Suppose \(K/k\) is a finite separable extension. Then \(K\) is primitive over \(k\), meaning there is some \(\gamma \in K\) such that \(K = k(\gamma)\).

Proof. If \(k\) is finite, so is \(K\), and we can let \(\gamma\) be a generator of the multiplicative group \(K^*\), which is cyclic (see Corollary 1 below). So we may assume \(k\) is infinite.

By a straightforward induction, since finite extensions are finitely generated, we can reduce to the case \(K=k(\alpha,\beta)\). We will show that an element of the form \[\gamma = \alpha + c\beta\] will generate \(K\) for all but finitely many \(c\) in \(k\). The reason we excluded the case of \(k\) finite is to ensure that there will always be at least one usable \(c\).

We observed above that \([k(\gamma):k] \geq [k(\gamma):k]_s\), and from (S3) implies (S2) we also have \([K:k] = [K:k]_s\). Since \(k(\gamma)\) is a subfield of \(K\), it therefore suffices to show \([k(\gamma):k]_s \geq [K:k]_s\): this forces \([k(\gamma):k] \geq [K:k]\), and hence \(k(\gamma) = K\).

Let \(\sigma\neq \tau\) be two embeddings of \(K\) into some large normal extension. Our aim is to show that, if \(c\) is chosen well, then \(\sigma(\gamma)\neq \tau(\gamma)\). This will tell us that each distinct embedding of \(K\) into some large field yields a distinct embedding of \(k(\gamma)\) into that same field, and therefore \([k(\gamma):k]_s \geq [K:k]_s = [K:k]\).

Reorganizing, we see that we want \[0 \neq \gamma^\sigma - \gamma^\tau = (\alpha^\sigma - \alpha^\tau)+ c (\beta^\sigma - \beta^\tau)\]

If the coefficient of \(c\) were zero, then \(\beta^\sigma = \beta^\tau\) and hence \(\alpha^\sigma \neq \alpha^\tau\), else \(\sigma = \tau\), so the above non-equality would hold regardless of \(c\). So assume otherwise, and rearrange further: \[c \neq \frac{\alpha^\tau - \alpha^\sigma}{\beta^\sigma - \beta^\tau}.\]

The right hand side describes only finitely many elements, because there are only finitely many choices for each \(\alpha^\tau,\alpha^\sigma\) resp. \(\beta^\sigma,\beta^\tau\), as they’re all roots of the minimal polynomial of \(\alpha\) resp. \(\beta\) over \(k\).
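To see the finitely many excluded values of \(c\) in a concrete case, here is a small numerical sketch for \(K = \mathbb Q(\sqrt 2, \sqrt 3)\) with \(\alpha = \sqrt 2\) and \(\beta = \sqrt 3\). It assumes the standard fact that the four embeddings of \(K\) independently flip the signs of the two square roots, and it works in floating point, so it illustrates rather than proves anything. The function names are my own.

```python
from itertools import product
from math import sqrt, isclose

# Images of (alpha, beta) = (sqrt 2, sqrt 3) under the four embeddings of
# Q(sqrt 2, sqrt 3) into the reals: each square root may change sign.
EMBEDDINGS = [(s * sqrt(2), t * sqrt(3)) for s, t in product((1, -1), repeat=2)]

def gamma_images(c):
    """Images of gamma = alpha + c*beta under each embedding."""
    return [a + c * b for a, b in EMBEDDINGS]

def count_distinct(values, tol=1e-9):
    """Count values that differ by more than tol (floating-point comparison)."""
    distinct = []
    for v in values:
        if not any(isclose(v, w, abs_tol=tol) for w in distinct):
            distinct.append(v)
    return len(distinct)

def bad_values():
    """The excluded values (alpha^tau - alpha^sigma) / (beta^sigma - beta^tau),
    taken over pairs of embeddings with beta^sigma != beta^tau."""
    bad = set()
    for (a_s, b_s), (a_t, b_t) in product(EMBEDDINGS, repeat=2):
        if not isclose(b_s, b_t):
            bad.add(round((a_t - a_s) / (b_s - b_t), 9))
    return sorted(bad)
```

Here `bad_values()` yields three numbers, \(0\) and \(\pm\sqrt{2/3}\), of which only \(0\) is rational. So every nonzero rational \(c\) separates the four embeddings; for instance, `gamma_images(1)` has four distinct entries while `gamma_images(0)` has only two.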

Galois Theory

One can summarize pieces of our results on finite normal and separable extensions in two slogans based on (N2) and (S2):

\(K/k\) is normal when every embedding is an automorphism.

\(K/k\) is separable when it has as many embeddings as it should (its degree).

A finite extension is called Galois if it is both normal and separable:

\(K/k\) is Galois when it has as many automorphisms as it should (its degree).

To check these in a concrete case, though, one uses (N1) and (S1). Our characterizations of normality and separability furnish

Theorem 3 Let \(K/k\) be a finite extension. The following are equivalent to being Galois:

  • (G1) It is the splitting field of a separable polynomial,
  • (G2) It has \([K:k]\) automorphisms over \(k\).

It follows that if \(K/k\) is Galois and \(K/E/k\) is an intermediate extension, then \(K/E\) is also Galois. In fact, if \(L/K\) is Galois and \(E/K\) is any extension, all in a common overfield, then \(EL/E\) is Galois.

It may be worth noting that one can give a very pared-down development of normality and separability to define Galois extensions quite quickly: a version of the PET can be proven without reference to embeddings, and will just tell you that the splitting field of a separable polynomial is of the form \(k(\gamma)/k\), avoiding separable degree and so on. The facts we proved about normal extensions are much simpler for primitive extensions, which after all formed the base case for our inductions. But inseparable normal extensions are really interesting, so it’s better to develop a theory which includes them!

In any case, the proofs are worth the detour, and having a variety of equivalent definitions is helpful. As I’ve organized them, (N1) and (S1) are each easy to check in concrete cases, while (N2) and (S2) are often useful for proving general facts or checking extensions not concretely associated to polynomials, and (N3) and (S3) tell you about properties of field elements based on the extension plus lend themselves to inductive arguments. When we prove the fundamental theorem(s) of Galois theory, all of them will find a use!

Given a group \(H\) acting on a set \(K\), we denote by \(K_H\) the subset of elements in \(K\) which are fixed by \(H\), i.e. \[K_H = \{k\in K\ |\ k^h = k\ \textrm{ for all }\ h\in H\}.\] You will often see this written as \(K^H\) instead, but that conflicts with my notation for group actions.

Theorem 4 (Fundamental Theorem of Galois Theory, Part I)

Let \(K/k\) be a finite Galois extension with Galois group \(G\). Consider the following two maps:

  • From subextensions to subgroups: \(L\mapsto \Gal(K/L)\)
  • From subgroups to subextensions: \(H\mapsto K_H\)

These maps are inverses to each other. Equivalently, \[\Gal(K/K_H) = H,\] and \[L = K_{\Gal(K/L)}.\]

Therefore, there is a bijection between subfields of \(K/k\) and subgroups of \(\Gal(K/k)\). Moreover, it is order-reversing; if \(L, L'\) correspond to subgroups \(H, H'\), then \(L\subseteq L'\) if and only if \(H\supseteq H'\).

Proof. It’s easy to see the maps are order-reversing: if \(L\subseteq L'\) then every automorphism fixing \(L'\) fixes \(L\), conversely if \(H\supseteq H'\) then everything in \(K\) fixed by all of \(H\) is also fixed by all of \(H'\).

We will verify the equivalent equalities. Note that in each case there is an “easy” containment. Namely, \[H\subseteq \Gal(K/K_H),\] because every \(h\in H\) is an automorphism of \(K\) fixing \(K_H\) hence in \(\Gal(K/K_H)\) by definition, and \[L\subseteq K_{\Gal(K/L)},\] because every element of \(L\) is, by definition, fixed by every automorphism in \(\Gal(K/L)\).

Let’s improve the first containment to an equality. Notice that containment implies \(|H| \leq |\Gal(K/K_H)|\), and that the two will be equal if we prove \(|H| \geq |\Gal(K/K_H)|\), so this will be our aim. Since \(K/k\) is finite separable, so is \(K/K_H\), so we may write \(K = K_H(\gamma)\) by the primitive element theorem (Theorem 2). Consider the polynomial \[g(x) = \prod_{\sigma \in H} (x - \gamma^\sigma).\]

Notice that for \(\tau \in H\), we have \[g^\tau(x) = \prod_{\sigma \in H} (x - \gamma^{\sigma\tau}),\] which is the same product as defines \(g\), just in a different order. So \(g^\tau(x) = g(x)\). In particular, the coefficients of \(g\) are all in \(K_H\).

Therefore, \(K = K_H(\gamma)\) and \(\gamma\) is a root of a polynomial of degree \(|H|\). Therefore, \[|\Gal(K/K_H)| = [K:K_H] = [K_H(\gamma):K_H] \leq \deg g = |H|.\]

Now we turn to the second containment. Note that, because \(K/L\) is Galois, we have \[|\Gal(K/L)| = [K:L].\]

Then take the fact we already proved, \[H = \Gal(K/K_H),\] with \(H=\Gal(K/L)\), and apply the same observation connecting degree and the order of the Galois group to obtain \[[K:K_H] = |\Gal(K/L)|.\]

Therefore, we have \([K:K_H] = [K:L]\), while the “easy” containment we started from was \(L\subseteq K_H\). The only way for these degrees to agree and have this containment is for \([K_H:L] = 1\), meaning \(L=K_H\).

Note that the “trick” polynomial construction is actually familiar to us: we used exactly this same idea to relate \(\QQ(\zeta + \zeta\inv)\) to \(\QQ(\zeta)\) before we’d ever talked about Galois extensions.
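As a quick numerical sanity check of that familiar case, the sketch below takes \(\zeta = e^{2\pi i/5}\) and \(H = \{\id, \text{complex conjugation}\}\), builds \(g(x) = \prod_{\sigma\in H}(x - \zeta^\sigma)\), and looks at its coefficients. The specific choice of the fifth root of unity and the helper name `fixed_polynomial` are assumptions made for illustration, and the arithmetic is floating point.

```python
import cmath

def fixed_polynomial(gamma_images):
    """Coefficients (highest degree first) of the product of (x - g)
    over the given images g of gamma."""
    coeffs = [1 + 0j]
    for g in gamma_images:
        # multiply the current polynomial by (x - g)
        coeffs = [a - g * b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

zeta = cmath.exp(2j * cmath.pi / 5)   # a primitive 5th root of unity
H_orbit = [zeta, zeta.conjugate()]    # images of zeta under H = {id, conjugation}
g = fixed_polynomial(H_orbit)         # x^2 - (zeta + zeta^{-1}) x + 1
```

The coefficients come out real (up to rounding), which is the numerical shadow of their lying in the fixed field \(\QQ(\zeta+\zeta\inv)\); and since \(\deg g = 2 = |H|\), the degree of \(\QQ(\zeta)\) over that fixed field is at most \(2\), as in the proof.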

The second half of the fundamental theorem describes how the Galois action moves subfields.

Theorem 5 (Fundamental Theorem of Galois Theory, Part II)

Suppose \(K/k\) is finite Galois, with Galois group \(G\). Let \(H\) be a subgroup of \(G\) and \(L=K_H\) the associated subfield. For any \(\sigma\) in \(G\) we have

\[\begin{align*} K_{H^\sigma} &= (K_H)^\sigma\\ \Gal(K/L)^\sigma &= \Gal(K/L^\sigma) \end{align*}\]

Proof. A series of equivalences:

\[\begin{align*} \alpha \in K_{H^\sigma} &\Leftrightarrow \alpha = \alpha^{\sigma \inv h \sigma}&\ \textrm{ for all }\ h\in H\\ &\Leftrightarrow \alpha^{\sigma \inv} = \alpha^{\sigma \inv h}&\ \textrm{ for all }\ h\in H\\ &\Leftrightarrow \alpha^{\sigma \inv} \in K_H\\ &\Leftrightarrow \alpha \in (K_H)^\sigma \end{align*}\]

For the second, let \(H = \Gal(K/L)\) and apply the FTGT Part I to this equality, \[\Gal(K/L)^\sigma = H^\sigma = \Gal(K/K_{H^\sigma}) = \Gal(K/L^\sigma).\]

We can use this to characterize the intermediate extensions that are Galois over \(k\) – they correspond to normal subgroups.

Theorem 6 (Fundamental Theorem of Galois Theory, Part III)

Let \(K/k\) be a finite Galois extension and \(K/L/k\) an intermediate extension. Then \(L/k\) is Galois if and only if \(\Gal(K/L)\) is normal in \(\Gal(K/k)\). Moreover, \(\Gal(L/k)\) is isomorphic to the quotient \(\Gal(K/k)/\Gal(K/L)\) with the isomorphism induced by the restriction from \(K\) to \(L\).

Proof. Since \(K/k\) is separable, the intermediate extension \(L/k\) is also separable, so being Galois is equivalent to being normal. In this case, we rely on (N2) with overfield \(K\): \(L/k\) is normal if and only if every embedding of \(L\) into \(K\) over \(k\) takes \(L\) to \(L\).

First suppose \(\Gal(K/L)\) is normal and let \(\sigma:L\to K\) be an embedding; by the NLL, \(\sigma\) extends to some automorphism \(\tilde \sigma\) of \(K\). Then \(L^\sigma = L\) if and only if \(L^{\tilde \sigma} = L\), which is equivalent, by FTGT Part II, to \(\Gal(K/L)^{\tilde \sigma} = \Gal(K/L)\). By assumption \(\Gal(K/L)\) is normal, so the latter does indeed hold.

Conversely, suppose \(L/k\) is normal and let \(\sigma \in \Gal(K/k)\). Since \(\sigma\) takes \(K\) to \(K\), it embeds \(L\) into \(K\), so by normality we must have \(L^\sigma = L\). Using Part II again, we see \[\Gal(K/L)^\sigma = \Gal(K/L^\sigma) = \Gal(K/L),\] so \(\Gal(K/L)\) is indeed normal.

Finally, normality tells us that restriction to \(L\) takes \(\Gal(K/k)\) to \(\Gal(L/k)\), because every embedding is an automorphism. The kernel consists of precisely the automorphisms in \(\Gal(K/k)\) which fix \(L\) pointwise, which is just \(\Gal(K/L)\), and the NLL tells us that the restriction is surjective, because any \(\sigma \in \Gal(L/k)\) lifts to \(K\).

Group Theory

Cyclic Groups

It is easy to prove that if a finite group is cyclic then it has at most one subgroup of each order (dividing its order, by Lagrange, in which case this subgroup exists). The converse is also true, and useful for studying fields.

This is an alternative proof of Corollary 8.10 which is, to my taste, somewhat simpler. The book’s proof by way of Lemma 8.11 is still quite interesting, though – one can use 8.11(b) as a starting point for proving the fundamental theorem of finite abelian groups. Compare also Theorem 9.57.

Note that field theory is what suggests an induction using an element of minimal order here, rather than maximal order, as in the book’s proof.

Theorem 7 Let \(G\) be a finite group of order \(n\). Suppose that \(G\) has at most \(1\) subgroup of each order \(d\leq n\). Then \(G\) is cyclic.

Proof. We proceed by induction on the order \(n\); if \(n\) is \(1\) or prime then we are done. Note that the hypothesis on \(G\) is inherited by subgroups, and also by quotients: subgroups of \(G/N\) of order \(d\) correspond to subgroups of \(G\) of order \(d|N|\) containing \(N\). Therefore, we may assume that all proper subgroups and all proper quotients of \(G\) are cyclic. It is also useful to observe that the assumption of the theorem implies every subgroup of \(G\) is normal: the conjugate of a subgroup has the same order, and hence coincides with the original.

Let \(g\) be a non-identity element of minimal order, so its order is a prime \(p\). By induction, \(G/\langle g \rangle\) is cyclic and therefore generated by the image of some \(h\) in \(G\), from which it follows that \(G\) is generated by \(g\) and \(h\). Let \(d\) be the order of \(h\).

If \(p\) divides the order of \(h\) then \(h^{d/p}\) has order \(p\), and so \(\langle g\rangle\) and \(\langle h^{d/p}\rangle\) are subgroups of \(G\) of the same order, which implies \(g\) is in \(\langle h \rangle\) and therefore \[G = \langle g,h \rangle = \langle h \rangle\] is cyclic.

Otherwise, \(p\) does not divide \(d\). By Lagrange, then, the intersection of \(\langle h\rangle\) and \(\langle g \rangle\) is trivial. Since the two subgroups are normal, this implies \(g\) and \(h\) commute and moreover that \[G = \langle g,h\rangle \cong \langle g \rangle \times \langle h \rangle.\] But the product of cyclic groups of coprime order is cyclic, so again \(G\) is cyclic!
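The last step uses the fact that a product of cyclic groups of coprime order is cyclic. A brute-force check on small cases is easy to write; the sketch below models \(\mathbb Z/a \times \mathbb Z/b\) additively and searches for an element of order \(ab\). The function names are hypothetical helpers, not anything from the book.

```python
def order_in_product(x, y, a, b):
    """Order of the element (x, y) in Z/a x Z/b, written additively."""
    k, cur = 1, (x % a, y % b)
    while cur != (0, 0):
        cur = ((cur[0] + x) % a, (cur[1] + y) % b)
        k += 1
    return k

def is_cyclic_product(a, b):
    """True when some element of Z/a x Z/b has order a*b."""
    return any(order_in_product(x, y, a, b) == a * b
               for x in range(a) for y in range(b))
```

For example, \((1,1)\) has order \(12\) in \(\mathbb Z/3 \times \mathbb Z/4\), so that product is cyclic, whereas \(\mathbb Z/2\times\mathbb Z/4\) has no element of order \(8\).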

Corollary 1 The multiplicative group \(\mathbb F_q^*\) of a finite field is cyclic. In particular, it has an element of order \(d\) if and only if \(d\) divides \(q-1\).

Exercise: adapt the argument to show that any finite multiplicative subgroup of a field is cyclic.

Proof. Let \(H\) be a subgroup of \(\mathbb F_q^*\) of order \(d\). By Lagrange’s theorem, every element of \(H\) is a root of \(x^d - 1\). That polynomial has at most \(d\) roots, and so its roots are precisely the subgroup \(H\) – so \(H\) is determined entirely by the integer \(d\), and hence \(\mathbb F_q^*\) has at most one subgroup of each order \(d\), according to whether or not \(x^d - 1\) splits completely.

Therefore, Theorem 7 implies the group \(\mathbb F_q^*\) is cyclic.
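For prime fields this is easy to test by brute force. The sketch below, restricted to \(\mathbb F_p\) with \(p\) prime and using helper names of my own choosing, tabulates how many elements of \((\mathbb Z/p\mathbb Z)^*\) have each order. In a cyclic group of order \(p-1\) one expects exactly \(\phi(d)\) elements of order \(d\) for each \(d\) dividing \(p-1\), a standard fact about cyclic groups that these notes do not prove.

```python
from math import gcd

def multiplicative_order(a, p):
    """Order of a in (Z/pZ)^*, assuming p is prime and p does not divide a."""
    k, x = 1, a % p
    while x != 1:
        x = (x * a) % p
        k += 1
    return k

def order_counts(p):
    """Map each order d to the number of elements of (Z/pZ)^* of that order."""
    counts = {}
    for a in range(1, p):
        d = multiplicative_order(a, p)
        counts[d] = counts.get(d, 0) + 1
    return counts

def euler_phi(n):
    """Euler's totient, by direct count."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)
```

With \(p = 13\), the orders that occur are exactly the divisors \(1, 2, 3, 4, 6, 12\) of \(12\), and \(2\) is one of the generators.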

Corollary 2 The Galois group of \(\mathbb F_{q^s}\) over \(\mathbb F_q\) is cyclic of order \(s\).

Proof. We have established that there is at most one finite field of each order contained in \(\mathbb F_{q^s}\), hence at most one extension of \(\mathbb F_q\) of each degree contained in \(\mathbb F_{q^s}\). Therefore, the Galois correspondence tells us that there is at most one subgroup of this Galois group of each order, and so Theorem 7 tells us that the Galois group is cyclic! Its order is the degree \(s\), since the extension is Galois.
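The smallest interesting case, \(\mathbb F_9\) over \(\mathbb F_3\), can be checked by hand or by machine. The sketch below models \(\mathbb F_9\) as \(\mathbb F_3[i]\) with \(i^2 = -1\) (valid because \(x^2+1\) has no root mod \(3\)), storing \(a + bi\) as the pair `(a, b)`. It uses the standard fact that the Frobenius map \(x \mapsto x^3\) is an automorphism fixing \(\mathbb F_3\); the representation and names are my own choices.

```python
P = 3  # F_9 is modelled as F_3[i] with i^2 = -1

def mul(u, v):
    """Multiply a + bi and c + di in F_3[i]."""
    (a, b), (c, d) = u, v
    return ((a * c - b * d) % P, (a * d + b * c) % P)

def power(u, n):
    """n-th power of u by repeated multiplication."""
    result = (1, 0)
    for _ in range(n):
        result = mul(result, u)
    return result

def frobenius(u):
    """The map x -> x^3 on F_9."""
    return power(u, P)

FIELD = [(a, b) for a in range(P) for b in range(P)]
FIXED = [u for u in FIELD if frobenius(u) == u]
```

The fixed elements are exactly the three with \(b = 0\), that is, \(\mathbb F_3\), and applying Frobenius twice gives the identity, so it generates a cyclic group of order \(2 = s\). As a bonus, \(1 + i\) has order \(8\), illustrating Corollary 1 for \(\mathbb F_9^*\).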

Affine Groups

Lots of interesting groups come from linear algebra. These two might be familiar:

Definition 4 The \(n\)-dimensional general linear group over \(R\), denoted \(\GL_n(R)\) or sometimes \(\GL(R,n)\) is the group of invertible linear maps from \(R^n\) to \(R^n\) under composition. Usually, \(R\) is a field, and so this is the group of isomorphisms from an \(n\)-dimensional vector space to itself.

The (\(n\)-dimensional) special linear group, denoted \(\SL_n(R)\) is the subgroup of \(\GL_n(R)\) of linear maps whose determinant is \(1\).

The (\(n\)-dimensional) projective special linear group, denoted \(\PSL_n(R)\), is the quotient of \(\SL_n(R)\) by the subgroup of scaling maps it contains (which is \(\pm 1\) when \(n = 2\)). When \(R\) is a field, scalar multiplication is a linear map, and the scaling maps form a normal subgroup, and if it is also algebraically closed (or closed under \(n\)th roots) then one can obtain \(\PSL_n(R)\) as the quotient of \(\GL_n(R)\) by the subgroup of scaling maps, hence the name “projective”.

These groups are built out of linear automorphisms. It’s natural to wonder about the affine automorphisms. Recall that a map \(A:R^n\to R^n\) is called affine if it is of the form \[A(x) = T(x) + y\] where \(T\) is a linear map and \(y\) is a fixed element of \(R^n\) depending only on \(A\).

Definition 5 The \(n\)-dimensional affine linear group over \(R\), denoted \(\AGL_n(R)\) or \(\AGL(R,n)\) is the group of invertible affine maps from \(R^n\) to \(R^n\) under composition.

Note: sometimes I prefer to compose from left-to-right, so that \[(A\circ B)(x) = B(A(x)).\] This better matches my right-action convention for field automorphisms.

Here are some useful facts about \(\AGL\) that you should verify:

  • The “pure translations”, maps of the form \(A(x) = x+y\), form a normal subgroup.
  • The “pure transformations”, maps of the form \(A(x) = T(x)\) with \(T\) linear, form a subgroup which is not usually normal.
  • The translation and transformation subgroups intersect trivially, and generate \(\AGL\).
  • The quotient of \(\AGL\) by the subgroup of translations is isomorphic to \(\GL\), and this map is an isomorphism when restricted to the transformation subgroup.

The most important case for us is when \(R=\mathbb Z/n\mathbb Z\) and the dimension is \(1\). In this case, you can verify that \[|\AGL_1(\mathbb Z/n\mathbb Z)| = n \phi(n).\] Often, we’ll just write \(\AGL(\mathbb Z/n\mathbb Z)\) or even \(\AGL(n)\) for this group.

When \(n=p\) is prime, the linear maps are scalar multiplication by \((\mathbb Z/p\mathbb Z)^*\), which is a cyclic group. If \(a\) is any generator of the multiplicative group, then we can also describe \(\AGL(p)\) by generators and relations: \[\AGL(p) \cong \langle \sigma,\tau : \sigma^p = \tau^{p-1} = \id, \tau\inv \sigma \tau = \sigma^a\rangle.\] This identifies \(\sigma\) with \(x+1\) and \(\tau\) with \(ax\) (composed correctly).
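Both the order formula and the relations can be checked by machine. In the sketch below, an affine map \(x \mapsto ax + b\) on \(\mathbb Z/n\mathbb Z\) is stored as the pair `(a, b)`, and composition follows the convention \((A\circ B)(x) = B(A(x))\), i.e. apply \(A\) first. The encoding and the function names are assumptions for illustration only.

```python
from math import gcd

def compose(A, B, n):
    """Compose affine maps on Z/nZ in the notes' convention:
    apply A first, then B, so x -> a2*(a1*x + b1) + b2."""
    (a1, b1), (a2, b2) = A, B
    return ((a1 * a2) % n, (a2 * b1 + b2) % n)

def affine_group(n):
    """All invertible affine maps of Z/nZ: the slope a must be a unit mod n."""
    return [(a, b) for a in range(n) if gcd(a, n) == 1 for b in range(n)]

def affine_power(A, k, n):
    """k-fold composite of A with itself."""
    result = (1, 0)
    for _ in range(k):
        result = compose(result, A, n)
    return result

def inverse(A, n):
    """Inverse of x -> a*x + b, namely x -> a^{-1}*x - a^{-1}*b."""
    a, b = A
    a_inv = pow(a, -1, n)
    return (a_inv, (-a_inv * b) % n)

p, a = 7, 3                      # 3 generates (Z/7Z)^*
sigma, tau = (1, 1), (a, 0)      # sigma is x + 1, tau is a*x
conjugate = compose(compose(inverse(tau, p), sigma, p), tau, p)
```

With \(p = 7\) and \(a = 3\), `conjugate` equals `affine_power(sigma, a, p)`, which is the relation \(\tau\inv\sigma\tau = \sigma^a\). Counting `affine_group(12)` gives \(48 = 12\,\phi(12)\).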

Sylow Theorems

There are some fun applications of Sylow’s theorems, so a proof is included here.

Given a finite group \(G\) and prime \(p\), we can factor \(|G| = p^r m\) with \(m\) not divisible by \(p\). We say that a subgroup of \(G\) is a \(p\)-Sylow subgroup if it has order \(p^r\).

Theorem 8 Let \(p\) be a prime and \(G\) a finite group of order \(p^rm\) where \(p\) does not divide \(m\).

(Sylow I) \(G\) has a subgroup of order \(p^r\). In other words, it has a \(p\)-Sylow subgroup.

(Sylow II) Let \(P\) be a Sylow subgroup of \(G\). Every subgroup of order \(p^k\) in \(G\) is contained in a conjugate of \(P\). In particular, any two Sylow subgroups are conjugate.

(Sylow III) Let \(n_p\) be the number of subgroups of \(G\) of order \(p^r\). Then \(n_p = 1\) mod \(p\) and \(n_p\) divides \(m\) (in fact, \(n_p = [G:N_G(H)]\) for any subgroup \(H\) of order \(p^r\)).

(Sylow IV)

Proof. (Sylow I) This has three main steps. Say that a group has the Sylow property if it satisfies Sylow I. First, we’ll show that if \(S\) has the Sylow property and \(G\) is a subgroup of \(S\), then \(G\) also has the Sylow property. Next, we’ll show that every finite group \(G\) embeds into \(\GL_n(\FF_p)\) for a large enough \(n\). Finally, we show that \(\GL_n(\FF_p)\) has the Sylow property.

  1. Let \(S\) be a group with the Sylow property, witnessed by a subgroup \(P\). Let \(G\) act on the cosets of \(P\) by translation. The stabilizer of a coset \(sP\) consists of \[\begin{align*} \{g \in G\ :\ gsP = sP\} &= \{g\in G\ :\ s\inv g s \in P\}\\ &= \{g\in G\ : g\in sPs\inv \}\\ &= G\cap (sPs\inv). \end{align*}\] So by Lagrange the stabilizers all have order dividing \(|sPs\inv| = |P|\), a power of \(p\). The orbit-stabilizer theorem tells us that \(|S/P|\) is the sum, over the orbits, of the index in \(G\) of a stabilizer. Now \(|S/P|\) is prime to \(p\) by assumption, so it is not possible for all the stabilizers to have index divisible by \(p\), and hence at least one stabilizer has index prime to \(p\). Such a stabilizer is a \(p\)-group whose index in \(G\) is prime to \(p\), so its order is the full power of \(p\) dividing \(|G|\).

  2. This is a mild adaptation of Cayley’s theorem: label a basis of \(\FF_p^{|G|}\) by the elements of \(G\) and let \(G\) act by permuting indices. This is clearly an invertible linear action and it induces an injective homomorphism from \(G\) to \(\GL_{|G|}(\FF_p)\).

  3. By a straightforward induction, you can show that \(|\GL_n(\FF_p)| = p^{\binom{n}{2}} m\) with \(m\) not divisible by \(p\) (hint: count bases). It’s easy to see that the group of upper-triangular matrices with \(1\)s on the diagonal has order \(p^{\binom{n}{2}}\). Note: it’s interesting that this gives us a simultaneous triangularization of a Sylow subgroup whenever \(G\) embeds into \(\GL_n(\FF_p)\) (all Sylow subgroups, once we prove Sylow II).
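The hint in step 3 (count bases) gives the product formula \(|\GL_n(\FF_p)| = \prod_{k=0}^{n-1}(p^n - p^k)\), and the power of \(p\) dividing it can be checked numerically. This is a small sketch with illustrative function names.

```python
from math import comb

def gl_order(n, p):
    """|GL_n(F_p)| by counting ordered bases.

    The k-th basis vector must avoid the span of the previous k vectors,
    which contains p^k elements.
    """
    order = 1
    for k in range(n):
        order *= p**n - p**k
    return order

def p_part(x, p):
    """Largest power of p dividing the positive integer x."""
    power = 1
    while x % p == 0:
        x //= p
        power *= p
    return power
```

For instance, `p_part(gl_order(3, 2), 2)` should agree with `2 ** comb(3, 2)`, the order of the group of unitriangular \(3\times 3\) matrices over \(\FF_2\).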

(Sylow II) Let \(H\) be a subgroup of order \(p^k\), and have it act on the cosets of \(P\) by translation. Every stabilizer has index dividing \(p^k\), and the action is on a set of order not divisible by \(p\), so it follows that there is at least one fixed point for the action, meaning a coset \(xP\) such that \(hxP = xP\) for all \(h\) in \(H\), which we saw above means \(H\) is contained in \(x P x\inv\), a Sylow subgroup. If \(H\) is Sylow, it has the same order as \(P\) and its conjugates, so containment implies equality.

(Sylow III) The orbit-stabilizer theorem already tells us that the number of conjugates of \(H\) is \([G:N_G(H)]\), which divides \([G:H] = m\). Also, Sylow II tells us that \(H\) must be the unique subgroup of order \(p^r\) in \(N_G(H)\). Now let \(H\) act on the cosets of \(N_G(H)\) by translation. Since \(H\) is a \(p\)-group, we know that \([G:N_G(H)]\) is congruent to the number of fixed points of this action mod \(p\). But notice \[haN_G(H) = aN_G(H)\] for all \(h\in H\) if and only if \(H^a \subseteq N_G(H)\). But \(|H^a| = |H|\), so by our uniqueness observation, \(H^a = H\), in which case \(a\in N_G(H)\) so \(aN_G(H) = N_G(H)\). Therefore, only one coset is fixed, hence \(n_p = 1\) mod \(p\).
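Sylow III can be observed by brute force in small symmetric groups. The sketch below only handles the simplest situation, where \(p\) divides \(n!\) exactly once, so that the Sylow subgroups of \(S_n\) are the subgroups of order \(p\); the function name is illustrative.

```python
from itertools import permutations

def count_subgroups_of_prime_order(n, p):
    """Number of subgroups of prime order p in the symmetric group S_n.

    Each such subgroup is cyclic with p - 1 generators, so we count the
    elements of order p and divide by p - 1.
    """
    identity = tuple(range(n))
    elements_of_order_p = 0
    for g in permutations(range(n)):
        if g == identity:
            continue
        # Compute g^p; for prime p, g != id and g^p == id means g has order p.
        h = identity
        for _ in range(p):
            h = tuple(g[i] for i in h)
        if h == identity:
            elements_of_order_p += 1
    return elements_of_order_p // (p - 1)
```

In \(S_4\) with \(p = 3\) we have \(|G| = 3\cdot 8\), and in \(S_5\) with \(p = 5\) we have \(|G| = 5 \cdot 24\); in both cases the count is congruent to \(1\) mod \(p\) and divides \(m\).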

An alternative proof is based on Cauchy’s theorem (if \(p\) divides \(|G|\) then \(G\) has a subgroup of order \(p\)) and an orbit-stabilizer argument that if \(H\) is a \(p\)-group then \([N_G(H):H]\) is divisible by \(p\) as long as \(p\) divides \([G:H]\). One could observe that our proof of Sylow I implies Cauchy (or adapt it). The advantage of this approach is that it proves every \(p\)-subgroup of \(G\) is contained in a Sylow subgroup. This could also be proven along the lines above, though.

One of the original proofs of Sylow I was similar to the above, but used \(S_n\) instead of \(\GL_n(\FF_p)\) – in fact, it was Cauchy who found that subgroup of \(S_n\)!

Using the same counting ideas, one can show that a \(p\)-group always has a nontrivial center, from which a lot of other facts follow.

Theorem 9 Let \(P\) be a group of order \(p^r\) for some prime \(p\). The center of \(P\) is nontrivial.

The center has a subgroup of order \(p\), necessarily normal in \(P\). It follows by induction that there’s a tower \(\{\id\} = P_0 \subseteq P_1 \subseteq ... \subseteq P_r = P\) of subgroups of \(P\) such that \(P_i\) is normal of index \(p\) in \(P_{i+1}\).

Therefore, \(P\) is solvable, and if \(P\) is nontrivial its commutator subgroup is a proper subgroup.

Note that we could instead use our proof of Sylow I to construct the normal tower by explicitly writing one down for the subgroup of upper-triangular-diagonal-one matrices – if you let \(U_{n,k}\) be the subgroup of them with the first \(k\) off-diagonals all zero, you can show without too much difficulty that \(U_{n,k}\) is normal in \(U_{n,k-1}\) and that the quotient is isomorphic to \((\ZZ/p\ZZ)^{n-k}\), one factor for each entry of the \(k\)th off-diagonal. In fact, \(U_{n,k}\) is normal in \(U_n\)…!

Solvable Groups

Solvable groups split up into a tower of abelian quotients:

Definition 6 Let \(G\) be a group. We say \(G\) is solvable if there is a normal series (normal tower) \[\{\id\} = G_0 \triangleleft G_1 \triangleleft ... \triangleleft G_{n-1} \triangleleft G_n = G\] where each quotient \(G_{i+1}/G_i\) is abelian.

This property behaves well with subgroups and quotients:

Theorem 10 If \(G\) is solvable then every subgroup \(H\) of \(G\) and quotient \(G/N\) of \(G\) is solvable.

Proof. Since \(G\) is solvable, take a normal tower \[\{\id\} = G_0 \triangleleft G_1 \triangleleft ... \triangleleft G_{n-1} \triangleleft G_n = G\] with abelian quotients.

Consider the subgroups \(NG_i\) of \(G\) containing \(N\). It is straightforward to verify that \(NG_{i+1}\) normalizes \(NG_i\). Moreover, there is a natural surjection from \(G_{i+1}/G_i\) onto \(NG_{i+1}/NG_i\), and so each quotient \(NG_{i+1}/NG_i\) is abelian. Then, the subgroup-quotient-isomorphism correspondence tells us that \(NG_i/N\) is a normal subgroup of \(NG_{i+1}/N\) and moreover \[(NG_{i+1}/N)/(NG_i/N) \cong NG_{i+1}/NG_i.\] So the tower of \(NG_i/N\) in \(G/N\) is a normal tower with abelian quotients.

For a subgroup, carry out a similar argument with \(H_i = G_i\cap H\), where you’ll use the natural inclusion of \(H_{i+1}/H_i\) into \(G_{i+1}/G_i\).

In fact, we can check solvability on pieces:

Corollary 3 Let \(G\) be a group and \(N\) a normal subgroup. Then \(G\) is solvable if and only if \(N\) and \(G/N\) are solvable.

Proof. The forward direction follows directly from the theorem. Conversely, suppose \(N\) and \(G/N\) are solvable. Using the same subgroup-quotient-correspondence, we can take a normal tower in \(G/N\) and lift it back to a normal tower of subgroups of \(G\) containing \(N\), with all quotients abelian except possibly the bottom quotient \(N/\{\id\}\). Fortunately, \(N\) is solvable, so you can append the solvable tower for \(N\) to get an entire solvable tower for \(G\).

Corollary 4 Let \(G\) be a group. The following are equivalent:

\(G\) has a normal tower with successive quotients

  1. solvable,
  2. abelian, (definition of solvable)
  3. cyclic,
  4. of prime order.

Proof. Clearly (4) implies (3) implies (2), and induction on Corollary 3 says (2) is equivalent to (1). Assuming (2), we would like to “refine” the tower into form (4).

We will induct on the order of the solvable group. Start from a tower \[\{\id\} = G_0 \triangleleft G_1 \triangleleft ... \triangleleft G_{n-1} \triangleleft G_n = G\]

Let \(N\) be the last \(G_i\) which is a proper subgroup of \(G\). This gives us a short tower \[\{\id\} \triangleleft N \triangleleft G.\]

If \(N\) is nontrivial, then \(|N|\) and \(|G/N|\) are both strictly less than \(|G|\), so we can find normal towers in each of \(N\) and \(G/N\) whose quotients have prime order; combine the \(N\) and \(G/N\) towers as in Corollary 3 to a tower for \(G\) with prime-order quotients.

Otherwise, \(N\) is trivial and so it must be that \(G\cong G/N\) is abelian. In this case, let \(g\) be any nontrivial element of \(G\). Taking a power of \(g\) if necessary, we may assume it has prime order. Then apply the argument to the short tower \[\{\id\} \triangleleft \langle g \rangle \triangleleft G.\]

Here, the fact that \(G\) is abelian is what tells us \(\langle g \rangle\) is a normal subgroup. The only special case is \(G = \langle g \rangle\), in which case there is nothing to do; \(G\) is a cyclic group of prime order, so \(\{\id\} \triangleleft G\) is already the desired tower.
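To experiment with solvability in small cases, one standard way to look for a tower with abelian quotients is the derived series \(G \supseteq [G,G] \supseteq [[G,G],[G,G]] \supseteq \dots\). That technique is not the one used in these notes; it is swapped in here because it is easy to compute. Each quotient in the series is abelian, so the series reaching the trivial group gives a tower as in the definition. The sketch below works inside \(S_n\) by brute force, and all names are illustrative.

```python
from itertools import permutations

def compose(g, h):
    """The permutation x -> g[h[x]]."""
    return tuple(g[i] for i in h)

def inverse(g):
    inv = [0] * len(g)
    for i, image in enumerate(g):
        inv[image] = i
    return tuple(inv)

def generated_subgroup(generators, n):
    """Closure of a set of permutations of range(n) under composition."""
    identity = tuple(range(n))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def derived_series_orders(n):
    """Orders of S_n, [S_n, S_n], ... until the series stops shrinking."""
    group = set(permutations(range(n)))
    orders = [len(group)]
    while True:
        commutators = {compose(compose(g, h), compose(inverse(g), inverse(h)))
                       for g in group for h in group}
        derived = generated_subgroup(commutators, n)
        if len(derived) == len(group):
            return orders
        group = derived
        orders.append(len(group))
```

The group is solvable exactly when the last order in the returned list is \(1\): this happens for \(S_3\) and \(S_4\), while for \(S_5\) the series stalls at the alternating group.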

Assorted Invariants

In this section, we assume every extension/element/polynomial is separable. This isn’t strictly necessary but spares us a few technicalities.

Discriminant

The discriminant is extremely important in number theory. For us, it determines whether or not a Galois group possesses odd permutations, when identified with a subgroup of \(S_n\).

Definition 7 Let \(\alpha\) be algebraic over \(k\) with minimal polynomial \(f(x) \in k[x]\). Enumerate the roots of \(f\) as \(\alpha=\alpha_1,\alpha_2,...,\alpha_d\). Then the discriminant of \(\alpha\) (over \(k\)) is

\[D(\alpha) = (-1)^{\binom{d}{2}}\prod_{i\neq j} (\alpha_i - \alpha_j)\]

Swapping \(i\) and \(j\) will change the sign of \(\alpha_i - \alpha_j\), so this is a start at identifying the odd permutations, but it’ll also change the sign of \(\alpha_j-\alpha_i\). This is intentional, however, because it ensures \(D\) is in the ground field, giving us the following test for odd permutations:

Proposition 1 Given \(\alpha\) algebraic over \(k\), its discriminant \(D(\alpha)\) is in \(k\). Moreover, \(x^2=D\) has a solution in \(k\) if and only if every element of the Galois group of the splitting field of \(\irr_{\alpha,k}\) is even as a permutation of the conjugates of \(\alpha\).

Proof. Work in the splitting field \(K/k\), and let \(G\) be its Galois group. We’ll show that \(D=D(\alpha)\) is fixed by every \(\sigma \in G\), which means it is in \(k\) by the Galois correspondence. Indeed:

\[D^\sigma = (-1)^{\binom{d}{2}}\prod_{i\neq j} (\alpha_i^\sigma - \alpha_j^\sigma)\]

which has all the same terms as \(D(\alpha)\), just in a possibly different order.

The following is a square root of \(D\)

\[\delta = \prod_{i<j} (\alpha_i - \alpha_j),\]

which you can see by writing out the product on a grid and factoring out \(\binom d 2\) minus signs below the diagonal.

This square root will be in \(k\) if and only if it’s fixed by every \(\sigma\) in the Galois group. We are identifying \(G\) with a subgroup of \(S_d\) by its action on the \(\alpha_i\), and applying \(\sigma\) rearranges the product and introduces negative signs any time it reverses the order of \(\alpha_i\) and \(\alpha_j\). The number of such inversions is exactly what identifies a permutation as even or odd.

Therefore, \(\delta^\sigma = -\delta\) if and only if \(\sigma\) is an odd permutation. Since \(\delta \neq 0\), we have \(-\delta \neq \delta\) as long as the characteristic is not \(2\), and so \(\delta \not \in k\) if and only if \(G\) has odd permutations.

You probably remember hearing the word discriminant when you learned about quadratic polynomials many years ago. For a minimal polynomial of the form \(x^2 + bx + c\), the discriminant is \[b^2 - 4c\] which is the one you’re familiar with. To calculate it, let \(\alpha_1\) and \(\alpha_2\) be the roots. Expanding \((x-\alpha_1)(x-\alpha_2)\) and comparing, we see that \(\alpha_1 + \alpha_2 = -b\) and \(\alpha_1\alpha_2 = c\).

\[\begin{align*} D &= -(\alpha_1 - \alpha_2)(\alpha_2-\alpha_1) \\ &= \alpha_1^2 + \alpha_2^2 - 2\alpha_1\alpha_2\\ &= -b\alpha_1 - c - b\alpha_2 - c - 2c\\ &= -b(\alpha_1 + \alpha_2) - 4c\\ &= b^2 - 4c\\ \end{align*}\]

Another that you’re less likely to have seen is the discriminant of \(x^3 + ax + b\). These are called depressed cubics, and any cubic can be made depressed by the substitution \(x\to x-c/3\) if \(c\) is the coefficient of \(x^2\). This discriminant is

\[-4a^3 - 27b^2,\]

which can be determined by a straightforward but lengthy calculation.
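The lengthy calculation can at least be spot-checked with exact integer arithmetic. Since the formula is a polynomial identity in the roots, the sketch below picks integer roots summing to zero (so the cubic is reducible, which does not matter for the identity) and compares the definition with \(-4a^3 - 27b^2\). The function names are illustrative.

```python
def depressed_cubic_from_roots(r1, r2):
    """Coefficients (a, b) of x^3 + a x + b with roots r1, r2, r3 = -(r1 + r2)."""
    r3 = -(r1 + r2)  # the roots sum to zero because there is no x^2 term
    a = r1 * r2 + r1 * r3 + r2 * r3
    b = -r1 * r2 * r3
    return a, b, (r1, r2, r3)

def discriminant_from_roots(roots):
    """(-1)^C(d,2) times the product of (r_i - r_j) over ordered pairs i != j."""
    d = len(roots)
    product = 1
    for i in range(d):
        for j in range(d):
            if i != j:
                product *= roots[i] - roots[j]
    return (-1) ** (d * (d - 1) // 2) * product
```

For the roots \(1, 2, -3\) this gives \(a = -7\), \(b = 6\), and both expressions evaluate to \(400\).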

The discriminant of a cubic lets us quickly identify some Galois groups.

Proposition 2 Let \(f(x) = x^3 + ax + b\) be a polynomial in \(k[x]\). If \(f\) is irreducible, then the Galois group of its splitting field is isomorphic to \(S_3\) if \(-4a^3 - 27b^2\) is not a square, and to \(A_3\) otherwise.

Proof. Since it’s irreducible, the Galois group is a subgroup of \(S_3\) of order at least \(3\), so the only possibilities are \(A_3\) and \(S_3\). These are distinguished precisely by the presence of an odd permutation, which is detected by that discriminant.

This suggests that irreducible cubics with Galois group smaller than \(S_3\) are rare. The values of \(-4a^3 -27b^2\) are fairly sparse, and the set of squares is sparse too, so it would take an unusual coincidence for them to overlap. There are examples, though, such as \[x^3 - 3x + 1\] over \(\mathbb Q\). It is irreducible by Eisenstein at \(p=3\) after the substitution \(x \to x-1\), which turns it into \(x^3 - 3x^2 + 3\), and its discriminant is \(81 = 9^2\), so its Galois group is cyclic of order 3.
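Proposition 2 turns into a short procedure over \(\mathbb Q\) when \(a\) and \(b\) are integers. The sketch below assumes integer coefficients and uses the rational root theorem for the irreducibility check (a cubic is reducible exactly when it has a root); the function name and the string labels are illustrative.

```python
from math import isqrt

def galois_group_of_depressed_cubic(a, b):
    """Galois group over Q of x^3 + a x + b for integers a, b, or None if reducible."""
    # A monic integer cubic is reducible over Q exactly when it has an integer
    # root, and any integer root divides the constant term b.
    if b == 0:
        return None  # x = 0 is a root
    for r in range(1, abs(b) + 1):
        if b % r == 0:
            for root in (r, -r):
                if root**3 + a * root + b == 0:
                    return None
    disc = -4 * a**3 - 27 * b**2
    is_square = disc >= 0 and isqrt(disc) ** 2 == disc
    return "A3" if is_square else "S3"
```

Running it on \(x^3 - 3x + 1\) and \(x^3 - 7x + 7\) (discriminants \(81\) and \(49\)) finds cyclic groups, while \(x^3 - 2\) (discriminant \(-108\)) gives \(S_3\).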

On the other hand, we do know how to manufacture many such cubics: if \(p=1\) mod \(3\), then \(\mathbb Q(\zeta_p)\) is a cyclic extension of degree divisible by \(3\), and hence its Galois group has a subgroup of index \(3\). The fixed field of that subgroup is a cyclic extension of \(\mathbb Q\) of degree \(3\), so the minimal polynomial of any generator for that extension is irreducible with square discriminant.

In fact, every cyclic cubic extension of \(\mathbb Q\) arises as a subfield of some cyclotomic field \(\mathbb Q(\zeta_n)\), a consequence of the Kronecker-Weber theorem (one of the most delightful early results in algebraic number theory).

Norm and Trace

The norm and trace introduce some linear-algebraic invariants of a field extension.

Definition 8 Given an extension \(K/k\), we define the norm from \(K\) to \(k\) and the trace from \(K\) to \(k\) as the following functions from \(K\) to \(k\):

\[\begin{align*} N^K_k (\alpha) &= \prod_{\sigma:K\to \tilde K} \alpha^\sigma\\ Tr^K_k (\alpha) &= \sum_{\sigma:K\to \tilde K} \alpha^\sigma\\ \end{align*}\]

where the sum and product are over distinct embeddings of \(K\) into its Galois closure (or into any normal extension containing \(K\) - like the algebraic closure).

These expressions are symmetric, so we can use Galois theory to show that they’re in \(k\).

Proposition 3 The norm is a homomorphism from \(K^*\) to \(k^*\) and the trace is a homomorphism from \(K^+\) to \(k^+\). If \(\alpha \in k\), then \[N^K_k(\alpha) = \alpha^{[K:k]}\ \ \ \ \textrm{ and }\ \ \ \ Tr^K_k(\alpha) = [K:k]\alpha.\]

Both \(N^K_k\) and \(Tr^K_k\) take values in \(k\).

Proof. Since the \(\sigma\) are field homomorphisms, the maps are clearly homomorphisms of the respective groups. When \(\alpha \in k\), we know \(\alpha^\sigma = \alpha\) for all \(\sigma\), so the sum/product is \([K:k]\) copies of the same element.

For the last claim, we will use the Galois correspondence in the same way we did for the discriminant. Let \(\tau\in \Gal(\tilde K/k)\) and examine

\[N^K_k (\alpha)^\tau = \prod_{\sigma:K\to \tilde K} \alpha^{\sigma\tau}.\]

Each \(\sigma\tau\) is still an embedding of \(K\) into \(\tilde K\), and they’re distinct because \(\Gal(\tilde K/k)\) is a group. Therefore we get the same product, just reordered. The same reasoning applies to the trace.

In terms of Galois theory, embeddings of \(K\) into \(\tilde K\) correspond to cosets of \(\Gal(\tilde K/K)\).

If \(K=k(\alpha)\) then the LLs tell us that embeddings \(\sigma\) correspond to choices of conjugates of \(\alpha\). The Galois group just permutes conjugates, and it leaves the product and sum of all the conjugates invariant.

That also gives us a concrete way to calculate the norm and trace in this situation: the minimal polynomial is \[f(x) = \prod_{\sigma:K\to \tilde K} (x - \alpha^\sigma)\] so the constant term is \((-1)^{[K:k]}N^K_k(\alpha)\) and the coefficient of the second highest degree term is \(-Tr^K_k(\alpha)\). Be careful that this is only for the case \(K=k(\alpha)\).

Lemma 5 If \(K/L/k\) is a tower, then we have

\[\begin{align*} N^K_k(\alpha) &= N^L_k \circ N^K_L (\alpha),\\ Tr^K_k(\alpha) &= Tr^L_k \circ Tr^K_L (\alpha). \end{align*}\]

Proof. Let \(G = \Gal(\tilde K/k)\), \(H_K = \Gal(\tilde K/K)\), and \(H_L = \Gal(\tilde K/L)\). We have a tower, so \(H_K\subseteq H_L\). As was pointed out above, each embedding of \(K\) into \(\tilde K\) over \(k\) corresponds to a coset of \(G/H_K\), and an embedding of \(K\) into \(\tilde K\) over \(L\) corresponds to a coset of \(H_L/H_K\). Cosets behave well in towers, so if we pick representatives \(\sigma_1,...,\sigma_n\) for \(G/H_L\) and \(\tau_1,...,\tau_m\) for \(H_L/H_K\), we know that \(\tau_j\sigma_i\) are representatives for \(G/H_K\). Then it’s just a matter of regrouping the product

\[\begin{align*} N^K_k(\alpha) &= \prod_{g \in G/H_K} \alpha^g\\ &= \prod_{\tau_j\sigma_i \in G/H_K} \alpha^{\tau_j\sigma_i}\\ &= \prod_{\sigma_i} \left(\prod_{\tau_j} \alpha^{\tau_j}\right)^{\sigma_i}\\ &= \prod_{\sigma_i} N^K_L(\alpha) ^{\sigma_i}\\ &= N^L_k(N^K_L(\alpha))\\ \end{align*}\]

The trace is identical.

Proposition 4 Consider an extension \(K/k\). Each \(\alpha \in K\) gives rise to a map

\[\begin{align*} m_\alpha:K &\to K\\ x &\mapsto \alpha x \end{align*}\]

This map is \(k\)-linear. Moreover, \(N^K_k(\alpha) = \det m_\alpha\) and \(\Tr^K_k(\alpha) = Tr m_\alpha\).

Proof. First suppose \(K=k(\alpha)\). Notice that for any polynomial \(p(x)\in k[x]\), we have that \[p(m_\alpha) = m_{p(\alpha)}\] and therefore the minimal polynomial of \(\alpha\) over \(k\) is also the minimal polynomial of the linear operator \(m_\alpha\). Comparing degrees, we see that it coincides with the characteristic polynomial of \(m_\alpha\), so the claim follows from the relationship between trace/norm and the roots of the minimal polynomial, which we now know to be the conjugates of \(\alpha\).

If \(K\neq k(\alpha)\) then we have a tower \(K/k(\alpha)/k\). By Lemma 5, we know

\[N^K_k(\alpha) = N^{k(\alpha)}_k(N^K_{k(\alpha)}(\alpha))\]

Since \(\alpha \in k(\alpha)\), the inner term is \(\alpha^{[K:k(\alpha)]}\), then we use multiplicativity of norm and the fact just proven to obtain

\[N^K_k(\alpha) = (\det M_\alpha )^{[K:k(\alpha)]},\]

where \(M_\alpha\) is the multiplication-by-\(\alpha\) map on \(k(\alpha)\), not \(m_\alpha\), the multiplication-by-\(\alpha\) map on \(K\).

To conclude, observe that using the basis from proving the multiplicativity in degree for towers puts \(m_\alpha\) in block diagonal form, with \([K:k(\alpha)]\) copies of \(M_\alpha\) along the diagonal. Therefore,

\[\det m_\alpha = (\det M_\alpha)^{[K:k(\alpha)]}\]

which, combined with the previous equation, is what we set out to show.

As usual, the argument for the trace is similar.

Upon picking a basis for \(K/k\), you can write a matrix representing \(m_\alpha\) and use this to calculate the norm and trace without having to find field embeddings.
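As an illustration, take \(K = \mathbb Q(t)\) with \(t^3 = 2\) and the basis \(1, t, t^2\); this choice of field and all function names below are assumptions made for the example. The sketch writes down the matrix of \(m_\alpha\) and reads off the norm and trace as its determinant and trace.

```python
def multiplication_matrix(a, b, c):
    """Matrix of x -> alpha*x on Q(t), t^3 = 2, for alpha = a + b t + c t^2.

    Columns hold the coordinates of alpha*1, alpha*t, alpha*t^2
    in the basis 1, t, t^2.
    """
    return [[a, 2 * c, 2 * b],
            [b, a,     2 * c],
            [c, b,     a]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def norm(alpha):
    return det3(multiplication_matrix(*alpha))

def trace(alpha):
    m = multiplication_matrix(*alpha)
    return m[0][0] + m[1][1] + m[2][2]

def multiply(alpha, beta):
    """Coordinates of alpha*beta: apply the matrix of alpha to the vector beta."""
    m = multiplication_matrix(*alpha)
    return tuple(sum(m[i][j] * beta[j] for j in range(3)) for i in range(3))
```

For \(\alpha = t\) this gives norm \(2\) and trace \(0\), matching the constant and \(x^2\) coefficients of \(x^3 - 2\), and `norm` is multiplicative, as Proposition 3 predicts.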

The discriminant can be realized as a norm, which sometimes makes it easy to calculate.

Proposition 5 Let \(\alpha\) be algebraic over \(k\) with minimal polynomial \(f\) of degree \(d\). The discriminant of \(\alpha\) is a norm:

\[D(\alpha) = (-1)^{\binom d 2} N^{k(\alpha)}_k (f'(\alpha)).\]

Proof. Writing \(f(x) = \prod (x - \alpha^\sigma)\) and expanding \(f'(x)\) by the product rule, we can see that \[f'(\alpha) = \prod_{\sigma \neq \id} (\alpha - \alpha^\sigma)\]

Therefore, the product of all \(f'(\alpha)^\tau\) is the product of the differences \(\alpha' - \alpha''\) over all ordered pairs of distinct conjugates of \(\alpha\). This is precisely the product appearing in the definition of the discriminant.

Corollary 5 Let \(\zeta\) be a primitive \(p\)th root of unity for \(p\) an odd prime. The discriminant of \(\zeta\) is

\[ (-1)^{\binom {p-1}2} p^{p-2}\]

Proof. Several times we use that \(p-1\) is even when \(p\) is odd. The minimal polynomial of \(\zeta\) is

\[\frac{x^p - 1}{x - 1},\]

which has derivative

\[f'(x) = \frac{p(x-1)x^{p-1} - (x^p-1)}{(x-1)^2}\]

Evaluated at \(\zeta\), this simplifies to

\[f'(\zeta) = \frac{p \zeta^{p-1}}{\zeta - 1}.\]

Taking the norm and using multiplicativity, together with \(N(\zeta) = (-1)^{p-1}\), we get

\[N(f'(\zeta)) = \frac{p^{p-1} (-1)^{(p-1)^2}}{N(\zeta - 1)}\]

To simplify the remaining norm, notice that \[N(\zeta-1) = (-1)^{p-1} N(1-\zeta) = \prod (1 - \zeta^\sigma)\] is the same as \(f(1) = p\) (which we used when proving that \(\Phi_p\) is Eisenstein!). Using the proposition, we conclude

\[D(\zeta) = (-1)^{\binom {p-1} 2} p^{p-2}\]

A very useful consequence is that the discriminant is not a square in \(\mathbb Q\), and \(\mathbb Q(\zeta)\) contains \(\sqrt {\pm p}\) where the sign is \(+\) if \(p=1\) mod \(4\) and \(-\) if \(p=3\) mod \(4\). This leads to a very natural proof of quadratic reciprocity once you learn a little algebraic number theory. Even ignoring the congruence condition, we see that any extension containing \(i\) and \(\zeta_p\) contains \(\sqrt p\), so \(\QQ(\zeta_{4p})\) has a square root of \(p\).
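The formula for \(D(\zeta)\) can be compared against the definition of the discriminant numerically. This is only a floating-point sketch for small odd primes (the rounding step would eventually fail for large \(p\)), and the function name is illustrative.

```python
import cmath

def cyclotomic_discriminant(p):
    """Numerically evaluate D(zeta) for a primitive p-th root of unity, p an odd prime."""
    # The conjugates of zeta are zeta^k for k = 1, ..., p - 1.
    roots = [cmath.exp(2j * cmath.pi * k / p) for k in range(1, p)]
    d = len(roots)
    product = 1
    for i in range(d):
        for j in range(d):
            if i != j:
                product *= roots[i] - roots[j]
    value = (-1) ** (d * (d - 1) // 2) * product
    return round(value.real)
```

For \(p = 3, 5, 7\) the expected values are \(-3\), \(125\), and \(-7^5\), with the sign following \(p\) mod \(4\) as described above.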

Independence of Characters

The trace map gives rise to a bilinear pairing from \(K\times K\) to \(k\), \[\langle x,y \rangle = Tr^K_k(xy).\]

Any pairing induces a linear map to the dual, and that map is an isomorphism when the pairing is perfect. Even though any finite-dimensional vector space is isomorphic to its dual for dimension reasons, the map is not canonical. Getting such an isomorphism from a natural object like the trace would be a huge improvement.

To show that it’s perfect, we use the following lemma, which is a significant and interesting result in its own right called the linear independence of characters.

Lemma 6 Let \(G\) be a group, \(K\) be a field, and \(\{\chi_i\}\) a collection of distinct multiplicative homomorphisms from \(G\) to \(K^\times\) (these are called characters). The \(\chi_i\) are linearly independent over \(K\), meaning

\[\sum_i a_i \chi_i = 0\]

as a function from \(G\) to \(K\) implies all the \(a_i\) are zero. (the sum is finitely supported)

Proof. Note that a single \(\chi\) is linearly independent, since \(\chi(\id) = 1 \neq 0\).

Now take a linear dependence with the smallest possible number of terms. Reorganize it as

\[\sum_{i=1}^n a_i \chi_i(g) = 0\]

for all \(g\). Minimality ensures that every proper subset of \(\chi_1,...,\chi_n\) is linearly independent, and moreover that the \(a_i\) are all nonzero so we may assume \(a_n = 1\).

Since the maps are distinct, we know \(\chi_1\neq \chi_n\) and hence can find an \(h\) such that \(\chi_n(h) \neq \chi_1(h)\). Make a substitution

\[0 = \sum_{i=1}^n a_i \chi_i(hg) = \sum_{i=1}^n a_i \chi_i(h) \chi_i(g).\]

Now take the first expression and subtract \(\chi_n(h)\) times it from this one to obtain

\[ 0 = \sum_{i=1}^n \left(a_i \chi_i(h) \chi_i(g) - a_i \chi_n(h) \chi_i(g)\right) = \sum_{i=1}^n (a_i \chi_i(h) - a_i \chi_n(h)) \chi_i(g),\]

a new linear dependence with coefficients \(a_i\chi_i(h) - a_i\chi_n(h)\). However, our choices ensure that the coefficient of \(\chi_n\) vanishes while the coefficient of \(\chi_1\), namely \(a_1(\chi_1(h) - \chi_n(h))\), does not, so this is a strictly smaller nontrivial linear dependence, contrary to our assumption of minimality.

Corollary 6 The trace pairing is perfect.

Proof. We want to show that for a fixed nonzero \(y\), there is some \(x\) such that \(Tr(xy) \neq 0\). As a function of \(x\),

\[Tr(xy) = \sum_\sigma (xy)^\sigma = \sum y^\sigma (x)^\sigma.\]

The right hand side is a nonzero \(\tilde K\)-linear combination of the characters \(x\to x^\sigma\) from \(K^\times\) to \(\tilde K^\times\). So by Lemma 6 this function is nonzero, meaning there’s an \(x\) so that \(Tr(xy) \neq 0\).
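Concretely, perfection of the pairing means that the Gram matrix \(\big(Tr(e_ie_j)\big)\) of a basis is invertible. The sketch below computes it for the assumed example \(K = \mathbb Q(t)\) with \(t^3 = 2\) and basis \(1, t, t^2\); the field and the function names are chosen only for illustration.

```python
BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]  # coordinates of 1, t, t^2

def multiply(alpha, beta):
    """Product in Q(t), t^3 = 2, of elements given as coefficients of (1, t, t^2)."""
    a, b, c = alpha
    d, e, f = beta
    return (a * d + 2 * (b * f + c * e),
            a * e + b * d + 2 * c * f,
            a * f + b * e + c * d)

def trace(alpha):
    """Trace of multiplication by alpha: sum the diagonal of its matrix."""
    return sum(multiply(alpha, e)[i] for i, e in enumerate(BASIS))

def gram_matrix():
    """Matrix of the trace pairing <x, y> = Tr(xy) in the basis 1, t, t^2."""
    return [[trace(multiply(x, y)) for y in BASIS] for x in BASIS]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))
```

The determinant comes out to \(-108\), which is nonzero, and happens to equal \(-4a^3 - 27b^2\) for \(x^3 - 2\).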

We also get a nice criterion for linear independence

Lemma 7 Let \(K/k\) be a finite extension and \(\sigma_1,...,\sigma_n\) its embeddings into \(\tilde K\). If \(\alpha_1,...,\alpha_n \in K\) is a basis for \(K\) over \(k\), then the vectors

\[v_i = (\alpha_1^{\sigma_i}, \alpha_2^{\sigma_i}, ... , \alpha_n^{\sigma_i})\]

are a basis for \(\tilde K^n\) over \(\tilde K\).

Proof. Suppose the \(v_i\) were \(\tilde K\)-linearly dependent. Then one could find \(\beta_i \in \tilde K\), not all zero, such that \[0 = (\sum_i \beta _i v_i)_j = \sum_i \beta_i\alpha_j^{\sigma_i} \] for every \(j\).

Therefore, the sum of characters \(\sum_i \beta_i ( \cdot ) ^{\sigma_i}\) vanishes on all the \(\alpha_j\). Not only that, this map is \(k\)-linear, so it vanishes on all of \(K\), and by independence of characters this implies all the \(\beta_i\) are zero, a contradiction.

A (somewhat distant) consequence of this lemma is the normal basis theorem, which says that a Galois extension \(K/k\) has an element \(\alpha\) such that the conjugates of \(\alpha\) also form a \(k\)-basis. In other words, one can make a change of basis so that the Galois group acts as permutation matrices on \(K\) as a \(k\)-vector space.

The proof of this fact uses some ring-theoretic tools that are beyond the scope of the prerequisites of this course, but it’s good to know it is available. For example, when you have a normal basis, fixed fields come right out of traces:

Theorem 11 Let \(K/k\) be Galois with Galois group \(G\). Suppose there is some \(\alpha \in K\) such that the conjugates of \(\alpha\) are a basis for \(K\) over \(k\).

Then if \(H\) is a subgroup of \(G\), the fixed field of \(H\) is generated by \[\beta = \sum_{h\in H} \alpha^h\]

(which you can think of as “\(\Tr^{e}_H(\alpha)\)” by interpreting the correspondence correctly).

Proof. It’s clear that this sum is in \(K_H\) because applying automorphisms in \(H\) just permutes its terms.

To show that it generates the field extension, we argue by degree: using the fact that the \(G\)-conjugates of \(\alpha\) are linearly independent, it is not difficult to show that the conjugates of \(\beta\) given by choosing coset representatives for \(H\) are also linearly independent, and there are \([G:H]\) of them. So the field extension generated by \(\beta\) has degree at least \([G:H] = [K_H:k]\), hence exactly that degree and the extensions coincide.

In reality, it can be difficult to compute a normal basis.

Radical Extensions

We’ve proven that an extension which is expressible by radicals has a Galois closure with a solvable Galois group as follows:

  1. The Galois closure is expressible by radicals (straightforward)
  2. From group theory (quotients of solvable are solvable) we can make the Galois closure bigger if necessary
  3. So expand it to include all “relevant” roots of unity, and pass to a “stepwise-simple” radical tower.
  4. Any tower gives rise to a sequence of subgroups, which is a good start…
  5. If necessary, rearrange to put the roots of unity first
  6. Check that each step is normal (roots of unity or \(x^n-c\) with the \(n\)th roots of unity in the ground field).
  7. Check that each step is abelian (subgroups of \((\mathbb Z/n\mathbb Z)^*\) resp \((\mathbb Z/n\mathbb Z)^+\))… so the tower witnesses solvability.

Now we want to show that if the Galois group is solvable, then the extension is expressible by radicals.

Theorem 12 Let \(K/k\) be a Galois extension of fields with Galois group \(G\). Assume that the characteristic of \(k\) doesn’t divide \(|G|\). If the Galois group \(G\) is solvable, then \(K/k\) is expressible by radicals.

Proof. First, we adjoin \(\zeta\), a primitive \(|G|\)th root of unity. As we’ve seen many times now, the Galois group of \(K(\zeta)/k(\zeta)\) embeds into the Galois group of \(K/k\) and hence remains solvable.

Roots of unity are allowed, so this will be the first step in our radical tower and we may now assume that \(k\) contains all roots of unity of orders dividing \(|G|\), and in particular all \(p\)th roots of unity for \(p\) dividing \(|G|\).

Next, by our group theory we may take a stepwise normal series for \(G\) with cyclic quotients of prime order. Those orders divide \(|G|\). Taking fixed fields, this series gives rise to a tower of extensions \(K_{i+1}/K_i\) where each extension is cyclic of some prime order \(p\).

This reduces the problem to the following lemma, which says that when \(K/k\) is cyclic of order \(p\) and \(k\) contains the \(p\)th roots of unity, then \(K=k(\alpha)\) for some \(\alpha\) whose minimal polynomial is of the form \(x^p - a\).

The lemma is the field-theoretic core of the theorem. This is a special case of what’s known as “Hilbert’s Theorem 90”.

Lemma 8 Let \(K/k\) be a Galois extension with cyclic Galois group of order \(n\) not divisible by the characteristic of \(k\). Assume \(k\) contains all \(n\)th roots of unity. Then \(K=k(\alpha)\) and \(\alpha\) is a root of \(x^n - a\) for some \(a\in k\).

Proof. Let \(\zeta\) be a primitive \(n\)th root of unity in \(k\). As we’ve seen before, simple radical extensions like this have a Galois group generated by \(\sigma:\alpha \mapsto \zeta \alpha\). Conversely, if we find some nonzero \(\alpha\) so that \(\alpha^\sigma = \zeta \alpha\), where \(\sigma\) is a generator of the Galois group, then its minimal polynomial must be of this form. So we’ll look for \(\alpha\) directly.

Consider the following linear combination of characters: \[S = \zeta^n\id + \zeta^{n-1} \sigma + \zeta^{n-2} \sigma^2 + ... + \zeta \sigma^{n-1}\]

Since \(\zeta\) is in the ground field, this transforms nicely with respect to \(\sigma\) because \(\zeta^n = 1\) and \(\sigma^n = \id\),

\[\begin{align*} S^\sigma &= \zeta^n \sigma + \zeta^{n-1}\sigma^2 + ... + \zeta \sigma^n \\ &= \zeta^n \sigma + \zeta^{n-1}\sigma^2 + ... + \zeta \id \\ &= \zeta S \end{align*}\]

Therefore \(S(\beta)^\sigma = \zeta S(\beta)\) for any \(\beta \in K\), and so \(\alpha = S(\beta)\) is the element we want… we just need to be sure that \(S(\beta)\neq 0\). But that’s no problem, because \(S\) is a linear combination of characters, and therefore cannot be identically zero!

Different variations on \(S\) appear in other places; I like this one because it is homogeneous in the exponents.
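Here is a numerical sketch of \(S\) in a small case, under assumptions chosen for the example: \(n = 3\), \(k = \mathbb Q(\omega)\) with \(\omega\) a primitive cube root of unity, \(K = k(t)\) with \(t^3 = 2\), and \(\sigma: t \mapsto \omega t\). Elements of \(K\) are stored as coefficient triples for \(1, t, t^2\) with complex floating-point entries; the names are illustrative.

```python
import cmath

OMEGA = cmath.exp(2j * cmath.pi / 3)  # primitive cube root of unity, assumed to lie in k

def sigma(beta):
    """Generator of Gal(K/k): t -> OMEGA * t, acting on coefficients of (1, t, t^2)."""
    c0, c1, c2 = beta
    return (c0, OMEGA * c1, OMEGA**2 * c2)

def resolvent(beta):
    """S(beta) = zeta^3 beta + zeta^2 beta^sigma + zeta beta^(sigma^2), with zeta = OMEGA."""
    b0 = beta
    b1 = sigma(b0)
    b2 = sigma(b1)
    return tuple(x + OMEGA**2 * y + OMEGA * z for x, y, z in zip(b0, b1, b2))

def close(u, v, tol=1e-9):
    """Compare two coefficient triples up to floating-point error."""
    return all(abs(x - y) < tol for x, y in zip(u, v))
```

With \(\beta = 1 + t + t^2\) one finds \(S(\beta) = 3t\) up to rounding, and applying \(\sigma\) multiplies it by \(\omega\), so \(\alpha = 3t\) satisfies \(\alpha^3 = 54 \in k\).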

Number Theory!!!!

Here are a few neat connections to number theory. The proofs are beyond the scope of this course, but hopefully you’ll get to take algebraic number theory some time soon.

Linear Algebra

We’ll use the following handy lemma a few times.

Lemma 9 Let \(V\) be a vector space over an infinite field \(K\). Then \(V\) is not the union of a finite number of proper subspaces.

Proof. Suppose, for contradiction, that \(V\) can be written as such a union, and let \(W_1,..., W_n\) be a minimal collection of proper subspaces of \(V\) whose union is \(V\). Enlarging \(W_1\) if necessary, we may further assume that \(W_1\) has codimension \(1\).

By minimality, we can take some \(w\in W_1\) which is not in \(W_j\) for any \(j\neq 1\). By properness of \(W_1\), we can also find some \(v\) not in \(W_1\). Consider sums \[v + aw\] for \(a\in K\). Since \(K\) is infinite, there are infinitely many such sums, and because \(v\) is not in \(W_1\), none of them are in \(W_1\), so some \(W_j\) with \(j \neq 1\) contains \(v+aw\) and \(v+bw\) for some \(a\neq b\)… but then \((a-b)w\) is in \(W_j\), hence \(w\) is in \(W_j\), a contradiction!
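The hypothesis that \(K\) is infinite cannot be dropped: over a finite field \(\FF_p\), the plane \(\FF_p^2\) is the union of its \(p+1\) lines through the origin. The sketch below enumerates them; the function name is illustrative.

```python
from itertools import product

def lines_through_origin(p):
    """All 1-dimensional subspaces of F_p^2 (p prime), each as a frozenset of points."""
    lines = set()
    for (x, y) in product(range(p), repeat=2):
        if (x, y) != (0, 0):
            # The line spanned by (x, y) consists of all scalar multiples.
            lines.add(frozenset(((c * x) % p, (c * y) % p) for c in range(p)))
    return lines
```

For \(p = 2\) this returns three proper subspaces whose union is all four points of \(\FF_2^2\).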

Corollary 7 If an extension \(K/k\) has only finitely many sub-extensions, then \(K/k\) is a primitive extension.

Conversely, if \(K/k\) is primitive, then it has finitely many subfields.

Proof. There is nothing to prove if \(k\) is finite, so assume it is infinite.

A proper subextension field is also a proper subspace of \(K\) as a vector space over \(k\). If there are only finitely many proper subextensions, then by Lemma 9 their union cannot cover \(K\). So there is some \(\alpha\) in \(K\) not contained in any of those proper subextensions, and so \(k(\alpha)\) must be all of \(K\).

In the other direction, suppose \(K = k(\alpha)\) with minimal polynomial \(f(x)\). Let \(L\) be an intermediate extension and let \(g(x)\) be the minimal polynomial of \(\alpha\) over \(L\). Since \(K/k\) must be finite, so too is \(K/L\). Observe that the extension \(L'\) obtained by adjoining the coefficients of \(g\) to \(k\) is a subfield of \(L\), but the degree of \(\alpha\) over \(L'\) is at most that of \(\alpha\) over the larger field \(L\), so \(L=L'\). But there are only finitely many such \(L'\) because \(g\) is a monic divisor of \(f\) in \(K[x]\), and \(f(x)\) has finitely many monic divisors.

Glossary and Conventions

Here is a summary of the main terms, plus some notational conventions.

  1. Rings are as usual, and we allow the zero ring, in which \(0=1\). Almost all rings are commutative. Common letters are \(R\) (for “ring”) and \(A\) (for “anneau” in French).

  2. Fields are commutative rings whose nonzero elements form a group under multiplication – so \(0\neq 1\) in every field. Typical letters are \(k\) and \(K\) (from “Körper” in German) as well as \(E,L\).

  3. An extension of fields is denoted \(K/k\) and means that \(k,K\) are fields and \(K\) contains \(k\). A subextension is a subfield \(L\) of \(K\) which also contains \(k\). Many definitions involving extensions will include the words “over \(k\)”, and if something is missing those words and doesn’t make sense, it was probably an accidental omission.

  4. If \(K/k\) and \(L/k\) are extensions, an embedding of \(K\) into \(L\) over \(k\) is a field homomorphism from \(K\) to \(L\) which fixes \(k\) pointwise. Every field homomorphism is injective (in particular, they preserve degrees and are isomorphisms onto their images).

  5. Multiple extensions, like \(K/E/k\) or \(L/K/E/k\), are called towers. Degrees multiply in towers.

  6. If \(E\) and \(K\) are subfields of some larger field, their compositum \(EK\) is the smallest subfield containing both. If, moreover, \(E\) and \(K\) are both extensions of \(k\), then \(EK/K/k\) and \(EK/E/k\) are both towers and the situation can be drawn as a “diamond diagram”.

  7. Many inductions proceed by splitting \(L/k\) into a tower \(L/K/k\). This is (sometimes) called dévissage.

  8. Polynomial rings over a ring \(R\) are written \(R[x]\). Keep in mind that these are formal polynomials. It’s not the same as the ring of polynomial functions from \(R\) to \(R\).

  9. Polynomial rings are the natural domain for evaluation homomorphisms. If \(\phi:R\to S\) is a ring homomorphism, then it induces a map \(R[x]\to S[x]\) sending \(f(x)\) to \(f^\phi(x)\) by applying \(\phi\) to each coefficient. If you’re further given some \(s\in S\), the evaluation homomorphism is the map \(\ev_{\phi,s}:R[x] \to S\) given by \[f(x) \mapsto f^\phi(s).\] In words, “\(\ev_{\phi,s}\) is the evaluation homomorphism from \(R[x]\) to \(S\) at \(s\) over \(\phi\)”. The prepositional phrases can go in any order. The first one may be dropped when clear from context. Also, it’s often the case that \(R\subseteq S\) and \(\phi\) is the inclusion, in which case we just say “the evaluation homomorphism at \(s\)”. The image is written \(R[s]\) in that case, or \(\phi(R)[s]\) more generally.

  10. An extension \(K/k\) allows us to view \(K\) as a \(k\)-vector space. The degree, denoted \([K:k]\), is the dimension of this vector space, and we say an extension is finite if the degree is finite.

  11. We call an element \(\alpha \in K\) from an extension \(K/k\) algebraic if \(\alpha\) is a root of a nonzero polynomial in \(k[x]\) – equivalent to \(k(\alpha)/k\) being finite, or \(k(\alpha) = k[\alpha]\). A non-algebraic element is called transcendental (over \(k\)). The extension \(K/k\) is called algebraic if every element in it is algebraic (over \(k\)).

  12. We saw that \(k[x]\) is a PID, so an irreducible polynomial \(f(x)\) generates a maximal ideal, and \(k[x]/(f(x))\) can be viewed as a field extension of \(k\) in which \(f\) has a root.

  13. If \(\alpha \in K/k\) is algebraic, its minimal polynomial (over \(k\)), written \(\irr_{k,\alpha}\), is the (unique) nonzero monic polynomial \(f(x)\) in \(k[x]\) of least degree such that \(f(\alpha) = 0\). Minimal polynomials are irreducible, and every monic irreducible polynomial in \(k[x]\) is a minimal polynomial (of any of its roots in some field extension). The minimal polynomial divides every \(f(x)\) in \(k[x]\) which has \(\alpha\) as a root. The other roots of \(\irr_{k,\alpha}\) are called conjugates of \(\alpha\).

  14. There are a few lifting lemmas based on the isomorphisms \(k(\alpha) \cong k[x]/(f(x)) \cong k(\alpha')\) over \(k\), where \(\alpha\) and \(\alpha'\) are conjugate with minimal polynomial \(f(x)\). Know these.

  15. A polynomial in \(K[x]\) is said to split if it factors completely into linear factors. An extension \(K/k\) is called a splitting field over \(k\) if there is a family of polynomials \(\{g_\alpha(x)\}\) in \(k[x]\) which split in \(K\) and not in any proper subextension. We almost always restrict to finite extensions and finite families – the infinite cases often require technical fiddling with Zorn’s lemma.

  16. An extension \(K/k\) is called normal if it satisfies any of conditions (N1), (N2), (N2’), (N3); we won’t construct algebraic closures, so technically we only define finite normal extensions and ignore (N2’). Condition (N1) is just being a (finite) splitting field, and condition (N2) is called the embedding property (for normal extensions) and says that every re-embedding of a normal extension comes from an automorphism. Splitting fields of a given family of polynomials are unique up to isomorphism over \(k\).

  17. One can show that the intersection and compositum of normal extensions of \(k\) are again normal. The normal hull (sometimes “normal closure”) of an extension \(K/k\) is the smallest normal extension of \(k\) containing \(K\): fix a normal extension \(L/k\) containing \(K\), and the normal hull is the intersection of all normal \(E/k\) inside \(L\) which have \(K/k\) as a subextension.

  18. Given some algebraic \(\alpha \in K/k\), we call it separable if its minimal polynomial \(f(x)\) has no repeated roots. Equivalently, \(f(x)\) and \(f'(x)\) have no common roots, which can be tested by verifying \(f'(x) \neq 0\). Otherwise, it is called inseparable.

  19. Given an extension \(K/k\) with at least one embedding into a normal extension \(L/k\), the separable degree of \(K/k\), denoted \([K:k]_s\), is the number of embeddings of \(K\) into \(L\) over \(k\). This does not depend on \(L\). The embeddings of \(k(\alpha)\) into any normal extension over \(k\) are given by \(\alpha \mapsto \alpha'\) as \(\alpha'\) varies over the \(k\)-conjugates of \(\alpha\). Therefore, \([k(\alpha):k]_s\) is the number of distinct roots of \(\irr_{k,\alpha}\) and hence \([k(\alpha):k]_s = [k(\alpha):k]\) when \(\alpha\) is separable; in general \([K:k]_s \leq [K:k]\). Separable degrees multiply in towers.

  20. A finite extension \(K/k\) is called separable if it satisfies any of conditions (S1), (S2), (S2’), (S3). As above, we avoid (S2’) because we haven’t constructed algebraic closures. Condition (S2) is called the embedding property (for separable extensions) and says that they have as many embeddings as possible. The implication (S3) implies (S1) is called the primitive element theorem.

  21. For an extension \(K/k\), a homomorphism \(\sigma: K \to K\) is called an automorphism over \(k\) if it is bijective and fixes \(k\) elementwise. The set of automorphisms of \(K\) over \(k\) is denoted \(\Aut(K/k)\), and forms a group under composition. We write the action of \(\Aut(K/k)\) on elements of \(K\) with exponentiation, meaning a right action: \[\alpha^\sigma = \sigma(\alpha)\] for \(\alpha \in K\) and \(\sigma \in \Aut(K/k)\).

  22. A finite extension \(K/k\) is called Galois if it is both normal and separable. In this case, \(\Aut(K/k)\) is called the Galois group of \(K\) over \(k\), and usually written as \(\Gal(K/k)\) instead. Normality and separability tell us that \(|\Gal(K/k)| = [K:k]\). The LL and NLL tell us that for any \(\alpha\) in \(K\), \(\Gal(K/k)\) acts transitively on its conjugates.

  23. The Fundamental Theorem of Galois Theory has two parts. Let \(K/k\) be Galois with Galois group \(G\). Part I of the FTGT says there is a one-to-one order-reversing correspondence between subextensions of \(K/k\) and subgroups of \(G\) as follows:

    • \(L \mapsto \Gal(K/L)\), the subgroup of elements in \(G\) that fix \(L\) pointwise,
    • \(H \mapsto K_H\), the subfield of \(K\) fixed pointwise by every element in \(H\).

    Part II of the FTGT describes how automorphisms move subfields/subgroups. Using Part I, the following two statements of Part II are equivalent: \[\begin{align*} K_{H^\sigma} &= (K_H)^\sigma\\ \Gal(K/L)^\sigma &= \Gal(K/L^\sigma) \end{align*}\] where \(H^\sigma\) means the conjugate subgroup \(\sigma\inv H \sigma\) and \(L^\sigma\) means the field obtained by applying \(\sigma\) to every element of \(L\) (as usual, we use right actions). In particular, a normal subgroup of \(G\) corresponds to a normal extension \(L\) of \(k\) contained in \(K\), and moreover \(\Gal(L/k) \cong \Gal(K/k)/\Gal(K/L)\) in this case.

  24. Given a separable extension \(K/k\), contained in some Galois extension \(L/k\), we can see that the \([K:k]\) distinct embeddings of \(K\) into \(L\) over \(k\) are in 1-1 correspondence with the cosets of \(\Gal(L/K)\) inside \(\Gal(L/k)\). This is often applied when \(L=\tilde K\) is the Galois closure of \(K\) over \(k\). The Galois closure is just the compositum of all the conjugates of \(K\) inside \(L\) (or any other Galois extension containing \(K\)).

  25. Continuing the above, automorphisms of \(K\) over \(k\) come from special embeddings of \(K\) into \(L\) over \(k\), those which take \(K\) to itself. These are in correspondence with the cosets of \(\Gal(L/K)\) inside its normalizer.
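
To make items 24 and 25 concrete, here is a standard worked example (an illustration, not taken from the book). Take \(k = \mathbb Q\), \(K = \mathbb Q(\sqrt[3]2)\) and \(L = \mathbb Q(\sqrt[3]2, \omega)\), where \(\omega\) is a primitive cube root of unity. Then \(L\) is the splitting field of \(x^3-2\), hence the Galois closure of \(K\) over \(\mathbb Q\), and \[[L:\mathbb Q] = 6, \qquad \Gal(L/\mathbb Q) \cong S_3, \qquad |\Gal(L/K)| = [L:K] = 2.\] The subgroup \(\Gal(L/K)\) therefore has \(6/2 = 3\) cosets in \(\Gal(L/\mathbb Q)\), matching the \([K:\mathbb Q] = 3\) embeddings of \(K\) into \(L\): \[\sqrt[3]2 \mapsto \sqrt[3]2, \qquad \sqrt[3]2 \mapsto \omega\sqrt[3]2, \qquad \sqrt[3]2 \mapsto \omega^2\sqrt[3]2.\] A subgroup of order \(2\) in \(S_3\) is its own normalizer, so only the trivial coset lies in the normalizer and \(\Aut(K/\mathbb Q)\) is trivial. This agrees with the direct observation that \(K \subseteq \mathbb R\) contains neither \(\omega\sqrt[3]2\) nor \(\omega^2\sqrt[3]2\), so only the first embedding takes \(K\) to itself.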