11

Formally Real Fields

In Chapter 5 of Basic Algebra I we studied polynomial equations, inequations, and inequalities in a real closed field. We defined such a field to be an ordered field in which every odd degree polynomial in one indeterminate has a root and every positive element has a square root. We proved that if $R$ is real closed, then $R(\sqrt{-1})$ is algebraically closed. The main problem we considered was that of developing an algorithm for testing the solvability of a system of polynomial equations, inequations, and inequalities in several unknowns in a real closed field. In the case of a single polynomial equation in one unknown, the classical method of J. C. F. Sturm provides a solution to this problem. We gave this method and developed a far-reaching extension of the method to the general case of systems in several unknowns. A consequence of this (also indicated in BAI) is Tarski’s theorem, which states roughly that a system of polynomial equations, inequations, and inequalities that has a solution in one real closed field has a solution in every such field (see BAI, p. 340, for the precise statement).

We now resume the study of real closed fields, but we approach these from a different point of view, that of formally real fields as defined by Artin and Schreier. The defining property for such a field is that $-1$ is not a sum of squares in the field, or equivalently, if $\sum a_i^2 = 0$ for $a_i$ in the field, then every $a_i = 0$. It is clear that any ordered field is formally real. On the other hand, as we shall see, every formally real field can be ordered. Hence a field is formally real if and only if it can be ordered.

The Artin-Schreier theory of formally real fields is an essential element in Artin’s solution of one of the problems posed by Hilbert at the 1900 Paris International Congress of Mathematicians. This was Hilbert’s seventeenth problem, which concerned positive semi-definite rational functions: Suppose f(x1,…,xn) is a rational expression in indeterminates xi with real coefficients such that f(a1,…,an) ≥ 0 for all (a1, …,an) where f is defined. Then is f necessarily a sum of squares of rational expressions with real coefficients? Artin gave an affirmative answer to this question in 1927. A new method of proving this result based on model theory was developed by A. Robinson in 1955. We shall give an account of Artin’s theorem. Our approach is essentially model theoretic and is based on the theorem of Tarski mentioned earlier.

Artin’s theorem gives no information on the number of squares needed to express a given $f(x_1, \ldots, x_n)$. Hilbert had shown in 1893 that any positive semi-definite rational function in two variables is expressible as a sum of four squares. In 1966 in an unpublished paper, J. Ax showed that any positive semi-definite function in three variables is a sum of eight squares and he conjectured that for such functions of $n$ variables, $2^n$ squares are adequate. This was proved in 1967 by A. Pfister by a completely novel and ingenious method. We shall give his proof.

We shall conclude this chapter with a beautiful characterization of real closed fields due to Artin and Schreier. It is noteworthy that the proof of this theorem initiated the study of cyclic fields of $p^e$ dimensions over fields of characteristic $p$.

11.1   FORMALLY REAL FIELDS

We recall that a field $F$ is said to be ordered if there is given a subset $P$ (the set of positive elements) in $F$ such that $P$ is closed under addition and multiplication and $F$ is the disjoint union of $P$, $\{0\}$, and $-P = \{-p \mid p \in P\}$ (see BAI, p. 307). Then $F$ is totally ordered if we define $a > b$ to mean $a - b \in P$. Moreover, if $a > b$, then $a + c > b + c$ for every $c$ and $ap > bp$ for every $p \in P$. If $a \ne 0$, then $a^2 = (-a)^2 > 0$. Hence if $\sum a_i^2 = 0$ in $F$, then every $a_i = 0$. This is equivalent to: $-1$ is not a sum of squares in $F$. It follows that the characteristic of $F$ is 0.

Following Artin and Schreier we now introduce the following

DEFINITION 11.1.   A field F is called formally real if – 1 is not a sum of squares in F.
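For contrast, a field of prime characteristic is never formally real. The following is a minimal sketch, assuming the field is the prime field of integers modulo a prime $p$ and using a brute-force search; the function name `minus_one_as_two_squares` is hypothetical and introduced only for illustration.

```python
def minus_one_as_two_squares(p):
    """Return (a, b) with a*a + b*b congruent to -1 modulo the prime p.

    Brute-force search over 0 <= a <= b < p. Such a pair exists for every
    prime p, so the integers modulo p are not formally real.
    """
    for a in range(p):
        for b in range(a, p):
            if (a * a + b * b + 1) % p == 0:
                return a, b
    return None  # not reached when p is prime


if __name__ == "__main__":
    for p in (2, 3, 5, 7, 11, 13):
        print(p, minus_one_as_two_squares(p))
```

For $p = 7$ the search returns $(2, 3)$, reflecting $2^2 + 3^2 = 13 \equiv -1 \pmod 7$.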

It is clear that any field that can be ordered is formally real. The converse of this observation is a theorem of Artin and Schreier. We shall now prove this by an argument due to Serre that is based on the following

LEMMA 1.   Let $P_0$ be a subgroup of the multiplicative group $F^*$ of a field $F$ such that $P_0$ is closed under addition and contains all non-zero squares. Suppose that $a$ is an element of $F^*$ such that $-a \notin P_0$. Then $P_1 = P_0 + P_0 a = \{b + ca \mid b, c \in P_0\}$ is a subgroup of $F^*$ closed under addition.

Proof.   Evidently $P_1$ is closed under addition, and if $b_i, c_i \in P_0$ for $i = 1, 2$, then $(b_1 + c_1 a)(b_2 + c_2 a) = (b_1 b_2 + c_1 c_2 a^2) + (b_1 c_2 + b_2 c_1)a \in P_1$ since $b_1 b_2 + c_1 c_2 a^2$ and $b_1 c_2 + b_2 c_1 \in P_0$. We note next that $P_1$ does not contain 0, since otherwise we have $b + ca = 0$ for $b, c \in P_0$, which gives $-a = bc^{-1} \in P_0$, contrary to the hypothesis on $a$. Also we have

$$(b + ca)^{-1} = b(b + ca)^{-2} + c(b + ca)^{-2}a \in P_1$$

since $b(b + ca)^{-2}$ and $c(b + ca)^{-2} \in P_0$. Hence $P_1$ is a subgroup of $F^*$. □

Now let $F$ be formally real and let $P_0$ be the set of sums $\sum a_i^2$ with every $a_i \ne 0$. Since $F$ is formally real, $0 \notin P_0$. Evidently $P_0$ is closed under addition, and since $(\sum_i a_i^2)(\sum_j b_j^2) = \sum_{i,j}(a_i b_j)^2$, $P_0$ is closed under multiplication. Moreover, $P_0$ contains all of the non-zero squares and hence if $a = \sum a_i^2$, $a_i \ne 0$, then $a^{-1} = a a^{-2} \in P_0$. Thus $P_0$ satisfies the conditions of the lemma and so the set of subsets $P'$ satisfying these conditions is not vacuous. We can apply Zorn’s lemma to conclude that this set of subsets of $F$ contains a maximal element $P$. Then it follows from the lemma that if $a$ is any element of $F^*$, then either $a$ or $-a \in P$. Hence $F = P \cup \{0\} \cup -P$ where $-P = \{-p \mid p \in P\}$, and since $0 \notin P$ and $P$ is closed under addition, $P \cap -P = \emptyset$ and $0 \notin -P$. Thus $P$, $\{0\}$, $-P$ are disjoint and since $P$ is closed under addition and multiplication, $P$ gives an ordering of the field $F$. We therefore have the following

THEOREM 11.1.   A field F can be ordered (by a subset P) if and only if it is formally real.

In BAI (p. 308) we defined a field R to be real closed if it is ordered and if (1) every positive element of R has a square root in R and (2) every polynomial of odd degree in one indeterminate with coefficients in R has a root in R. We showed that the ordering in a real closed field R is unique and that any automorphism of such a field is an order automorphism (Theorem 5.1, p. 308). Moreover, we proved an extension of the so-called fundamental theorem of algebra: If R is real closed, then $R(\sqrt{-1})$ is algebraically closed (Theorem 5.2, p. 309). We shall now give a characterization of real closed fields in terms of formal reality:

THEOREM 11.2.   A field R is real closed if and only if R is formally real and no proper algebraic extension of R is formally real.

We separate off from the proof the following

LEMMA 2.   If $F$ is formally real, then any extension field $F(r)$ is formally real if either $r = \sqrt{a}$ for $a > 0$ in $F$ or $r$ is algebraic over $F$ with minimum polynomial of odd degree.

Proof.   First, let $r = \sqrt{a}$, $a > 0$. Suppose that $F(r)$ is not formally real. Then $r \notin F$ and we have $a_i, b_i \in F$ such that $-1 = \sum (a_i + b_i r)^2$. Since 1 and $r$ are linearly independent over $F$, this gives $\sum a_i^2 + \sum b_i^2 a = -1$. Since $a > 0$, this is impossible.

In the second case let $f(x)$ be the minimum polynomial of $r$. Then the degree of $f(x)$ is odd. We shall use induction on $m = \deg f(x)$. Suppose that $F(r)$ is not formally real. Then we have polynomials $g_i(x)$ of degree $< m$ such that $\sum g_i(r)^2 = -1$. Hence we have $\sum g_i(x)^2 = -1 + f(x)g(x)$ where $g(x) \in F[x]$. Since $F$ is formally real and the leading coefficient of $g_i(x)^2$ is a square, it follows that $\deg(-1 + f(x)g(x)) = \deg(\sum g_i(x)^2)$ is even and $< 2m$. It follows that $\deg g(x)$ is odd and $< m$. Now $g(x)$ has an irreducible factor $h(x)$ of odd degree. Let $s$ be a root of $h(x)$ and consider $F(s)$. By the induction hypothesis, this is formally real. On the other hand, substitution of $s$ in the relation $\sum g_i(x)^2 = -1 + f(x)g(x)$ gives the contradiction $\sum g_i(s)^2 = -1$. □

We can now give the

Proof of Theorem 11.2. Suppose that $R$ is real closed. Then $C = R(\sqrt{-1})$ is algebraically closed and $C \supsetneq R$. Evidently $C$ is an algebraic closure of $R$ and so any algebraic extension of $R$ can be regarded as a subfield of $C/R$. Hence if it is a proper extension it must be $C$, which is not formally real since it contains $\sqrt{-1}$.

Next suppose that $R$ is formally real and no proper algebraic extension of $R$ has this property. Let $a \in R$ be positive. Then Lemma 2 shows that $R(\sqrt{a})$ is formally real. Hence $R(\sqrt{a}) = R$ and $\sqrt{a} \in R$. Next let $f(x)$ be a polynomial of odd degree with coefficients in $R$. Let $g(x)$ be an irreducible factor of $f(x)$ of odd degree and consider an extension field $R(r)$ where $g(r) = 0$. By Lemma 2, $R(r)$ is formally real. Hence $R(r) = R$. Then $r \in R$ and $f(r) = 0$. We have therefore verified the two defining properties of a real closed field. Hence $R$ is real closed. □

We shall show next that the basic property of a real closed field $R$ that $\sqrt{-1} \notin R$ and $C = R(\sqrt{-1})$ is algebraically closed characterizes these fields.

THEOREM 11.3.   A field $R$ is real closed if and only if $\sqrt{-1} \notin R$ and $C = R(\sqrt{-1})$ is algebraically closed.

Proof.   It suffices to show that if $R$ has the stated properties, then $R$ is real closed. Suppose that $R$ satisfies the conditions. We show first that the sum of two squares in $R$ is a square. Let $a, b$ be non-zero elements of $R$ and let $u$ be an element of $C$ such that $u^2 = a + bi$, $i = \sqrt{-1}$. We have the automorphism $x + iy \mapsto x - iy$, $x, y \in R$, of $C/R$ whose set of fixed points is $R$. Writing $\bar{u}$ for the image of $u$ under this automorphism, we have

$$a^2 + b^2 = (a + bi)(a - bi) = u^2\bar{u}^2 = (u\bar{u})^2$$

and $u\bar{u} \in R$. Thus $a^2 + b^2$ is a square in $R$. By induction, every sum of squares in $R$ is a square. Since $-1$ is not a square in $R$, it is not a sum of squares and hence $R$ is formally real. On the other hand, since $C$ is algebraically closed, the first part of the proof of Theorem 11.2 shows that no proper algebraic extension of $R$ is formally real. Hence $R$ is real closed by Theorem 11.2. □
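As a small worked instance of the key step in this proof, take $R = \mathbb{R}$ (an assumption made only for illustration), $a = 3$, $b = 4$. Then $u = 2 + i$ satisfies $u^2 = a + bi$, and the computation reads:

```latex
\begin{aligned}
u^2 &= (2 + i)^2 = 4 + 4i - 1 = 3 + 4i, \\
u\bar{u} &= (2 + i)(2 - i) = 4 + 1 = 5 \in \mathbb{R}, \\
a^2 + b^2 &= 3^2 + 4^2 = 25 = (u\bar{u})^2 .
\end{aligned}
```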

EXERCISES

        1. Show that if F is formally real and the xi are indeterminates, then F(x1,…, xn) is formally real.

        2. Define an ordering for a domain D as for a field: a subset P of D such that P is closed under addition and multiplication and D is the disjoint union of P, {0}, and – P. Show that an ordering in a domain has a unique extension to its field of fractions.

        3. Let $F$ be an ordered field. Show that $F[x]$, $x$ an indeterminate, has an ordering defined by $a_0 x^n + a_1 x^{n-1} + \cdots > 0$ if $a_0 > 0$.

        4. Call an ordered field $F$ archimedean-ordered if for any $a > 0$ in $F$ there exists a positive integer $n$ such that $n\,(= n1) > a$. Show that the field $F(x)$, $x$ an indeterminate, ordered by using exercises 2 and 3 is not archimedean.

        5. Prove that any archimedean-ordered field is order isomorphic to a subfield of $\mathbb{R}$.

        6. (T. Springer.) Let $Q$ be an anisotropic quadratic form on a finite dimensional vector space $V$ over a field $F$ of characteristic $\ne 2$. Let $E$ be an odd dimensional extension field of $F$ and let $Q_E$ be the extension of $Q$ to a quadratic form on $V_E$. Show that $Q_E$ is anisotropic. (Hint: It suffices to assume that $E = F(\rho)$. Prove the result by induction on $[E : F]$ using an argument like that in the second part of the proof of Lemma 2.)

11.2   REAL CLOSURES

DEFINITION 11.2.   Let F be an ordered field. An extension field R of F is called a real closure of F if (1) R is real closed and algebraic over F and (2) the (unique) order in R is an extension of the given order in F.

A central result in the Artin-Schreier theory of formally real fields is the existence and uniqueness of a real closure for any ordered field $F$. For the proof of this result we shall make use of Sturm’s theorem (BAI, p. 312), which permits us to determine the number of roots in a real closed field $R$ of a polynomial $f(x) \in R[x]$. Let $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n$ and $M = 1 + |a_1| + \cdots + |a_n|$ where $|\ |$ is defined as usual. Then every root of $f(x)$ in $R$ lies in the interval $-M < x < M$ (BAI, exercise 4, p. 311). Define the standard sequence for $f(x)$ by

$$f_0(x) = f(x), \qquad f_1(x) = f'(x) \ \text{(formal derivative of } f(x)), \qquad f_{i-1}(x) = q_i(x)f_i(x) - f_{i+1}(x), \quad \deg f_{i+1} < \deg f_i$$

for $i \ge 1$. Then we have an $s$ such that $f_{s+1} = 0$, and by Sturm’s theorem, the number of roots of $f(x)$ in $R$ is $V_{-M} - V_M$ where $V_a$ is the number of variations in sign in $f_0(a), f_1(a), \ldots, f_s(a)$. We can use this to prove the

LEMMA.   Let $R_i$, $i = 1, 2$, be a real closed field, $F_i$ a subfield of $R_i$, $a \mapsto \bar{a}$ an order isomorphism of $F_1$ onto $F_2$ where the order in $F_i$ is that induced from $R_i$. Suppose that $f(x)$ is a monic polynomial in $F_1[x]$, $\bar{f}(x)$ the corresponding polynomial in $F_2[x]$. Then $f(x)$ has the same number of roots in $R_1$ as $\bar{f}(x)$ has in $R_2$.

Proof. If $M$ is as above, then the first number is $V_{-M} - V_M$ and the second is $V_{-\bar{M}} - V_{\bar{M}}$ determined by the standard sequence for $\bar{f}(x)$. Since $a \mapsto \bar{a}$ is an order isomorphism of $F_1$ onto $F_2$, it is clear that these two numbers are the same. □
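The root count $V_{-M} - V_M$ used in the lemma can be sketched in code. The following is a minimal illustration, assuming rational coefficients (so that exact arithmetic is available through Python's `fractions` module) and a monic input given as a coefficient list with the highest degree first; all function names are hypothetical.

```python
from fractions import Fraction


def poly_eval(p, x):
    """Evaluate p (coefficients, highest degree first) at x by Horner's rule."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def derivative(p):
    """Formal derivative of p."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def remainder(a, b):
    """Remainder of a on division by the non-zero polynomial b."""
    a = list(a)
    while len(a) >= len(b):
        q = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= q * b[i]
        a.pop(0)  # the leading coefficient has just been cancelled
    while a and a[0] == 0:  # strip leading zeros
        a.pop(0)
    return a


def standard_sequence(f):
    """f_0 = f, f_1 = f', f_{i+1} = -(remainder of f_{i-1} by f_i)."""
    f = [Fraction(c) for c in f]
    seq = [f, derivative(f)]
    while seq[-1]:
        seq.append([-c for c in remainder(seq[-2], seq[-1])])
    seq.pop()  # drop the final zero polynomial f_{s+1}
    return seq


def variations(seq, x):
    """Number of sign changes in f_0(x), ..., f_s(x), ignoring zeros."""
    vals = [v for v in (poly_eval(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(vals, vals[1:]) if (u > 0) != (v > 0))


def count_real_roots(f):
    """Number of distinct real roots of a monic f, as V_{-M} - V_M."""
    M = 1 + sum(abs(Fraction(c)) for c in f[1:])
    seq = standard_sequence(f)
    return variations(seq, -M) - variations(seq, M)


if __name__ == "__main__":
    print(count_real_roots([1, 0, -2]))     # x^2 - 2
    print(count_real_roots([1, 0, 1]))      # x^2 + 1
    print(count_real_roots([1, 0, -1, 0]))  # x^3 - x
```

Because the computation uses only field operations and sign tests on the coefficients, an order isomorphism of the coefficient fields carries it over step by step, which is the content of the lemma.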

We can now prove the important

THEOREM 11.4.   Any ordered field has a real closure. If F1 and F2 are ordered fields with real closures R1 and R2 respectively, then any order isomorphism of F1 onto F2 has a unique extension to an isomorphism of R1 onto R2 and this extension preserves order.

Proof.   Let $F$ be an ordered field, $\bar{F}$ an algebraic closure of $F$, and let $E$ be the subfield of $\bar{F}$ obtained by adjoining to $F$ the square roots of all the positive elements of $F$. Then $E$ is formally real. Otherwise, $E$ contains elements $a_i$ such that $\sum a_i^2 = -1$. The $a_i$ are contained in a subfield generated over $F$ by a finite number of square roots of positive elements of $F$. Using induction on the first part of Lemma 2 of section 11.1 we see that this subfield is formally real, contrary to $\sum a_i^2 = -1$ with $a_i$ in the subfield. We now consider the set of formally real subfields of $\bar{F}$ containing $E$. This set is inductive, so by Zorn’s lemma, we have a maximal subfield $R$ in the set. We claim that $R$ is real closed. If not, there exists a proper algebraic extension $R'$ of $R$ that is formally real (Theorem 11.2). Since $\bar{F}$ is an algebraic closure of $R$, we may assume that $R' \subset \bar{F}$, so we have $\bar{F} \supset R' \supsetneq R$. This contradicts the maximality of $R$. Hence $R$ is real closed. Now suppose $a \in F$ and $a > 0$. Then $a = b^2$ for $b \in E \subset R$ and hence $a > 0$ in the order defined in $R$. Thus the order in $R$ is an extension of that of $F$ and hence $R$ is a real closure of $F$.

Now let $F_1$ and $F_2$ be ordered fields, $R_i$ a real closure of $F_i$, and let $\sigma: a \mapsto \bar{a}$ be an order isomorphism of $F_1$ onto $F_2$. We wish to show that $\sigma$ can be extended to an isomorphism $\Sigma$ of $R_1$ onto $R_2$. The definition of $\Sigma$ is easy. Let $r$ be an element of $R_1$, $g(x)$ the minimum polynomial of $r$ over $F_1$, and let $r_1 < r_2 < \cdots < r_k = r < \cdots < r_m$ be the roots of $g(x)$ in $R_1$ arranged in increasing order. By the lemma, the polynomial $\bar{g}(x)$ has precisely $m$ roots in $R_2$ and we can arrange these in increasing order as $\bar{r}_1 < \bar{r}_2 < \cdots < \bar{r}_m$. We now define $\Sigma$ as the map sending $r$ into the $k$th one of these roots. It is easy to see that $\Sigma$ is bijective and it is clear that $\Sigma$ is an extension of $\sigma$. However, it is a bit tricky to show that $\Sigma$ is an isomorphism. To see this we show that if $S$ is any finite subset of $R_1$, there exists a subfield $E_1$ of $R_1/F_1$ and a monomorphism $\eta$ of $E_1/F_1$ into $R_2/F_2$ that extends $\sigma$ and preserves the order of the elements of $S$, that is, if $S = \{s_1 < s_2 < \cdots < s_n\}$, then $\eta s_1 < \eta s_2 < \cdots < \eta s_n$. Let $T = S \cup \{\sqrt{s_{i+1} - s_i} \mid 1 \le i \le n - 1\}$ and let $E_1 = F_1(T)$. Evidently, $E_1$ is finite dimensional over $F_1$ and so $E_1 = F_1(w)$. Let $f(x)$ be the minimum polynomial of $w$ over $F_1$. By the lemma, $\bar{f}(x)$ has a root $\bar{w}$ in $R_2$, and we have a monomorphism $\eta$ of $E_1/F_1$ into $R_2/F_2$ sending $w$ into $\bar{w}$. Now $\eta(s_{i+1}) - \eta(s_i) = \eta\big((\sqrt{s_{i+1} - s_i})^2\big) = \big(\eta(\sqrt{s_{i+1} - s_i})\big)^2 > 0$. Hence $\eta$ preserves the order of the $s_i$. Now let $r$ and $s$ be any two elements of $R_1$ and apply the result just proved to the finite set $S$ consisting of the roots in $R_1$ of the minimum polynomials of $r$, $s$, $r + s$, and $rs$. Since $\eta$ preserves the order of the elements of $S$, $\eta(r) = \Sigma(r)$, $\eta(s) = \Sigma(s)$, $\eta(r + s) = \Sigma(r + s)$, and $\eta(rs) = \Sigma(rs)$. Hence $\Sigma(r + s) = \eta(r + s) = \eta(r) + \eta(s) = \Sigma(r) + \Sigma(s)$ and similarly $\Sigma(rs) = \Sigma(r)\Sigma(s)$. Thus $\Sigma$ is an isomorphism.

It remains to show that $\Sigma$ is unique and is order preserving. Hence let $\Sigma'$ be an isomorphism of $R_1$ onto $R_2$. Since $\Sigma'$ maps squares into squares and the subsets of positive elements of the $R_i$ are the sets of non-zero squares, it is clear that $\Sigma'$ preserves order. Suppose also that $\Sigma'$ is an extension of $\sigma$. Then it is clear from the definition of $\Sigma$ that $\Sigma' = \Sigma$. This completes the proof. □

If R1 and R2 are two real closures of an ordered field F, then the identity map on F can be extended in a unique manner to an order isomorphism of R1 onto R2. In this sense there is a unique real closure of F.

It is easily seen that the field $\mathbb{Q}$ of rational numbers has a unique ordering. Its real closure $\mathbb{R}_0$ is called the field of real algebraic numbers. The field $\mathbb{C}_0 = \mathbb{R}_0(\sqrt{-1})$ is the algebraic closure of $\mathbb{Q}$. This is the field of algebraic numbers.

EXERCISES

        1. Let F be an ordered field, E a real closed extension field whose order is an extension of the order of F. Show that E contains a real closure of F.

        2. Let $F$ be an ordered field, $E$ an extension field such that the only relations of the form $\sum a_i b_i^2 = 0$ with $a_i > 0$ in $F$ and $b_i \in E$ are those in which every $b_i = 0$. Show that $E$ can be ordered so that its ordering is an extension of that of $F$.

11.3   TOTALLY POSITIVE ELEMENTS

An interesting question concerning fields is: what elements of a given field can be written as sums of squares? We consider this question in this section and in the next two sections; we first obtain a general criterion based on the following definition.

DEFINITION 11.3.   An element a of a field F is called totally positive if a > 0 in every ordering of F.

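It is understood that if $F$ has no ordering, then every element of $F$ is totally positive. Hence this is the case if $F$ is not formally real. If $F$ is not formally real, then $-1 = \sum a_i^2$ for $a_i \in F$, and if char $F \ne 2$, then the relation

$$a = \left(\frac{1 + a}{2}\right)^2 - \left(\frac{1 - a}{2}\right)^2 = \left(\frac{1 + a}{2}\right)^2 + \Big(\sum a_i^2\Big)\left(\frac{1 - a}{2}\right)^2$$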

shows that every element of F is a sum of squares. We shall need this remark in the proof of the following criterion.

THEOREM 11.5.   Let F be a field of characteristic ≠ 2. Then an element a ≠ 0 in F is totally positive if and only if a is a sum of squares.

Proof.   If $0 \ne a = \sum a_i^2$, then evidently $a > 0$ in every ordering of $F$. Conversely, assume that $a \ne 0$ is not a sum of squares in $F$. Let $\bar{F}$ be an algebraic closure of $F$ and consider the set of subfields $E$ of $\bar{F}/F$ in which $a$ is not a sum of squares. By Zorn’s lemma, there is a maximal one; call it $R$. By the preceding remark, $R$ is formally real and hence $R$ can be ordered. We claim that $-a$ is a square in $R$. Otherwise, the subfield $R(\sqrt{-a})$ of $\bar{F}/F$ properly contains $R$, so $a$ is a sum of squares in $R(\sqrt{-a})$. Hence we have $b_i, c_i \in R$ such that $a = \sum (b_i + c_i\sqrt{-a})^2$. This gives $\sum b_i c_i = 0$ and $a = \sum b_i^2 - a\sum c_i^2$. Hence $a(1 + \sum c_i^2) = \sum b_i^2$ and $1 + \sum c_i^2 \ne 0$, since $R$ is formally real. Then if $c = 1 + \sum c_i^2$,

$$a = c^{-1}\sum_i b_i^2 = c^{-2}\Big(1 + \sum_j c_j^2\Big)\Big(\sum_i b_i^2\Big) = \sum_i (b_i c^{-1})^2 + \sum_{i,j} (b_i c_j c^{-1})^2,$$

so $a$ is a sum of squares in $R$, contrary to the definition of $R$. Thus $-a = b^2$ for a $b \in R$ and hence $a = -b^2$ is negative in every ordering of $R$. These orderings give orderings of $F$ and so we have orderings of $F$ in which $a < 0$. Thus $a$ is not totally positive. □
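The remark preceding Theorem 11.5, that every element of a non-formally-real field of characteristic $\ne 2$ is a sum of squares, can be checked numerically. The sketch below assumes the field is the integers modulo an odd prime $p$; the function name `as_sum_of_squares` is hypothetical, and the representation of $-1$ as a sum of two squares is found by brute force.

```python
def as_sum_of_squares(a, p):
    """Write a modulo the odd prime p as a sum of three squares.

    Uses a = ((1+a)/2)^2 + (s^2 + t^2) * ((1-a)/2)^2, where
    s^2 + t^2 = -1 (mod p), so that the second term is -((1-a)/2)^2.
    Returns a list [x, y, z] with x^2 + y^2 + z^2 = a (mod p).
    """
    # Brute-force search for -1 = s^2 + t^2 in the integers modulo p.
    s, t = next(
        (s, t)
        for s in range(p)
        for t in range(p)
        if (s * s + t * t + 1) % p == 0
    )
    inv2 = (p + 1) // 2          # inverse of 2 modulo an odd prime p
    u = (1 + a) * inv2 % p       # (1 + a)/2
    v = (1 - a) * inv2 % p       # (1 - a)/2
    return [u, s * v % p, t * v % p]


if __name__ == "__main__":
    p = 7
    for a in range(p):
        xs = as_sum_of_squares(a, p)
        print(a, xs, sum(x * x for x in xs) % p)
```

The last printed column should reproduce the first, since $u^2 - v^2 = a$ and $(s^2 + t^2)v^2 \equiv -v^2$.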

We shall apply this result first to determine which elements of a number field $F$ are sums of squares. We have $F = \mathbb{Q}(r)$ where $r$ is algebraic over $\mathbb{Q}$. If $\mathbb{R}_0$ is the field of real algebraic numbers (= the real closure of $\mathbb{Q}$), then $\mathbb{C}_0 = \mathbb{R}_0(\sqrt{-1})$ is an algebraic closure of $\mathbb{Q}$ and of $F$. If $n = [F : \mathbb{Q}]$, then we have $n$ distinct monomorphisms of $F/\mathbb{Q}$ into $\mathbb{C}_0/\mathbb{Q}$. These are determined by the maps $r \mapsto r_i$, $1 \le i \le n$, where the $r_i$ are the roots of the minimum polynomial $g(x)$ of $r$ over $\mathbb{Q}$. Let $r_1, \ldots, r_h$ be the $r_i$ contained in $\mathbb{R}_0$. We call these the real conjugates of $r$ and we agree to put $h = 0$ if no $r_i \in \mathbb{R}_0$. Let $\sigma_i$, $1 \le i \le h$, be the monomorphism of $F/\mathbb{Q}$ such that $\sigma_i r = r_i$. Each $\sigma_i$ defines an ordering of $F$ by declaring that $a > 0$ in $F$ if $\sigma_i a > 0$ in the unique ordering of $\mathbb{R}_0$. We claim that these $h$ orderings are distinct and that they are the only orderings of $F$. First, suppose that the orderings defined by $\sigma_i$ and $\sigma_j$ are the same. Then $\sigma_j\sigma_i^{-1}$ is an order-preserving isomorphism of the subfield $\mathbb{Q}(r_i)$ of $\mathbb{R}_0$ onto the subfield $\mathbb{Q}(r_j)$. Since $\mathbb{R}_0$ is algebraic over $\mathbb{Q}(r_i)$ and $\mathbb{Q}(r_j)$, $\mathbb{R}_0$ is the real closure of these fields. Hence, by Theorem 11.4, $\sigma_j\sigma_i^{-1}$ can be extended to an automorphism $\sigma$ of $\mathbb{R}_0$. Since $\mathbb{R}_0$ is the real closure of $\mathbb{Q}$, it follows also from Theorem 11.4 that the only automorphism of $\mathbb{R}_0$ is the identity. Hence $\sigma = 1$ and $\sigma_j = \sigma_i$. Next suppose that we have an ordering of $F$ and let $R$ be the real closure defined by this ordering. Since $R$ is algebraic over $\mathbb{Q}$, $R$ is also a real closure of $\mathbb{Q}$. Hence we have an order isomorphism of $R/\mathbb{Q}$ onto $\mathbb{R}_0/\mathbb{Q}$. The restriction of this to $F$ coincides with one of the $\sigma_i$ and hence the given ordering coincides with the one defined by this $\sigma_i$.

It is clear that the $\sigma_i$ that we have defined can be described as the monomorphisms of $F$ into $\mathbb{R}_0$. Hence we have proved the following

THEOREM 11.6.   Let $F$ be an algebraic number field, $\mathbb{R}_0$ the field of real algebraic numbers. Then we have a 1 – 1 correspondence between the set of orderings of $F$ and the set of monomorphisms of $F$ into $\mathbb{R}_0$. The ordering determined by the monomorphism $\sigma_i$ is that in which $a > 0$ for $a \in F$ if $\sigma_i a > 0$ in $\mathbb{R}_0$.

An immediate consequence of Theorems 11.5 and 11.6 is the following result due to Hilbert and E. Landau:

THEOREM 11.7.   Let $F$ be an algebraic number field and let $\sigma_1, \ldots, \sigma_h$ ($h \ge 0$) be the different monomorphisms of $F$ into the field $\mathbb{R}_0$ of real algebraic numbers. Then an element $a \in F$ is a sum of squares in $F$ if and only if $\sigma_i a > 0$ for $1 \le i \le h$.
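As an illustration of Theorem 11.7, consider a real quadratic field $\mathbb{Q}(\sqrt{d})$ with $d > 1$ a non-square integer (an assumption made for this sketch). It has $h = 2$ monomorphisms into $\mathbb{R}_0$, sending $\sqrt{d}$ to $\pm\sqrt{d}$, and the sign of $a + b\sqrt{d}$ under each can be decided exactly with rational arithmetic. The function names below are hypothetical.

```python
from fractions import Fraction


def sign_in_embedding(a, b, d, eps):
    """Sign (-1, 0 or 1) of a + eps*b*sqrt(d) as a real number.

    a, b are rationals, d > 1 is a non-square integer, eps is +1 or -1
    and selects one of the two real embeddings of Q(sqrt(d)).
    """
    a, b = Fraction(a), eps * Fraction(b)
    if a == 0 and b == 0:
        return 0
    if a >= 0 and b >= 0:
        return 1
    if a <= 0 and b <= 0:
        return -1
    # a and b have opposite non-zero signs: compare a^2 with b^2 * d.
    # Equality is impossible because d is not a square.
    lhs, rhs = a * a, b * b * d
    if a > 0:
        return 1 if lhs > rhs else -1
    return 1 if rhs > lhs else -1


def is_totally_positive(a, b, d):
    """True if a + b*sqrt(d) is positive under both real embeddings."""
    return all(sign_in_embedding(a, b, d, eps) == 1 for eps in (1, -1))


if __name__ == "__main__":
    print(is_totally_positive(3, 2, 2))   # 3 + 2*sqrt(2) = (1 + sqrt(2))^2
    print(is_totally_positive(1, 1, 2))   # 1 + sqrt(2), conjugate 1 - sqrt(2) < 0
```

By the theorem, the first element is a sum of squares in $\mathbb{Q}(\sqrt{2})$ (indeed it is a square), while the second is not, although it is positive in one of the two orderings.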

EXERCISES

        1. Let $F$ be an ordered field, $E$ an extension field. Show that if $b$ is an element of $E$ that cannot be written in the form $\sum a_i b_i^2$ for $a_i \ge 0$ in $F$ and $b_i \in E$, then there exists an ordering of $E$ extending the ordering of $F$ in which $b < 0$.

        2. Let F be an ordered field, R the real closure of F, and E a finite dimensional extension of F. Prove the following generalization of Theorem 11.6: There is a 1 – 1 correspondence between the set of orderings of E extending the ordering of F and the set of monomorphisms of E/F into R/F.

        3. (I. Kaplansky-M. Kneser.) Let $F$ be a field of characteristic $\ne 2$ that is not formally real. Suppose that $|F^*/F^{*2}| = n < \infty$. Show that any non-degenerate quadratic form in $n$ variables is universal. (Sketch of proof: Let $a_1, a_2, \ldots$ be a sequence of non-zero elements of $F$ and let $M_k$ denote the set of values of the quadratic form $a_1 x_1^2 + \cdots + a_k x_k^2$ for $x_i \in F$. Show that there exists a $k \le n$ such that $M_{k+1} = M_k$. Then $M_{k+1} = a_{k+1}F^2 + M_k = M_k$ where $F^2$ is the set of squares of elements of $F$. Iteration of $M_k = a_{k+1}F^2 + M_k$ gives $M_k = a_{k+1}(F^2 + \cdots + F^2) + M_k$. Hence conclude that $M_k = F + M_k$ and $M_k = F$.)

11.4   HILBERT’S SEVENTEENTH PROBLEM

One of the problems in his list of twenty-three unsolved problems that Hilbert proposed in an address before the 1900 Paris International Congress of Mathematicians was the problem on positive semi-definite rational functions: Let $f$ be a rational function of $n$ real variables with rational coefficients that is positive semi-definite in the sense that $f(a_1, \ldots, a_n) \ge 0$ for all real $(a_1, \ldots, a_n)$ for which $f$ is defined. Then is $f$ necessarily a sum of squares of rational functions with rational coefficients? By a rational function of $n$ real variables with rational coefficients we mean a map of the form $(a_1, \ldots, a_n) \mapsto f(a_1, \ldots, a_n)$ where $f(x_1, \ldots, x_n) = g(x_1, \ldots, x_n)/h(x_1, \ldots, x_n)$ and $g$ and $h$ are polynomials in the indeterminates $x_i$ with rational coefficients. The domain of definition of the function is a Zariski open subset defined by $h(a_1, \ldots, a_n) \ne 0$ and two rational functions are regarded as equal if they yield the same values for every point of a Zariski open subset.

In 1927, making essential use of the Artin-Schreier theory of formally real fields, Artin gave an affirmative answer to Hilbert’s question by proving the following stronger result:

THEOREM OF ARTIN.   Let $F$ be a subfield of $\mathbb{R}$ that has a unique ordering and let $f$ be a rational function with coefficients in $F$ such that $f(a_1, \ldots, a_n) \ge 0$ for all $a_i \in F$ for which $f$ is defined. Then $f$ is a sum of squares of rational functions with coefficients in $F$.

Examples of fields having a unique ordering are $\mathbb{Q}$, $\mathbb{R}$, and any number field that has only one real conjugate field.

The condition that $F$ is a subfield of $\mathbb{R}$ in Artin’s theorem can be replaced by the hypothesis that $F$ is archimedean ordered. It is easily seen that this condition is equivalent to the assumption that $F \subset \mathbb{R}$ of Artin’s theorem. We shall prove a result that is somewhat stronger than Artin’s in that the archimedean property of $F$ will be replaced by a condition of density in the real closure. If $F$ is a subfield of an ordered field $E$, then $F$ is said to be dense in $E$ if for any two elements $a < b$ in $E$ there exists a $c \in F$ such that $a < c < b$. It is easily seen that $\mathbb{Q}$ is dense in this sense in $\mathbb{R}$ and this implies that any subfield of $\mathbb{R}$ is dense in $\mathbb{R}$. Hence the following theorem is a generalization of Artin’s theorem.

THEOREM 11.8.   Let $F$ be an ordered field such that (1) $F$ has a unique ordering and (2) $F$ is dense in its real closure. Let $f$ be a rational function with coefficients in $F$ such that $f(a_1, \ldots, a_n) \ge 0$ for all $(a_1, \ldots, a_n) \in F^{(n)}$ for which $f$ is defined. Then $f$ is a sum of squares of rational functions with coefficients in $F$.

We shall give a model theoretic type of proof of Theorem 11.8 based on the following result, which was proved in BAI, p. 340.

Let R1 and R2 be real closed fields having a common ordered subfield F, that is, the orderings on F induced by R1 and R2 are identical. Suppose that we have a finite set S of polynomial equations, inequations (f ≠ 0), and inequalities (f > 0) with coefficients in F. Then S has a solution in R1 if and only if it has a solution in R2.

We now proceed to the

Proof of Theorem 11.8. The set of rational functions of $n$ variables with coefficients in $F$ forms a field with respect to the usual definitions of addition and multiplication. If $p_i$, $1 \le i \le n$, denotes the function such that $p_i(a_1, \ldots, a_n) = a_i$, then we have an isomorphism of the field $F(x_1, \ldots, x_n)$, $x_i$ indeterminates, onto the field of rational functions such that $x_i \mapsto p_i$. Accordingly, the latter field is $F(p_1, \ldots, p_n)$. Suppose that $f \in F(p_1, \ldots, p_n)$ is not a sum of squares. Then by Theorem 11.5, there exists an ordering of $F(p_1, \ldots, p_n)$ such that $f < 0$. Write $f = gh^{-1}$ where $g, h \in F[p_1, \ldots, p_n]$. Then $gh = fh^2 < 0$, so if $k(x_1, \ldots, x_n) = g(x_1, \ldots, x_n)h(x_1, \ldots, x_n)$, then $k(p_1, \ldots, p_n) < 0$ and the inequality $k(x_1, \ldots, x_n) < 0$ has the solution $(p_1, \ldots, p_n)$ in $F(p_1, \ldots, p_n)$ and a fortiori in the real closure $R$ of $F(p_1, \ldots, p_n)$. Now consider the real closure $R_0$ of $F$. Since $F$ has a unique ordering, the orderings of $F$ in $R_0$ and in $R$ are identical. Moreover, $k(x_1, \ldots, x_n) \in F[x_1, \ldots, x_n]$. Hence by the result quoted, there exist $a_i \in R_0$ such that $k(a_1, \ldots, a_n) < 0$ and hence such that $f(a_1, \ldots, a_n) < 0$. The proof will be completed by showing that the $a_i$ can be chosen in $F$.

LEMMA.   Let $F$ be an ordered field that is dense in its real closure $R$ and suppose that for $k(x_1, \ldots, x_n) \in R[x_1, \ldots, x_n]$ there exist $a_i \in R$ such that $k(a_1, \ldots, a_n) < 0$. Then there exist $b_i \in F$ such that $k(b_1, \ldots, b_n) < 0$.

Proof.   We use induction on $n$. If $n = 1$, let $a'$ be chosen $< a$ ($= a_1$) so that the interval $[a', a]$ contains no root of $k$. Then $k(x) < 0$ for all $x$ in $[a', a]$ (BAI, p. 310) and we may choose $x = b \in F$ in $[a', a]$. Then $k(b) < 0$. Now assume the result for $n - 1$ variables. Then the one-variable case shows that there exists a $b_1 \in F$ such that $k(b_1, a_2, \ldots, a_n) < 0$ and the $(n - 1)$-variable case implies that there exist $b_2, \ldots, b_n \in F$ such that $k(b_1, b_2, \ldots, b_n) < 0$. □

Remark. It has been shown by K. McKenna that the following converse of Theorem 11.8 holds: If $F$ is an ordered field such that any rational function $f$ that is positive semi-definite in the sense that $f(a_1, \ldots, a_n) \ge 0$ for all $(a_1, \ldots, a_n)$ for which $f$ is defined is a sum of squares, then $F$ is uniquely ordered and is dense in its real closure.

EXERCISES

        1. (J. Keisler.) Let F be ordered and let the extension field F(x), x transcendental, be ordered as in exercise 4, p. 634. Show that F(x) is not dense in its real closure by showing that there is no element of F(x) in the interval images.

        2. Let $R$ be real closed and let $f(x) \in R[x]$ satisfy $f(a) \ge 0$ for all $a \in R$. Show that $f(x)$ is a sum of two squares in $R[x]$.

        3. (C. Procesi.) Let $R$ be a real closed field and let $V$ be an irreducible algebraic variety defined over $R$, $F$ the field of rational functions on $V$ (see exercise 7, pp. 429–430). Let $h_1, \ldots, h_k \in F$ and let $X$ be the set of points $\rho$ of $V$ such that $h_i(\rho) \ge 0$. Show that if $g \in F$ satisfies $g(\rho) \ge 0$ for all $\rho \in X$ on which $g$ is defined, then $g$ has the form

$$g = \sum{}' s_{i_1 \cdots i_j} h_{i_1} \cdots h_{i_j}$$

where $\sum'$ indicates summation on the indices between 1 and $k$ in strictly increasing order and the $s_{i_1 \cdots i_j}$ are sums of squares in $F$.

   (Sketch of proof. The conclusion is equivalent to the following: $g$ is a sum of squares in $F_1 = F(\sqrt{h_1}, \ldots, \sqrt{h_k})$. There is no loss in generality in assuming that $g$ and the $h_i$ are polynomials in the coordinate functions $(p_1, \ldots, p_n)$ (as in the proof of Artin’s theorem), that is, we have polynomials $g(x_1, \ldots, x_n)$, $h_j(x_1, \ldots, x_n)$, $1 \le j \le k$, in indeterminates $x_i$ with coefficients in $R$ such that $g = g(p_1, \ldots, p_n)$, $h_j = h_j(p_1, \ldots, p_n)$. If $g$ is not a sum of squares in $F_1$, then there exists an ordering of $F_1$ such that $g < 0$. Let $f_1(x_1, \ldots, x_n), \ldots, f_m(x_1, \ldots, x_n)$ be generators of the prime ideal in $R[x_1, \ldots, x_n]$ defining $V$. Then we have $f_1(p_1, \ldots, p_n) = 0, \ldots, f_m(p_1, \ldots, p_n) = 0$, $h_1(p_1, \ldots, p_n) \ge 0, \ldots, h_k(p_1, \ldots, p_n) \ge 0$, $g(p_1, \ldots, p_n) < 0$ in $F_1$ and hence in a real closure $R_1$ of $F_1$. Consequently, we have $(a_1, \ldots, a_n)$, $a_i \in R$, such that $f_1(a_1, \ldots, a_n) = 0, \ldots, f_m(a_1, \ldots, a_n) = 0$, $h_1(a_1, \ldots, a_n) \ge 0, \ldots, h_k(a_1, \ldots, a_n) \ge 0$, $g(a_1, \ldots, a_n) < 0$. This contradicts the hypothesis.)

        4. (J. J. Sylvester.) Let f(x) ∈ R[x], R real closed, x an indeterminate, and let A = R[x]/(f(x)). Let T(a, b) be the trace bilinear form on A/R and Q(a) = T(a, a) the corresponding quadratic form. Show that the number of distinct roots of f(x) in R is the signature of Q (BAI, p. 359).

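        5. Let $F$ be a field, $f(x)$ a monic polynomial in $F[x]$, and let $f(x) = \prod_{i=1}^{n}(x - r_i)$ in a splitting field $E/F$ of $f(x)$. Put $s_j = \sum_{i=1}^{n} r_i^j$, so $s_j \in F$. Then the matrix

$$\begin{pmatrix} s_0 & s_1 & \cdots & s_{n-1} \\ s_1 & s_2 & \cdots & s_n \\ \vdots & \vdots & & \vdots \\ s_{n-1} & s_n & \cdots & s_{2n-2} \end{pmatrix}$$

is called the Bézoutiant of $f(x)$. Show that the determinant of the Bézoutiant is the discriminant of the trace form $T(a, b)$ on $A = F[x]/(f(x))$ determined by the base $(\bar{1}, \bar{x}, \ldots, \bar{x}^{n-1})$ where $\bar{x}^i = x^i + (f(x))$.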

        6. (Sylvester.) Notations as in exercises 4 and 5. Let $b_k$ denote the sum of the $k$-rowed principal (= diagonal) minors of the Bézoutiant of $f(x)$ (hence, the characteristic polynomial of the Bézoutiant is $x^n - b_1 x^{n-1} + \cdots + (-1)^n b_n$). Show that all the roots of $f(x)$ are in $R$ if and only if every $b_i \ge 0$.

        7. (Procesi.) Let $R$ be a real closed field and let $g$ be a symmetric rational function of $n$ variables with coefficients in $R$ such that $g(a_1, \ldots, a_n) \ge 0$ for all $(a_1, \ldots, a_n) \in R^{(n)}$ on which $g$ is defined. Let $b_k$ denote the sum of the $k$-rowed minors of the Bézoutiant of $f(x) = \prod_{i=1}^{n}(x - x_i)$. Show that $g$ has the form $\sum{}' s_{i_1 \cdots i_j} b_{i_1} \cdots b_{i_j}$, where $\sum'$ indicates summation on the indices between 1 and $k$ in strictly increasing order and the $s_{i_1 \cdots i_j}$ are sums of squares of symmetric functions.

11.5   PFISTER THEORY OF QUADRATIC FORMS

If an element of a field is a sum of squares, can we assert that it is a sum of $n$ squares for a specified $n$? It is not difficult to see that if $R$ is a real closed field, then any element of $R(x)$ that is a sum of squares is a sum of two squares. Hilbert showed that if a rational function of two variables over $R$ is positive semi-definite, then it is a sum of four squares. In the next section we shall prove Pfister’s theorem that if $R$ is real closed, any element of $R(x_1, \ldots, x_n)$ that is a sum of squares is a sum of $2^n$ squares. We shall also sketch in the exercises of section 11.6 a proof of a theorem of Cassels that there exist elements in $R(x_1, \ldots, x_n)$ that are sums of squares but cannot be written as sums of fewer than $n$ squares. The exact value of the number $k(n)$ such that every element in $R(x_1, \ldots, x_n)$ that is a sum of squares is a sum of $k(n)$ squares is at present unknown. The results we have indicated give the inequalities $n \le k(n) \le 2^n$.

The proof of Pfister’s theorem is based on some results on quadratic forms that are of considerable independent interest. We devote this section to the exposition of these results.

We deal exclusively with quadratic forms Q on finite dimensional vector spaces V over a field F of characteristic ≠ 2. As usual, we write B(x, y) = Q(x + y) – Q(x) – Q(y). Since the characteristic is ≠ 2, it is preferable to replace B(x, y) by $Q(x, y) = \frac{1}{2}B(x, y)$. Then Q(x, x) = Q(x). We shall now indicate that V has an orthogonal base $(u_1,\ldots,u_n)$ relative to Q such that $Q(u_i) = b_i$ by writing

$$Q \sim \mathrm{diag}\{b_1, b_2, \ldots, b_n\}.$$

We shall also write

$$\mathrm{diag}\{b_1,\ldots,b_n\} \sim \mathrm{diag}\{c_1,\ldots,c_n\}$$

if the quadratic forms $\sum_1^n b_ix_i^2$ and $\sum_1^n c_ix_i^2$ are equivalent.

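We now introduce some concepts and results on quadratic forms that we have not considered before. First, we consider the tensor product of quadratic forms. Let $V_i$, i = 1, 2, be a vector space over F, $Q_i$ a quadratic form on $V_i$. We shall show that there is a unique quadratic form $Q_1 \otimes Q_2$ on $V_1 \otimes V_2$ such that

$$(Q_1 \otimes Q_2)(v_1 \otimes v_2) = Q_1(v_1)Q_2(v_2) \tag{5}$$

for $v_i \in V_i$. Let $(u_1^{(i)},\ldots,u_{n_i}^{(i)})$ be a base for $V_i/F$. Then $(u_i^{(1)} \otimes u_j^{(2)})$ is a base for $V_1 \otimes V_2$ and we have a quadratic form Q on $V_1 \otimes V_2$ such that

$$Q(u_i^{(1)} \otimes u_j^{(2)},\, u_k^{(1)} \otimes u_l^{(2)}) = Q_1(u_i^{(1)}, u_k^{(1)})\,Q_2(u_j^{(2)}, u_l^{(2)}).$$

If $v_1 = \sum a_iu_i^{(1)}$ and $v_2 = \sum b_ju_j^{(2)}$ then

$$Q(v_1 \otimes v_2) = \sum_{i,j,k,l} a_ib_ja_kb_l\, Q_1(u_i^{(1)}, u_k^{(1)})\,Q_2(u_j^{(2)}, u_l^{(2)}) = Q_1(v_1)Q_2(v_2).$$

Putting $Q = Q_1 \otimes Q_2$ we have (5) and since the vectors $v_1 \otimes v_2$ span $V_1 \otimes V_2$, it is clear that $Q_1 \otimes Q_2$ is unique.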

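If $(u_1^{(i)},\ldots,u_{n_i}^{(i)})$ is an orthogonal base for $V_i$, then $(u_j^{(1)} \otimes u_k^{(2)})$ is an orthogonal base for $V_1 \otimes V_2$ and if

$$Q_i \sim \mathrm{diag}\{b_1^{(i)},\ldots,b_{n_i}^{(i)}\}, \quad i = 1, 2,$$

then

$$Q_1 \otimes Q_2 \sim \mathrm{diag}\{b_1^{(1)}b_1^{(2)},\ldots,b_1^{(1)}b_{n_2}^{(2)},\; b_2^{(1)}b_1^{(2)},\ldots,b_{n_1}^{(1)}b_{n_2}^{(2)}\} \tag{7}$$

$$= \mathrm{diag}\{b_1^{(1)},\ldots,b_{n_1}^{(1)}\} \otimes \mathrm{diag}\{b_1^{(2)},\ldots,b_{n_2}^{(2)}\},$$

where the tensor product of matrices is as defined on p. 250.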
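As a small numerical check of (5) and (7) in the diagonal case, the following Python sketch builds the diagonal entries of a tensor product and compares both sides of (5) for one pair of vectors. The helper names `q_diag`, `tensor_diag`, and `tensor_vec` and the sample entries are illustrative assumptions, not notation from the text; rational numbers stand in for a general field of characteristic ≠ 2.

```python
from fractions import Fraction
from itertools import product

def q_diag(diag, v):
    """Value of the diagonal quadratic form sum(b_i * x_i^2) at the vector v."""
    return sum(b * x * x for b, x in zip(diag, v))

def tensor_diag(d1, d2):
    """Diagonal entries of Q1 (x) Q2 in the base u_j (x) u_k (second index varies fastest)."""
    return [b * c for b, c in product(d1, d2)]

def tensor_vec(v1, v2):
    """Coordinates of v1 (x) v2 in the same base."""
    return [a * b for a, b in product(v1, v2)]

# Sample data (assumed for illustration only).
d1 = [Fraction(1), Fraction(-3)]
d2 = [Fraction(2), Fraction(5), Fraction(7)]
v1 = [Fraction(1, 2), Fraction(4)]
v2 = [Fraction(3), Fraction(-1), Fraction(2, 3)]

lhs = q_diag(tensor_diag(d1, d2), tensor_vec(v1, v2))  # (Q1 (x) Q2)(v1 (x) v2)
rhs = q_diag(d1, v1) * q_diag(d2, v2)                  # Q1(v1) * Q2(v2)
assert lhs == rhs
```

The check only exercises decomposable vectors $v_1 \otimes v_2$, which is all that (5) asserts; a general vector of $V_1 \otimes V_2$ is a sum of such vectors.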

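In a similar manner, we can define the tensor product of more than two quadratic forms and we have the following generalization of (7): If $Q_i \sim \mathrm{diag}\{b_1^{(i)},\ldots,b_{n_i}^{(i)}\}$, $1 \le i \le r$, then

$$Q_1 \otimes \cdots \otimes Q_r \sim \mathrm{diag}\{b_1^{(1)},\ldots,b_{n_1}^{(1)}\} \otimes \cdots \otimes \mathrm{diag}\{b_1^{(r)},\ldots,b_{n_r}^{(r)}\}.$$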

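It is also convenient to define the direct sum of two quadratic forms. If $Q_i$, i = 1, 2, is a quadratic form on $V_i$ then $Q_1 \oplus Q_2$ is defined to be the quadratic form on $V_1 \oplus V_2$ such that

$$(Q_1 \oplus Q_2)(v_1, v_2) = Q_1(v_1) + Q_2(v_2), \quad v_i \in V_i.$$

It is clear that $Q_1 \oplus Q_2$ is well-defined and if $V_1$ and $V_2$ are identified in the usual way with subspaces of $V_1 \oplus V_2$, then these subspaces are orthogonal relative to the bilinear form of $Q_1 \oplus Q_2$. If $(u_1^{(i)},\ldots,u_{n_i}^{(i)})$ is an orthogonal base for $V_i$, then $(u_1^{(1)},\ldots,u_{n_1}^{(1)},u_1^{(2)},\ldots,u_{n_2}^{(2)})$ is an orthogonal base for $V_1 \oplus V_2$. We have

$$Q_1 \oplus Q_2 \sim \mathrm{diag}\{b_1^{(1)},\ldots,b_{n_1}^{(1)},\, b_1^{(2)},\ldots,b_{n_2}^{(2)}\}$$

if $Q_i \sim \mathrm{diag}\{b_1^{(i)},\ldots,b_{n_i}^{(i)}\}$. If B and C are matrices, it is convenient to denote the matrix $\begin{pmatrix} B & 0 \\ 0 & C \end{pmatrix}$ by $B \oplus C$. Using this notation we can rewrite

$$Q_1 \oplus Q_2 \sim \mathrm{diag}\{b_1^{(1)},\ldots,b_{n_1}^{(1)}\} \oplus \mathrm{diag}\{b_1^{(2)},\ldots,b_{n_2}^{(2)}\}.$$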

We now consider Pfister’s results on quadratic forms that yield the theorem on sums of squares stated at the beginning of the section. Our starting point is a weak generalization of A. Hurwitz’s theorem on sums of squares (see BAI, pp. 438–451). Hurwitz proved that there exist identities of the form $(\sum_1^n x_i^2)(\sum_1^n y_i^2) = \sum_1^n z_i^2$ where the $z_i$ depend bilinearly on the x’s and the y’s if and only if n = 1, 2, 4 or 8. Pfister has shown that for any n that is a power of two, the product of any two sums of n squares in a field F is a sum of n squares in F. Thus at the expense of dropping the requirement that the $z_i$ depend bilinearly on the x’s and y’s, we have $(\sum_1^n x_i^2)(\sum_1^n y_i^2) = \sum_1^n z_i^2$ for any n that is a power of two. More generally, we consider quadratic forms Q that are multiplicative in the sense that given any two vectors x and y there exists a vector z such that Q(x)Q(y) = Q(z). A stronger condition on Q is given in
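For n = 2 and n = 4 the bilinear identities are classical (complex and quaternion multiplication). The Python sketch below writes them out and checks them on a small grid of integers; the function names are assumptions made for this illustration, and the check is a sanity test rather than a proof.

```python
from itertools import product

def norm(v):
    """Sum of the squares of the entries of v."""
    return sum(t * t for t in v)

def two_square(x, y):
    """z with norm(x) * norm(y) = norm(z) for n = 2 (complex multiplication)."""
    x1, x2 = x
    y1, y2 = y
    return (x1 * y1 - x2 * y2, x1 * y2 + x2 * y1)

def four_square(x, y):
    """z with norm(x) * norm(y) = norm(z) for n = 4 (quaternion multiplication)."""
    a1, a2, a3, a4 = x
    b1, b2, b3, b4 = y
    return (a1 * b1 - a2 * b2 - a3 * b3 - a4 * b4,
            a1 * b2 + a2 * b1 + a3 * b4 - a4 * b3,
            a1 * b3 - a2 * b4 + a3 * b1 + a4 * b2,
            a1 * b4 + a2 * b3 - a3 * b2 + a4 * b1)

small = range(-1, 2)
for x, y in product(product(small, repeat=2), repeat=2):
    assert norm(x) * norm(y) == norm(two_square(x, y))
for x, y in product(product(small, repeat=4), repeat=2):
    assert norm(x) * norm(y) == norm(four_square(x, y))
```

In both functions every $z_i$ is bilinear in the x’s and y’s, which is the situation covered by Hurwitz’s theorem; Pfister’s result for larger powers of two gives up this bilinearity.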

DEFINITION 11.4.   A quadratic form Q is said to be strongly multiplicative if Q is equivalent to cQ for any c ≠ 0 represented by Q.

This means that there exists a bijective linear transformation η of V such that cQ(x) = Q(ηx) for all x. Then if Q(y) = c, Q(x)Q(y) = Q(z) for z = ηx; hence Q strongly multiplicative implies Q multiplicative. If Q is of maximal Witt index (BAI, p. 369) on an even dimensional space, then Q can be given coordinate-wise as $Q(x) = \sum_{i=1}^m x_ix_{m+i}$, and it is easily seen that Q is strongly multiplicative. On the other hand, the quadratic form $Q = x_1^2 - x_2^2 - x_3^2$ on a three-dimensional vector space over $\mathbb{R}$ is multiplicative but not strongly multiplicative. The fact that it is multiplicative follows from the observation that any form that is universal has this property. On the other hand, Q represents –1 and Q is not equivalent to –Q by Sylvester’s theorem (BAI, p. 359). For any quadratic form Q on a vector space V/F we let $F_Q^*$ denote the set of non-zero elements of F represented by Q. It is clear that $F_Q^*$ is closed under multiplication if and only if Q is multiplicative. Since Q(x) = c ≠ 0 implies $Q(c^{-1}x) = c^{-1}$, it is clear that Q is multiplicative if and only if $F_Q^*$ is a subgroup of F*.

We proceed to derive Pfister’s results. We give first a criterion for equivalence in the binary case.

LEMMA 1.   $\mathrm{diag}\{b_1, b_2\} \sim \mathrm{diag}\{c_1, c_2\}$ if and only if $c_1$ is represented by $b_1x_1^2 + b_2x_2^2$ and $b_1b_2$ and $c_1c_2$ differ by a square factor.

Proof.   Since $b_1b_2$ is a discriminant of $Q = b_1x_1^2 + b_2x_2^2$ and the $c_i$ are represented by any form equivalent to $\mathrm{diag}\{c_1, c_2\}$, it is clear that the conditions are necessary. Now suppose they hold. By the first condition we have a vector y such that $Q(y) = c_1$. Then $\mathrm{diag}\{b_1, b_2\} \sim \mathrm{diag}\{c_1, c\}$ and $c_1c = k^2b_1b_2$, $k \in F^*$. Also $c_1c_2 = l^2b_1b_2$, $l \in F^*$. Hence $c_2 = n^2c$, $n \in F^*$, and we can replace c by $c_2$. Thus $\mathrm{diag}\{b_1, b_2\} \sim \mathrm{diag}\{c_1, c_2\}$. □
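The step "Then diag {b1, b2} ~ diag {c1, c}" can be made explicit. One possible choice, assumed here for illustration, is to complete y = (y1, y2) to an orthogonal base with w = (–b2y2, b1y1); then Q(w) = b1b2c1, so c = b1b2c1 works. The Python sketch below checks this with integer data; the names `form` and `rediagonalize` are hypothetical.

```python
def form(b1, b2, u, v):
    """Bilinear form Q(u, v) of Q = b1*x1^2 + b2*x2^2, normalized so Q(u, u) = Q(u)."""
    return b1 * u[0] * v[0] + b2 * u[1] * v[1]

def rediagonalize(b1, b2, y):
    """Given y with c1 = Q(y) != 0, return (c1, c, w) where w is orthogonal to y
    and c = Q(w), so that diag{b1, b2} ~ diag{c1, c}."""
    c1 = form(b1, b2, y, y)
    if c1 == 0:
        raise ValueError("y must satisfy Q(y) != 0")
    w = (-b2 * y[1], b1 * y[0])   # Q(y, w) = 0 by construction
    c = form(b1, b2, w, w)        # equals b1 * b2 * c1
    return c1, c, w

# Example: 5 = 1^2 + 2^2 is represented by x1^2 + x2^2.
c1, c, w = rediagonalize(1, 1, (1, 2))
assert form(1, 1, (1, 2), w) == 0
assert c1 * c == (1 * 1) * c1 ** 2    # c1*c and b1*b2 differ by the square c1^2
```

With b1 = b2 = 1 and y = (1, 2) this gives diag {1, 1} ~ diag {5, 5} over the rationals.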

We prove next the key lemma:

LEMMA 2.   Let Q be a strongly multiplicative quadratic form, a an element of F*. Let $Q_a \sim \mathrm{diag}\{1, a\}$. Then $Q_a \otimes Q$ is strongly multiplicative.

Proof.   It is clear that $Q_a \otimes Q$ is equivalent to $Q \oplus aQ$. Hence, it suffices to show that the latter is strongly multiplicative. We now use the notation ~ also for equivalence of quadratic forms and if $Q_1 \sim \mathrm{diag}\{a, b\}$, then we denote $Q_1 \otimes Q_2$ by $\mathrm{diag}\{a, b\} \otimes Q_2 \sim aQ_2 \oplus bQ_2$. Let k be an element of F* represented by $Q \oplus aQ$, so k = b + ac where b and c are represented by Q (possibly trivially if b or c is 0). We distinguish three cases:

Case I. c = 0. Then k = b and Q ~ bQ. Hence $Q \oplus aQ \sim bQ \oplus abQ = b(Q \oplus aQ) = k(Q \oplus aQ)$.

Case II. b = 0. Then k = ac and $k(Q \oplus aQ) = kQ \oplus kaQ = acQ \oplus a^2cQ \sim aQ \oplus Q$ since cQ ~ Q by hypothesis and $Q \sim a^2Q$ for any $a \in F^*$. Thus $k(Q \oplus aQ) \sim Q \oplus aQ$.

Case III. bc ≠ 0. We have $Q \oplus aQ \sim bQ \oplus acQ \sim \mathrm{diag}\{b, ac\} \otimes Q$. Since k = b + ac is represented by $bx_1^2 + acx_2^2$ and bac and $k^2abc$ differ by a square, it follows from Lemma 1 that $\mathrm{diag}\{b, ac\} \sim \mathrm{diag}\{k, kabc\}$. Hence $\mathrm{diag}\{b, ac\} \otimes Q \sim \mathrm{diag}\{k, kabc\} \otimes Q \sim k\,\mathrm{diag}\{1, abc\} \otimes Q \sim kQ \oplus kabcQ \sim kQ \oplus kaQ = k(Q \oplus aQ)$.

In all cases we have that $Q \oplus aQ \sim k(Q \oplus aQ)$, so $Q \oplus aQ$ is strongly multiplicative. □

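It is clear that the quadratic form $Q_0 = x^2 \sim \mathrm{diag}\{1\}$ is strongly multiplicative. Hence iterated application of Lemma 2 gives

THEOREM 11.9.   If the $a_i \in F^*$, then

$$\mathrm{diag}\{1, a_1\} \otimes \mathrm{diag}\{1, a_2\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\}$$

is a strongly multiplicative quadratic form. In particular, $\sum_1^{2^n} x_i^2 \sim \mathrm{diag}\{1, 1\} \otimes \cdots \otimes \mathrm{diag}\{1, 1\}$ (n factors) is strongly multiplicative.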
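To see concretely what the forms in Theorem 11.9 look like, the following Python sketch lists the diagonal entries of diag {1, a1} ⊗ … ⊗ diag {1, an}: there is one entry for each subset of {a1, …, an}, namely the product of that subset. The function name `pfister_diag` is an assumption for this illustration.

```python
from functools import reduce
from itertools import product

def pfister_diag(a):
    """Diagonal entries of diag{1, a1} (x) ... (x) diag{1, an}, a list of length 2**len(a)."""
    return reduce(lambda d, ai: [x * y for x, y in product(d, (1, ai))], a, [1])

assert pfister_diag([2, 3]) == [1, 3, 2, 6]     # diag{1, 2} (x) diag{1, 3}
assert pfister_diag([1, 1, 1]) == [1] * 8       # the sum of 2**3 squares
```

The second assertion is the special case mentioned in the theorem: taking every $a_i = 1$ gives the sum of $2^n$ squares.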

We shall call the forms given in Theorem 11.9 Pfister forms of dimension $2^n$. We prove next a type of factorization theorem for such forms.

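THEOREM 11.10.   Write

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} = \mathrm{diag}\{1\} \oplus D$$

and let Q be a quadratic form such that Q ~ D. Suppose that $b_1 \ne 0$ is represented by Q. Then there exist $b_2,\ldots,b_n \in F^*$ such that

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \sim \mathrm{diag}\{1, b_1\} \otimes \cdots \otimes \mathrm{diag}\{1, b_n\}. \tag{12}$$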

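Proof.   By induction on n. If n = 1, then $b_1 = a_1c^2$ and hence $\mathrm{diag}\{1, a_1\} \sim \mathrm{diag}\{1, b_1\}$. Now assume the result for n and consider

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \otimes \mathrm{diag}\{1, a\} = \mathrm{diag}\{1\} \oplus D'.$$

Suppose that $b_1 \ne 0$ is represented by Q′ such that Q′ ~ D′. Then $b_1 = b_1' + ab$ where $b_1'$ is represented by Q ~ D and b is represented by $x^2 \oplus Q \sim \mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\}$.

Case I. b = 0. Then $b_1 = b_1' \ne 0$ and the induction hypothesis gives elements $b_2,\ldots,b_n \in F^*$ such that (12) holds. Then

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \otimes \mathrm{diag}\{1, a\} \sim \mathrm{diag}\{1, b_1\} \otimes \cdots \otimes \mathrm{diag}\{1, b_n\} \otimes \mathrm{diag}\{1, a\}.$$

Case II. $b_1' = 0$. Then b ≠ 0, $b_1 = ab$, and

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \otimes \mathrm{diag}\{1, a\} \sim \mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \otimes \mathrm{diag}\{1, ab\}.$$

Case III. $bb_1' \ne 0$. The equivalence established in Case II permits us to replace a by ab. Then by Case I we have $b_2,\ldots,b_n \in F^*$ such that

$$\mathrm{diag}\{1, a_1\} \otimes \cdots \otimes \mathrm{diag}\{1, a_n\} \otimes \mathrm{diag}\{1, a\} \sim \mathrm{diag}\{1, b_1'\} \otimes \mathrm{diag}\{1, b_2\} \otimes \cdots \otimes \mathrm{diag}\{1, b_n\} \otimes \mathrm{diag}\{1, ab\}.$$

Now, by Lemma 1,

$$\mathrm{diag}\{1, b_1'\} \otimes \mathrm{diag}\{1, ab\} \sim \mathrm{diag}\{1, b_1', ab, abb_1'\} \sim \mathrm{diag}\{1, b_1, b_1abb_1', abb_1'\} \sim \mathrm{diag}\{1, b_1\} \otimes \mathrm{diag}\{1, abb_1'\}.$$

Substituting this in the first equivalence displayed, we obtain the result in this case. □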

We are now ready to derive the main result, which concerns the representation of sums of squares by values of Pfister forms.

THEOREM 11.11.   Suppose that every Pfister form of dimension $2^n$ represents every non-zero sum of two squares in F. Then every Pfister form of dimension $2^n$ represents every non-zero sum of k squares in F for arbitrary k.

Proof.   By induction on k. Since any Pfister form represents 1, the case k = 1 is clear and the case k = 2 is our hypothesis. Now assume the result for k ≥ 2. It suffices to show that if Q is a Pfister form of dimension $2^n$ and a is a sum of k squares such that c = 1 + a ≠ 0, then c is represented by Q. This will follow if we can show that $Q \oplus (-c)Q$ represents 0 non-trivially. For then we shall have vectors u and v such that Q(u) = cQ(v) where either u ≠ 0 or v ≠ 0. If either Q(u) = 0 or Q(v) = 0, then both are 0 and so Q represents 0 non-trivially. Then Q is universal and hence represents c. If Q(u) ≠ 0 and Q(v) ≠ 0, then these are contained in $F_Q^*$ and hence $c = Q(u)Q(v)^{-1} \in F_Q^*$, so c is represented by Q. We now write $Q = x^2 \oplus Q'$. Since Q represents a, we have $a = a_1^2 + a'$ where Q′ represents a′. We have $\mathrm{diag}\{1, -c\} \otimes Q \sim x^2 \oplus Q' \oplus (-c)Q$, and $Q' \oplus (-c)Q$ represents $a' - c = -(1 + a_1^2)$. If this is 0, then c = a′ is represented by Q. Hence we may assume that $1 + a_1^2 \ne 0$. Then by Theorem 11.10, $\mathrm{diag}\{1, -c\} \otimes Q \sim \mathrm{diag}\{1, -1 - a_1^2\} \otimes Q''$ where Q″ is a Pfister form of dimension $2^n$. By the hypothesis, Q″ represents $1 + a_1^2$. It follows that $\mathrm{diag}\{1, -1 - a_1^2\} \otimes Q''$ represents 0 non-trivially. Then $Q \oplus (-c)Q \sim \mathrm{diag}\{1, -1 - a_1^2\} \otimes Q''$ represents 0 non-trivially. This completes the proof. □

11.6   SUMS OF SQUARES IN R(x1,…,xn), R A REAL CLOSED FIELD

We now consider the field $R(x_1,\ldots,x_n)$ of rational expressions in n indeterminates $x_1,\ldots,x_n$ over a real closed field R. We wish to show that Theorem 11.11 can be applied to the field $F = R(x_1,\ldots,x_n)$. For this purpose we need to invoke a theorem that was proved by C. C. Tsen in 1936 and was rediscovered by S. Lang in 1951. To state this we require the following

DEFINITION 11.5.   A field F is called a $C_i$-field if for any positive integer d, any homogeneous polynomial f with coefficients in F of degree d in $m > d^i$ indeterminates has a non-trivial zero in $F^{(m)}$.

By a non-trivial zero we mean an element $(a_1,\ldots,a_m) \in F^{(m)}$ such that $(a_1,\ldots,a_m) \ne (0,\ldots,0)$ and $f(a_1,\ldots,a_m) = 0$. The theorem we shall require is the

THEOREM OF TSEN-LANG.   If F is algebraically closed and the x’s are indeterminates, then $F(x_1,\ldots,x_n)$ is a $C_n$-field.

It is readily seen that any algebraically closed field is a $C_0$-field. Now suppose that F is not algebraically closed. Then there exists an extension field E/F such that [E:F] = n > 1 and n is finite. Let $(u_1,\ldots,u_n)$ be a base for E/F and let ρ be the regular matrix representation determined by this base (BAI, p. 424). Then if $u = \sum a_iu_i$, $a_i \in F$, we have $N_{E/F}(u) = \det(\sum a_i\rho(u_i)) \ne 0$ unless every $a_i = 0$. Let $x_1,\ldots,x_n$ be indeterminates and put $N(x_1,\ldots,x_n) = \det(\sum x_i\rho(u_i))$. This is a homogeneous polynomial of degree n and $N(a_1,\ldots,a_n) = N_{E/F}(u)$. Hence N has no zero except the trivial one (0, …, 0) and hence F is not a $C_0$-field. Thus a field is a $C_0$-field if and only if it is algebraically closed.
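The construction of N can be tried out on a concrete extension. The sketch below assumes F is the field of rational numbers and E = F(θ) with θ³ = 2, base (1, θ, θ²); these choices and the names `det`, `RHO`, and `norm_form` are illustrative and not taken from the text. It computes N(a, b, c) = det(aρ(1) + bρ(θ) + cρ(θ²)) and searches a small integer box for zeros.

```python
from itertools import product

def det(M):
    """Determinant by Laplace expansion along the first row (adequate for small matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

# Matrices of multiplication by 1, theta, theta^2 on the base (1, theta, theta^2),
# where theta^3 = 2; column c holds the coordinates of the image of the c-th base vector.
RHO = [
    [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
    [[0, 0, 2], [1, 0, 0], [0, 1, 0]],
    [[0, 2, 0], [0, 0, 2], [1, 0, 0]],
]

def norm_form(a):
    """N(a1, a2, a3) = det(a1*rho(1) + a2*rho(theta) + a3*rho(theta^2))."""
    M = [[sum(a[k] * RHO[k][r][c] for k in range(3)) for c in range(3)] for r in range(3)]
    return det(M)

# Homogeneous of degree 3 in 3 indeterminates, with only the trivial zero in the box searched.
zeros = [a for a in product(range(-3, 4), repeat=3) if norm_form(a) == 0]
assert zeros == [(0, 0, 0)]
```

Expanding the determinant gives N(a, b, c) = a³ + 2b³ + 4c³ – 6abc. The finite search is only a sanity check; that N has no non-trivial rational zero follows from the argument in the text.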

C1-fields were introduced by Artin, who called these fields quasi-algebraically closed. We have indicated in a series of exercises in BAI that any finite field is a C1-field (see exercises 2–7 on p. 137).

The proof of the Tsen-Lang theorem is an inductive one. The proof of the initial step that if F is algebraically closed then F(x) is C1 and the proof of the inductive step will make use of the following result, which is a consequence of Theorem 7.29, p. 453.

LEMMA 1.   Let F be an algebraically closed field, $f_1,\ldots,f_r$ polynomials without constant terms in n indeterminates with coefficients in F. If n > r, then the system of equations $f_1(x_1,\ldots,x_n) = 0, \ldots, f_r(x_1,\ldots,x_n) = 0$ has a non-trivial solution in $F^{(n)}$.

We need also an analogous result for systems of polynomial equations in a Ci-field, which we derive below. First, we give the following

DEFINITION 11.6.   A polynomial with coefficients in F is called normic of order i (> 0) for F if it is homogeneous of degree d > 1 in $d^i$ indeterminates and has only the trivial zero in F.

The argument used above shows that if F is not algebraically closed, then there exist normic polynomials of order 1 for F. If F is arbitrary and t is an indeterminate, then it is readily seen that F(t) is not algebraically closed (for example, $\sqrt{t} \notin F(t)$). Hence there exist normic polynomials of order 1 for F(t). We have also

LEMMA 2.   If there exists a normic polynomial of order i for F, then there exists a normic polynomial of order i + 1 for F(t), t an indeterminate.

Proof.   Let $N(x_1,\ldots,x_{d^i})$ be a normic polynomial of order i for F. The degree of N is d. We claim that

$$N(x_1,\ldots,x_{d^i}) + N(x_{d^i+1},\ldots,x_{2d^i})\,t + \cdots + N(x_{(d-1)d^i+1},\ldots,x_{d^{i+1}})\,t^{d-1} \tag{14}$$

is a normic polynomial of order i + 1 for F(t). Since this polynomial is homogeneous of degree d in $d^{i+1}$ indeterminates, it suffices to show that it has only the trivial zero in F(t). Hence suppose that $(a_1,\ldots,a_{d^{i+1}})$ is a non-trivial zero of (14). Since the polynomial is homogeneous, we may assume that the $a_i \in F[t]$ and not every $a_i$ is divisible by t. Let $a_k$ be the first one that is not, and suppose that $jd^i + 1 \le k \le (j+1)d^i$. Then dividing by $t^j$ gives the relation $N(a_{jd^i+1},\ldots,a_{(j+1)d^i}) \equiv 0 \pmod{t}$ and hence reducing modulo t gives $N(\bar{a}_{jd^i+1},\ldots,\bar{a}_{(j+1)d^i}) = 0$ where $\bar{a}$ is the constant term of a. Since $\bar{a}_k \ne 0$, this contradicts the hypothesis that N is normic and completes the proof. □

If $N(x_1,\ldots,x_{d^i})$ is a normic polynomial of order i and degree d, then

$$N(N(x_1,\ldots,x_{d^i}),\, N(x_{d^i+1},\ldots,x_{2d^i}),\, \ldots,\, N(x_{(d^i-1)d^i+1},\ldots,x_{d^{2i}})) \tag{15}$$

is homogeneous of degree $d^2$ in $d^{2i}$ indeterminates. Moreover, this has only the trivial zero in F. Hence (15) is normic. Since d > 1, we can apply this process to obtain from a given normic polynomial normic polynomials of arbitrarily high degree. We shall use this remark in the proof of an analogue of Lemma 1 for $C_i$-fields:
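The substitution in (15) is easy to mimic in code. The sketch below assumes, purely for illustration, F the rationals and N(x, y) = x² + y², which is normic of order 1 (degree 2 in 2 indeterminates); the helper name `compose` is hypothetical. It forms the composite polynomial in 4 indeterminates and checks degree-4 homogeneity and the absence of non-trivial zeros on a small integer box.

```python
from itertools import product

def compose(N, m):
    """Given N in m indeterminates, return the polynomial (15) in m*m indeterminates."""
    def N2(*xs):
        if len(xs) != m * m:
            raise ValueError("expected %d arguments" % (m * m))
        # Feed N with its own values on consecutive blocks of m arguments.
        return N(*(N(*xs[j * m:(j + 1) * m]) for j in range(m)))
    return N2

def N(x, y):
    """A normic polynomial of order 1 over the rationals: degree 2, 2 indeterminates."""
    return x * x + y * y

N2 = compose(N, 2)   # (x1^2 + x2^2)^2 + (x3^2 + x4^2)^2

box = list(product(range(-2, 3), repeat=4))
assert [v for v in box if N2(*v) == 0] == [(0, 0, 0, 0)]
assert all(N2(*(2 * t for t in v)) == 2 ** 4 * N2(*v) for v in box)   # degree 4
```

Applying `compose` again to `N2` with m = 4 would give a polynomial of degree 16 in 16 indeterminates, in line with the remark that the degree can be made arbitrarily high.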

LEMMA 3 (Artin).   Let F be a $C_i$-field for which there exists a normic polynomial of order i (i > 0). Let $f_1,\ldots,f_r$ be homogeneous polynomials of degree d in the indeterminates $x_1,\ldots,x_n$ with coefficients in F. If $n > rd^i$, then the f’s have a common non-trivial zero in F.

Proof. Let N be a normic polynomial of order i for F. Suppose the degree of N is e. Then the number of indeterminates in N is $e^i$, which we can write as $e^i = rs + t$ where s ≥ 0 and 0 ≤ t < r. We replace the first r indeterminates in N by $f_1(x_1,\ldots,x_n),\ldots,f_r(x_1,\ldots,x_n)$ respectively, the next r by $f_1(x_{n+1},\ldots,x_{2n}),\ldots,f_r(x_{n+1},\ldots,x_{2n})$, etc., and the last t indeterminates in N by 0’s. This gives the polynomial

$$M = N(f_1(x_1,\ldots,x_n),\ldots,f_r(x_1,\ldots,x_n),\, f_1(x_{n+1},\ldots,x_{2n}),\ldots,f_r(x_{n+1},\ldots,x_{2n}),\,\ldots,\, f_r(x_{(s-1)n+1},\ldots,x_{sn}),\, 0,\ldots,0)$$

which is homogeneous in ns indeterminates. Moreover, deg M = ed. We want to have $ns > (ed)^i = d^i(rs + t)$. This will be the case if $(n - d^ir)s > d^it$. Since $n > d^ir$ and 0 ≤ t < r, this can be arranged by choosing e large enough, which can be done by the remark preceding the lemma. With our choice of e we can conclude from the $C_i$-property that M has a non-trivial zero in F. Since N is normic, it follows easily that the $f_i$ have a common non-trivial zero in F. □

In the next lemma, for the sake of uniformity of statement, we adopt the convention that 1 is a normic polynomial of order 0. Then we have

LEMMA 4.   Let F be a $C_i$-field, i ≥ 0, such that there exists a normic polynomial of order i for F. Then F(t), t indeterminate, is a $C_{i+1}$-field.

Proof. Let $f(x_1,\ldots,x_n)$ be a homogeneous polynomial of degree d with coefficients in F(t) such that $n > d^{i+1}$. We have to show that f has a non-trivial zero in F(t). There is no loss in generality in assuming that the coefficients of f are polynomials in t. Let r be the degree of f in t. Put

$$x_j = x_{j0} + x_{j1}t + \cdots + x_{js}t^s, \quad 1 \le j \le n,$$

where the $x_{jk}$ are indeterminates. Then

$$f(x_1,\ldots,x_n) = f_0 + f_1t + \cdots + f_{ds+r}t^{ds+r}$$

where the $f_i$ are homogeneous polynomials of degree d in the n(s + 1) indeterminates $x_{jk}$ with coefficients in F. The polynomial f will have a non-trivial zero in F(t) if the $f_i$ have a common non-trivial zero in F. By Lemmas 1 and 3 this will be the case if $n(s + 1) > (sd + r + 1)d^i$. Since $n > d^{i+1}$, the inequality can be satisfied by taking s sufficiently large. □
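The final counting step can be checked by direct arithmetic. The helper below, whose name `min_s` is an assumption for this illustration, returns the least s ≥ 0 with n(s + 1) > (sd + r + 1)dⁱ. Such an s exists because the difference of the two sides equals s(n – dⁱ⁺¹) + n – (r + 1)dⁱ, which grows without bound when n > dⁱ⁺¹.

```python
def min_s(n, d, r, i):
    """Least s >= 0 with n*(s + 1) > (s*d + r + 1) * d**i, assuming n > d**(i + 1)."""
    if n <= d ** (i + 1):
        raise ValueError("the argument requires n > d**(i + 1)")
    s = 0
    while n * (s + 1) <= (s * d + r + 1) * d ** i:
        s += 1
    return s

# Quadratic form (d = 2) in n = 5 unknowns over F(t), F a C_1-field, coefficients of t-degree r = 3.
assert min_s(5, 2, 3, 1) == 4     # 5*5 = 25 > (4*2 + 3 + 1)*2 = 24
```

For i = 0 the bound reduces to n(s + 1) > sd + r + 1, which is the count needed to apply Lemma 1.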

The proof of the Tsen-Lang theorem is now clear. First, Lemma 4 with i = 0 shows that if F is algebraically closed, then $F(x_1)$ is a $C_1$-field. Next, iterated application of Lemma 2 shows that there exists a normic polynomial of order i for $F(x_1,\ldots,x_i)$. Then iterated application of Lemma 4 implies that $F(x_1,\ldots,x_n)$ is a $C_n$-field. □

We can apply this result for the case d = 2 to conclude that if F is an algebraically closed field, then any quadratic form on a vector space V of $2^n + 1$ dimensions over $F(x_1,\ldots,x_n)$ represents 0 non-trivially. It follows that any quadratic form on a vector space of $2^n$ dimensions over $F(x_1,\ldots,x_n)$ is universal. We shall make use of this result in the proof of

THEOREM 11.12.   Let R be a real closed field and let Q be a Pfister form on a $2^n$-dimensional vector space over the field $R(x_1,\ldots,x_n)$. Then Q represents every non-zero sum of two squares in $R(x_1,\ldots,x_n)$.

Proof. Let Q be a Pfister form on a $2^n$-dimensional vector space V over $R(x_1,\ldots,x_n)$. We have to show that if $b = b_1^2 + b_2^2 \ne 0$, $b_i \in R(x_1,\ldots,x_n)$, then b is represented by Q. Since Q represents 1, the result is clear if $b_1b_2 = 0$. Hence we assume $b_1b_2 \ne 0$. Let C = R(i), $i^2 = -1$, and consider the extension field $C(x_1,\ldots,x_n)$ of $R(x_1,\ldots,x_n)$ and the vector space $\tilde{V} = C(x_1,\ldots,x_n) \otimes_{R(x_1,\ldots,x_n)} V$. Since (1, i) is a base for C/R, this is also a base for $C(x_1,\ldots,x_n)$ over $R(x_1,\ldots,x_n)$. Moreover, every element of $\tilde{V}$ can be written in one and only one way as u + iv, u, v ∈ V (V identified with a subspace of $\tilde{V}$ in the usual way). The quadratic form Q has a unique extension to a quadratic form $\tilde{Q}$ on $\tilde{V}$. Evidently $\tilde{Q}$ is a Pfister form. Now put $q = b_1 + b_2i$. Then (1, q) is a base for $C(x_1,\ldots,x_n)/R(x_1,\ldots,x_n)$ and $\tilde{V} = V + qV$. Since $\tilde{Q}$ is universal by the result noted before the theorem, there exists a vector $\tilde{u} = u_1 + qu_2$, $u_i \in V$, such that $\tilde{Q}(\tilde{u}) = q$. Then $Q(u_1) + 2qQ(u_1, u_2) + q^2Q(u_2) = q$. Since (1, q) is a base for $C(x_1,\ldots,x_n)/R(x_1,\ldots,x_n)$ and $q^2 - 2b_1q + b = 0$, this implies that $Q(u_1) = bQ(u_2)$. It follows that b is represented by Q. □

If we combine Artin’s theorem (p. 640) with Theorems 11.11 and 11.12, we obtain the main result:

THEOREM 11.13.   Let R be a real closed field. Then any positive semi-definite rational function of n variables over R is a sum of $2^n$ squares.

More generally, the same theorems show that any positive semi-definite rational function of n variables in R can be represented by a Pfister form of dimension $2^n$. It is noteworthy that to prove Theorem 11.13 we had to use Pfister forms other than the sum of squares.

EXERCISES

        1. (J. W. S. Cassels.) Let F be a field of characteristic ≠ 2, x an indeterminate. Let p(x) ∈ F[x] be a sum of n squares in F(x). Show that p(x) is a sum of n squares in F[x].

   (Outline of proof. The result is clear if p = 0. Also if $-1 = \sum_2^n a_j^2$ where the $a_j \in F$, then

$$p = \left(\frac{p+1}{2}\right)^2 + \sum_{j=2}^n \left(a_j\,\frac{p-1}{2}\right)^2.$$

Hence we may assume that p ≠ 0 and that $Q = \sum_1^n x_i^2$ is anisotropic in F and hence in F(x). We have polynomials $f_0(x), f_1(x), \ldots, f_n(x)$ such that $f_0(x) \ne 0$ and

$$p(x)f_0(x)^2 = f_1(x)^2 + \cdots + f_n(x)^2. \tag{16}$$

Let $f_0(x)$ be of minimum degree for such polynomials and assume that deg $f_0(x)$ > 0. Write

$$f_i(x) = f_0(x)g_i(x) + r_i(x), \quad 1 \le i \le n,$$

where deg $r_i(x)$ < deg $f_0(x)$ (we take deg 0 = –∞). Let Q′ be the quadratic form such that Q′ ~ diag {–p, 1, …, 1} with n 1’s. Then (16) gives Q′(f) = 0 where $f = (f_0,\ldots,f_n)$. The result holds if Q′(g) = 0 for $g = (g_0 = 1, g_1,\ldots,g_n)$, so assume that Q′(g) ≠ 0. This implies that f and g are linearly independent over F(x). Then h = Q′(g, g)f – 2Q′(f, g)g ≠ 0 and Q′(h) = 0. If $h_0 = 0$, then Q represents 0 non-trivially in F(x), hence in F contrary to the hypothesis. Hence $h_0 \ne 0$. The $h_i$ are polynomials in x and

$$h_0 = Q'(g, g)f_0 - 2Q'(f, g) = \frac{1}{f_0}\sum_{i=1}^n (f_i - f_0g_i)^2 = \frac{1}{f_0}\sum_{i=1}^n r_i^2.$$

Then deg $h_0$ < deg $f_0$, which is a contradiction.)

        2. (Cassels.) Let F, x be as in exercise 1 and let $x^2 + d$, for d ∈ F, be a sum of n > 1 squares in F(x). Show that either –1 is a sum of n – 1 squares in F or d is a sum of n squares in F.

        3. (Cassels.) Use exercises 1 and 2 to show that if R is real closed, then $x_1^2 + \cdots + x_n^2$ is not a sum of n – 1 squares in $R(x_1,\ldots,x_n)$.

        4. (T. Motzkin.) Let R be real closed, x and y indeterminates, and let

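$$p(x, y) = 1 + x^2(x^2 - 3)y^2 + x^2y^4.$$

Verify that

$$p(x, y) = \frac{(1 - x^2y^2)^2 + x^2(1 - y^2)^2 + x^2(1 - x^2)^2y^2}{1 + x^2},$$

so p(x, y) is a sum of four squares in R(x)[y]. Show that p(x, y) is not a sum of squares in R[x, y].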

        5. Show that any algebraic extension of a Ci-field is a Ci-field.
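For exercise 4, an identity of the kind to be verified can be spot-checked numerically. The sketch below assumes that p is the Motzkin polynomial in the form 1 + x²(x² – 3)y² + x²y⁴ and that the identity expresses (1 + x²)p as a sum of three squares; both formulas are assumptions supplied for this illustration. Agreement on a 13 × 13 grid of rational points suffices for polynomials of these degrees (at most 6 in x and 4 in y).

```python
from fractions import Fraction
from itertools import product

def p(x, y):
    """Assumed form of the Motzkin polynomial."""
    return 1 + x * x * (x * x - 3) * y * y + x * x * y ** 4

def rhs(x, y):
    """Assumed sum-of-squares expression divided by 1 + x^2."""
    num = (1 - x * x * y * y) ** 2 + x * x * (1 - y * y) ** 2 + x * x * (1 - x * x) ** 2 * y * y
    return num / (1 + x * x)

grid = [Fraction(k, 3) for k in range(-6, 7)]
assert all(p(x, y) == rhs(x, y) for x, y in product(grid, repeat=2))
assert p(Fraction(1), Fraction(1)) == 0    # p vanishes at (1, 1)
```

Exact rational arithmetic is used so that equality is tested without rounding error.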

11.7   ARTIN-SCHREIER CHARACTERIZATION OF REAL CLOSED FIELDS

We conclude our discussion of real closed fields by establishing a beautiful characterization of these fields that is due to Artin and Schreier. We have seen that a field R is real closed if and only if $\sqrt{-1} \notin R$ and $C = R(\sqrt{-1})$ is algebraically closed (Theorem 11.3, p. 634). Artin and Schreier prove the following considerably stronger theorem:

THEOREM 11.14.   Let C be an algebraically closed field, R a proper subfield of finite codimension in C ([C: R] < ∞). Then R is real closed and $C = R(\sqrt{-1})$.

We prove first some elementary lemmas on fields of characteristic p ≠ 0. The first two of these were given in BAI, but for convenience we record again the statements and proofs.

LEMMA 1.   Let F be a field of characteristic p, a an element of F that is not a pth power. Then for any e ≥ 1, the polynomial $x^{p^e} - a$ is irreducible in F[x].

Proof. If E is a splitting field of $x^{p^e} - a$, then we have the factorization $x^{p^e} - a = (x - r)^{p^e}$ in E[x]. Hence if g(x) is a monic factor of $x^{p^e} - a$ in F[x], then $g(x) = (x - r)^k$, k = deg g(x). Then $r^k \in F$ and $r^{p^e} = a \in F$. If $p^f = (p^e, k)$ there exist integers m and n such that $p^f = mp^e + nk$. Then $r^{p^f} = (r^{p^e})^m(r^k)^n \in F$. If $k < p^e$, then f < e and if $b = r^{p^f}$, then $b^{p^{e-f}} = a$, contrary to the hypothesis that a is not a pth power in F. □

LEMMA 2.   If F is a field of characteristic p and a ∈ F is not of the form $u^p - u$, u ∈ F, then $x^p - x - a$ is irreducible in F[x].

Proof.   If r is a root of $x^p - x - a$ in E, E a splitting field, then r + 1, r + 2, … , r + (p – 1) are also roots of $x^p - x - a$. Hence we have the factorization

$$x^p - x - a = \prod_{i=0}^{p-1} (x - r - i)$$

in E[x]. If $g(x) = x^k - bx^{k-1} + \cdots$ is a factor of $x^p - x - a$ in F[x], then $kr + l\cdot 1 = b$ where l is an integer. Hence k < p implies that r ∈ F. Since $r^p - r = a$, this contradicts the hypothesis. □

LEMMA 3.   Let F and a be as in Lemma 2 and let E be a splitting field for $x^p - x - a$. Then there exists an extension field K/E such that [K:E] = p. (Compare Theorem 8.32, p. 510.)

Proof. We have E = F(r) where $r^p = r + a$. We claim that the element $ar^{p-1} \in E$ is not of the form $u^p - u$, u ∈ E. For, we can write any u as $u_0 + u_1r + \cdots + u_{p-1}r^{p-1}$, $u_i \in F$, and the condition $u^p - u = ar^{p-1}$ and the relation $r^p = r + a$ give

$$u_0^p + u_1^p(r + a) + u_2^p(r + a)^2 + \cdots + u_{p-1}^p(r + a)^{p-1} - u_0 - u_1r - \cdots - u_{p-1}r^{p-1} = ar^{p-1}.$$

Since $(1, r, \ldots, r^{p-1})$ is a base, this gives the relation $u_{p-1}^p - u_{p-1} = a$, contrary to the hypothesis on a. It now follows from Lemma 2 that $x^p - x - ar^{p-1}$ is irreducible in E[x]. Hence if K is the splitting field over E of this polynomial, then [K:E] = p. □

We can now give the

Proof of Theorem 11.14. Let $C' = R(\sqrt{-1}) \subset C$. We shall show that C′ = C. Then the result will follow from Theorem 11.3. Now C is the algebraic closure of C′; hence any algebraic extension of C′ is isomorphic to a subfield of C/C′ and so its dimensionality is bounded by [C:C′]. The first conclusion we can draw from this is that C′ is perfect. Otherwise, the characteristic is a prime p and C′ contains an element that is not a pth power in C′. Then by Lemma 1, there exists an algebraic extension of C′ that is $p^e$-dimensional for any e ≥ 1. Since this has been ruled out, we see that C′ is perfect. Then C is separable algebraic over C′ and since C is algebraically closed, C is finite dimensional Galois over C′. Let G = Gal C/C′, so |G| = [C: C′].

Now suppose that C ≠ C′. Then |G| ≠ 1. Let p be a prime divisor of |G|. Then G contains a cyclic subgroup H of order p. If E is the subfield of H-fixed elements, then C is p-dimensional cyclic over E. If p were the characteristic, then C = E(r) where the minimum polynomial of r over E has the form $x^p - x - a$ (BAI, p. 308). Then by Lemma 3, there exists a p-dimensional extension field of C. This contradicts the fact that C is algebraically closed. Hence the characteristic is not p, and since C is algebraically closed, C contains p distinct pth roots of unity. These are roots of $x^p - 1 = (x - 1)(x^{p-1} + x^{p-2} + \cdots + 1)$ and since the irreducible polynomials in E[x] have degree dividing [C: E] = p, it follows that the irreducible factors of $x^p - 1$ in E[x] are linear and hence the p pth roots of unity are contained in E. Then C = E(r) where the minimum polynomial of r over E is $x^p - a$, a ∈ E (BAI, p. 308). Now consider the polynomial $x^{p^2} - a$. This factors as $\prod_{i=1}^{p^2} (x - u^is)$ where u is a primitive $p^2$-root of unity and $s^{p^2} = a$. If any $u^is \in E$, then $(u^is)^p \in E$ and $((u^is)^p)^p = a$, contrary to the irreducibility of $x^p - a$ in E[x]. It follows that the irreducible factors of $x^{p^2} - a$ in E[x] are of degree p. If b is the constant term of one of these, then $b = s^pv$ where v is a power of u. Since $(s^p)^p = a$, $s^p \notin E$ and since [C: E] = p, $C = E(s^p) = E(bs^{-p}) = E(v)$. Since E contains all the pth roots of unity, it follows that v is a primitive $p^2$-root of unity.

Let P be the prime field of C and consider the subfield P(v) of C. If $P = \mathbb{Q}$, we know that the cyclotomic field of $p^r$th roots of unity has dimensionality $\varphi(p^r)$ over $\mathbb{Q}$ (BAI, p. 272). This number goes to infinity with r. If P is of finite characteristic l, then we have seen that l ≠ p. Then the field of $p^r$th roots of unity over P contains at least $p^r$ elements, so again its dimensionality over P tends to infinity with r. Thus in both cases it follows that there exists an r such that P(v) contains a primitive $p^r$th root of unity but no primitive $p^{r+1}$st root of unity. Since v is a primitive $p^2$-root of unity, r ≥ 2. The field C contains an element w that is a primitive $p^{r+1}$st root of unity. We now consider the cyclotomic field P(w). Let K = Gal P(w)/P. If P is finite, then we know that K is cyclic (BAI, p. 291). The same thing holds if $P = \mathbb{Q}$ unless p = 2 and r ≥ 2. If K is cyclic, then it has only one subgroup of order p and hence, by the Galois correspondence, P(w) contains only one subfield over which it is p-dimensional. We shall now show that P(w) has two such subfields. This will imply that p = 2 and the characteristic is 0.

Let h(x) be the minimum polynomial of w over E. Since $v \notin E$, $w \notin E$ and C = E(w). Hence deg h(x) = p. Moreover, h(x) is a divisor of $x^{p^{r+1}} - 1 = \prod_{i=1}^{p^{r+1}} (x - w^i)$, so the coefficients of h(x) are contained in the subfield $D = E \cap P(w)$. It follows that [P(w):D] = p. Next consider the subfield D′ = P(z) where $z = w^p$. The element w is a root of $x^p - z \in D'[x]$ and this polynomial is either irreducible or it has a root in D′ (BAI, exercise 1 on p. 248). In the first case [P(w):D′] = p. In the second case, since z is a primitive $p^r$th root of unity, D′ contains p distinct pth roots of unity and hence $x^p - z$ is a product of linear factors in D′[x]. Then w ∈ D′ and D′ = P(w). But P(v) contains a primitive $p^r$th root of unity and hence P(v) contains z, so if D′ = P(w), then P(v) will contain a primitive $p^{r+1}$st root of unity w, contrary to the choice of r. Thus [P(w): D′] = p. Now D′ ≠ D. Otherwise, D contains a primitive $p^r$th root of unity and E contains a primitive $p^r$th root of unity. Then E contains v, contrary to the fact that [C:E] = [E(v):E] = p. Thus D and D′ are distinct subfields of P(w) of codimension p in P(w). As we saw, this implies that the characteristic is 0 and p = 2. Now C = E(v) and v is a primitive $2^2$ = 4th root of unity. Hence $v = \pm\sqrt{-1}$. Since E contains $\sqrt{-1}$, we contradict [C: E] = 2. This completes the proof. □

REFERENCES

E. Artin and O. Schreier, Algebraische Konstruktion reeller Körper, Abhandlungen Mathematische Seminar Hamburgischen Universität, vol. 5 (1927), 85–99.

E. Artin, Über die Zerlegung definiter Funktionen in Quadrate, ibid., 100–115.

T. Y. Lam, The Algebraic Theory of Quadratic Forms, W. A. Benjamin, Reading, Mass., 1973.