Curiosity demands that we persevere with our example of geometric matrix products to see what happens when we compose the mappings involved. Specifically, let R now denote the transformation that rotates each point about the origin through one quarter turn (90°) anticlockwise, and let S denote reflection in the x-axis. Using our matrix representation, we then have
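$$
R=\begin{pmatrix}0&-1\\[2pt] 1&0\end{pmatrix},\qquad
S=\begin{pmatrix}1&0\\[2pt] 0&-1\end{pmatrix},
$$
these being the standard matrices when points are written as column vectors: R sends the point (x, y) to (−y, x), and S sends (x, y) to (x, −y).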
We now seek the transformations that can be generated by R and S, meaning the mappings that we can produce by composing R and S with each other, in any order and any number of times.
First it is clear that, since S is a reflection, if we act with S and act with S again (we write this as S2), the net effect is to return every point to its original position. In terms of matrices, we have
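$$
S^{2}=\begin{pmatrix}1&0\\[2pt] 0&-1\end{pmatrix}\begin{pmatrix}1&0\\[2pt] 0&-1\end{pmatrix}
=\begin{pmatrix}1&0\\[2pt] 0&1\end{pmatrix},
$$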
and this latter matrix is called the identity matrix, denoted by I. The effect of multiplying I by any column vector is to return the same column vector, and so we say that I represents the identity transformation that leaves every point fixed. More generally, we speak of the n × n matrix I = In, which is the diagonal matrix with a principal diagonal that consists entirely of 1’s. For any matrix A of a size that allows matrix multiplication, we always have AI = A = IA, so that I behaves with respect to matrix multiplication like the multiplicative identity 1 does with respect to ordinary multiplication.
In symbols, we have that S2 = I and we say that S is its own inverse matrix. In general, for a square matrix A, we say that another square matrix B is the inverse of A if AB = BA = I, and we write B = A−1, for, if it exists, the inverse of A is certainly unique. To see this, suppose that we had two inverses, B and C, of A, so that BA = CA = I (and likewise AB = AC = I). Multiplying the equation CA = I on the right by B, we then get
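$$
(CA)B = IB \;\Longrightarrow\; C(AB) = B \;\Longrightarrow\; CI = B \;\Longrightarrow\; C = B,
$$
where we have used the associativity of matrix multiplication and the fact that AB = I.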
It follows from the symmetry of the definition that A is equally the inverse of B, and so A = B−1 also.
Not every square matrix has an inverse, however: for example, a square zero matrix Z, meaning a matrix whose entries are all 0, cannot have an inverse, since for any matrix A of the same size we have ZA = AZ = Z; here Z is the matrix analogue of the number 0. There are also other singular matrices, that is, matrices without inverses, and we shall have more to say about this.
We can see directly, however, that our matrix R does have an inverse. We need only ask ourselves how we can undo the effect of acting with R. Since R represents an anticlockwise quarter turn, we just need the matrix that turns the plane through a quarter turn in the clockwise direction. We can also achieve this by turning through three quarters of a full turn in the anticlockwise direction. The reason for drawing attention to this alternative is that this latter transformation comes from acting with R three times or, in terms of our new notation, R−1 = R3:
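$$
R^{3}=\begin{pmatrix}0&1\\[2pt] -1&0\end{pmatrix},\qquad
R\,R^{3}=\begin{pmatrix}0&-1\\[2pt] 1&0\end{pmatrix}\begin{pmatrix}0&1\\[2pt] -1&0\end{pmatrix}
=\begin{pmatrix}1&0\\[2pt] 0&1\end{pmatrix}=I,
$$
and indeed R3 sends the point (x, y) to (y, −x), which is the clockwise quarter turn we were after.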
We have thus discovered that while there are just two distinct powers of S, there are four for our rotation R, giving us a set of transformations (which, it will transpire, forms a group), namely H = {I, R, R2, R3}.
Recall from Chapter 6 that a group is a set, together with an associative binary operation, which possesses an identity element with respect to which each member of the group has an inverse. In the case of H, the operation is composition of transformations or, equivalently, multiplication of the corresponding matrices. Matrix multiplication is indeed associative, the identity element I is represented by the identity matrix, and, crucially, each member A of the set possesses an inverse, A−1, as we shall now check.
In the case of H, I is self-inverse (as it always is), as is R2, for we have R2 ⋅ R2 = R4 = I, and we have already noted that R ⋅ R3 = I, showing that R and R3 are mutual inverses. Therefore H is an example of a group. However, H is simply a 4-cycle, meaning that H consists just of the four powers of R, with R5 = R, R6 = R2, and so on, and therefore H is a copy of ℤ4, the group of integers modulo 4. Nonetheless, H is part of (we say H is a subgroup of) a larger eight-element group, D. The other four members of D can be generated by taking the product of each member of H with S, which gives {S, RS, R2S, R3S}. By calculating the matrices representing each of these four mappings, or by considering their geometric actions, we can see that each of these products represents a reflection in a line through the origin. Specifically, S, RS, R2S, and R3S correspond to reflections in the lines y = 0, y = x, x = 0, and y = −x respectively, these four reflections being self-inverse mappings. (Rotating each of these lines by 45° anticlockwise takes you from one to the next.)
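For the record, these four matrices, computed from the matrices for R and S given earlier, are
$$
S=\begin{pmatrix}1&0\\[2pt] 0&-1\end{pmatrix},\quad
RS=\begin{pmatrix}0&1\\[2pt] 1&0\end{pmatrix},\quad
R^{2}S=\begin{pmatrix}-1&0\\[2pt] 0&1\end{pmatrix},\quad
R^{3}S=\begin{pmatrix}0&-1\\[2pt] -1&0\end{pmatrix},
$$
and each fixes the corresponding mirror line pointwise.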
Table 1. Multiplication of the group of symmetries generated by R and S.
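The body of the table, computed from the matrices above, is:
$$
\begin{array}{c|cccccccc}
 & I & R & R^{2} & R^{3} & S & RS & R^{2}S & R^{3}S\\ \hline
I & I & R & R^{2} & R^{3} & S & RS & R^{2}S & R^{3}S\\
R & R & R^{2} & R^{3} & I & RS & R^{2}S & R^{3}S & S\\
R^{2} & R^{2} & R^{3} & I & R & R^{2}S & R^{3}S & S & RS\\
R^{3} & R^{3} & I & R & R^{2} & R^{3}S & S & RS & R^{2}S\\
S & S & R^{3}S & R^{2}S & RS & I & R^{3} & R^{2} & R\\
RS & RS & S & R^{3}S & R^{2}S & R & I & R^{3} & R^{2}\\
R^{2}S & R^{2}S & RS & S & R^{3}S & R^{2} & R & I & R^{3}\\
R^{3}S & R^{3}S & R^{2}S & RS & S & R^{3} & R^{2} & R & I
\end{array}
$$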
However, no more than these eight mappings may be obtained, so it follows that D forms a group of transformations, with each of the four reflections being self-inverse. The table of the group operation is displayed in Table 1, where the entry at position (i, j) in the body of the table is the product αβ, with α the matrix labelling row i and β the matrix labelling column j.
Any product of any length is equal to one of these eight mappings. Indeed any product, however long, can be simplified to one of the eight given forms by using just the rules R4 = S2 = I and SR = R3S. Group theorists would say that the group D is presented by the generators R and S, subject to these three relations. For example, making free use of these equations, we can collapse a product such as R2S ⋅ R3S (a product chosen here simply by way of illustration):
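$$
R^{2}S\cdot R^{3}S
= R^{2}(SR)R^{2}S
= R^{5}SR^{2}S
= R^{5}(SR)RS
= R^{8}SRS
= R^{8}(SR)S
= R^{11}S^{2}
= R^{3},
$$
replacing SR by R3S at each bracketed step and finishing with R4 = S2 = I.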
A key observation about D, known as the dihedral group, is that its operation is not commutative: SR = R3S ≠ RS. A more general observation that applies to every group is that its table of products displays the Latin square property, meaning that every row and every column features each member of the group exactly once. These facts help us complete the table.
The cyclic group H has been seen before in another guise, for the powers of the imaginary unit i are four in number, i.e. G = {1, i, −1, −i}, and correspond to the list H = {I, R, R2, R3 = R−1} in that order: we say that the two groups G and H are isomorphic and that the correspondence I ↦ 1, R ↦ i, R2 ↦ −1, R−1 ↦ −i is an isomorphism between them, meaning that their tables of products are identical under this renaming of elements.
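To spell this out, write φ for the mapping that sends Rk to ik (k = 0, 1, 2, 3); then, since both groups simply add exponents,
$$
\varphi(R^{a}R^{b}) = \varphi(R^{a+b}) = i^{\,a+b} = i^{a}\,i^{b} = \varphi(R^{a})\,\varphi(R^{b}),
$$
so products in H are matched exactly by products in G, which is precisely what an isomorphism requires.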
The purpose of the word ‘isomorphism’ is to convey the fact that although the two groups are different, the first being a collection of complex numbers under multiplication while the second is a group of rotations under the operation of composition, the two groups are essentially the same as regards their algebraic structure. This is not a mere coincidence, as multiplication by the imaginary unit i acts to rotate each number in the complex plane through a quarter of a turn about the origin. To see this we need only note that for any complex number z = x + iy, we have iz = xi + i2y = −y + ix: in terms of points in the Argand plane, (x, y) ↦ (−y, x), and the latter is the point we arrive at by turning (x, y) through 90° anticlockwise about the origin. This link lets us identify multiplication by i with a turn of one right angle in the plane, thereby reaffirming the fundamental nature of the imaginary unit.
If a collection of matrices represent the elements of a group, such as the eight matrices that represent the dihedral group D, then each of these matrices A will have an inverse A−1, such that AA−1 = A−1A = I, the identity matrix. This prompts the twin questions of when the inverse of a square matrix A exists and, if it does, how to find it. First we make a few simple observations about inverses in general.
A very useful rule is that the inverse of a product of two square matrices is the product of the inverses taken in reverse order, which is to say that (AB)−1 = B−1A−1, as is verified at once from the definition:
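$$
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I,
\qquad
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}B = I. \tag{33}
$$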
It is worth noting that the manipulations represented by (33) hold in any group: the inverse of a product is the product of the inverses with the order reversed. It is important to respect this reversal of order, as for matrix multiplication, and for group products in general, order matters.
There is an important but simple operation akin to taking the inverse that applies to any matrix A, which is that of taking its transpose, AT: the rows of AT written from left to right are the columns of A written from top to bottom. (It is sometimes convenient to write a column vector using the transpose symbol, for example (2, −1, 4)T.) As an example,
$$
A=\begin{pmatrix}1&2&3\\[2pt] 4&5&6\end{pmatrix}
\;\Longleftrightarrow\;
A^{T}=\begin{pmatrix}1&4\\[2pt] 2&5\\[2pt] 3&6\end{pmatrix}.
\tag{34}
$$
We are entitled to use the double arrow ‘⇔’ in (34) as, just like for a square matrix A it is the case that (A−1)−1 = A, it is true quite generally that (AT)T = A. What is a little less obvious is that the taking of the transpose also obeys the rule (AB)T = BT AT. As we have mentioned previously in connection with incidence matrices of networks, a matrix that equals its own transpose is called symmetric: a symmetric matrix is necessarily square and is one that satisfies aij = aji throughout.
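Returning for a moment to the rule (AB)T = BT AT, a quick check in terms of entries (using the row-by-column rule for the (i, j) entry of a product) runs
$$
\big((AB)^{T}\big)_{ij} = (AB)_{ji} = \sum_{k} a_{jk}b_{ki} = \sum_{k} (B^{T})_{ik}(A^{T})_{kj} = (B^{T}A^{T})_{ij}.
$$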
Suppose now that we have n equations in n unknowns, x1, x2, … , xn. We may represent this system by a single matrix equation, Ax = b, where x represents the column of unknowns, b represents the RHS of the equation set, and A is the n × n coefficient matrix. If we already possessed the inverse, A−1, we could solve this system immediately by multiplying both sides of the equation on the left by A−1, for that gives A−1 Ax = A−1b ⇒ x = A−1b. On the other hand, we have the elimination technique to solve such a system, which suggests that we might seek to use that to find how to invert our matrix A.
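As a small illustration, with a 2 × 2 system chosen so that the inverse is easy to write down, the pair of equations x1 + 2x2 = 5 and x1 + 3x2 = 7 has
$$
A=\begin{pmatrix}1&2\\[2pt] 1&3\end{pmatrix},\qquad
A^{-1}=\begin{pmatrix}3&-2\\[2pt] -1&1\end{pmatrix},\qquad
x = A^{-1}b=\begin{pmatrix}3&-2\\[2pt] -1&1\end{pmatrix}\begin{pmatrix}5\\[2pt] 7\end{pmatrix}
=\begin{pmatrix}1\\[2pt] 2\end{pmatrix},
$$
so that x1 = 1 and x2 = 2, as substitution confirms.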
The operations deployed in the elimination method used to solve a set of simultaneous linear equations affect the coefficient matrix in one of three ways: a row may be multiplied by a non-zero number, two rows may be interchanged, or a multiple of one row may be added to another row.
Each of these row operations, as they are called, can be effected by multiplying on the left by a suitable matrix. The following examples illustrate this:
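$$
B=\begin{pmatrix}1&0\\[2pt] 0&a\end{pmatrix},\qquad
C=\begin{pmatrix}0&1\\[2pt] 1&0\end{pmatrix},\qquad
D=\begin{pmatrix}1&a\\[2pt] 0&1\end{pmatrix}.
$$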
For any 2 × 2 matrix A, the product BA is A with its second row multiplied by a, CA is the same as A but with the rows interchanged, and multiplying A on the left by D adds a times the second row of A to the first row of A. Specifically, for this last example,
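writing the entries of a general 2 × 2 matrix A as p, q, r, s, we have
$$
DA=\begin{pmatrix}1&a\\[2pt] 0&1\end{pmatrix}\begin{pmatrix}p&q\\[2pt] r&s\end{pmatrix}
=\begin{pmatrix}p+ar&q+as\\[2pt] r&s\end{pmatrix},
$$
whose first row is indeed the first row of A plus a times its second row.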
In general, we get the row matrix that effects a particular row operation by applying that row operation to the identity matrix I. Although there is no need to introduce these elementary row matrices in any practical calculation, their existence has an important consequence. Suppose that we reduce A to I by a sequence of row operations with row matrices M1, M2, …, Mm−1, Mm, say, so that we have (MmMm−1 … M2M1)A = I. It follows that the product of the matrices preceding A must equal A−1. We can, however, find A−1 from this analysis without writing down the matrices Mi explicitly. The scheme is often expressed in the following fashion:
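$$
\big[\,A \mid I\,\big] \;\longrightarrow\; \big[\,I \mid A^{-1}\,\big],
$$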
where the arrow represents the sequence of row operations applied to A to reduce it to I. The n × 2n matrix on the left is known as the augmented matrix of A. Since the row operations used are executed on both the left-hand portion, A, and the right-hand portion, I, of the augmented matrix, the net effect on I is to multiply it too by A−1, so that the right half of the augmented matrix is indeed transformed into A−1I = A−1, as the scheme suggests.
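As a small first illustration, using the 2 × 2 matrix from the little system solved above, the scheme runs
$$
\left[\begin{array}{cc|cc} 1&2&1&0\\ 1&3&0&1 \end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{cc|cc} 1&2&1&0\\ 0&1&-1&1 \end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{cc|cc} 1&0&3&-2\\ 0&1&-1&1 \end{array}\right],
$$
the two steps being R2 ↦ R2 − R1 (to clear below the first pivot) and then R1 ↦ R1 − 2R2 (to clear above the second), and the right-hand half duly delivers the inverse found earlier.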
To illustrate this method, we find the inverse of the 3 × 3 matrix forming the left-hand portion of the augmented matrix below, so that [A|I] in this case is given by
We begin by pivoting on the top left-hand corner, meaning that we clear all other entries in the first column by means of the row operations R2 ↦ R2 − 3R1 (meaning that row 2 is replaced by row 2 minus 3 times row 1) and R3 ↦ R3 − 3R1, to give
where we pass to the second matrix by pivoting on the second diagonal entry via the operations R1 ↦ R1 − R2 and R3 ↦ R3 − 3R2. Note that this does not alter any entry in the first column, since after the first step the only non-zero entry in that column is the pivot itself, and subtracting multiples of row 2, whose first entry is zero, cannot disturb it; in this way the process moves forward and we do not resurrect non-zero entries where they are unwanted. Finally, we clear above the third diagonal pivot to reduce the left-hand portion of the matrix to I3 using the operations R1 ↦ R1 + R3 and R2 ↦ R2 − 2R3, to obtain
It may be readily checked that AA−1 = I, and readers are now in a position to practise some examples of their own, but be warned: the previous example is especially nice—it is rare for a matrix and its inverse to consist entirely of integers. During your calculation, you may sometimes avoid fractions by pivoting on diagonal entries that are not 1, but, of course, if the answer has fractions in it they are bound to make their appearance sooner or later. An integer matrix (one with all integer entries) has an integer matrix inverse if and only if its determinant is ±1: we shall justify this claim in Chapter 9. However, necessary and sufficient conditions for a square matrix A to possess an inverse may be given just in terms of the notion of rank, which we introduced in Chapter 7, and we can now outline how this comes about.
A square matrix A is invertible if and only if A is of full rank. If A is not of full rank, we can use the dependency among the rows to reduce A by row operations to a matrix C with a row of zeros, and such a matrix must be singular, as that row of zeros will persist in any product of the form CB and so cannot give the identity matrix. Since each row operation is invertible, we have C = MA for an invertible matrix M, and if A had an inverse then C would have the inverse A−1M−1, which is impossible; so A too has no inverse, and only matrices of full rank can be invertible.
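For instance, the matrix with rows (1, 2) and (2, 4) has rank 1, and the single operation R2 ↦ R2 − 2R1 produces a row of zeros:
$$
\begin{pmatrix}1&2\\[2pt] 2&4\end{pmatrix}
\;\longrightarrow\;
\begin{pmatrix}1&2\\[2pt] 0&0\end{pmatrix},
$$
so this matrix, like any square matrix that falls short of full rank, has no inverse.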
Conversely, any square matrix A of full rank can be inverted. This is because in that case the reduction process that takes [A|I] to [I|A−1] never gets stuck: the fact that A is of full rank ensures that whenever a zero arises in a pivot position, a row interchange can bring a non-zero entry into that position, and the process continues, eventually yielding the inverse of A in the reduced augmented matrix [I|A−1]. Therefore, provided that A is of full rank, its inverse exists and may be found in this way. In the next chapter we shall see that another characterization of invertibility is possible in terms of a single number associated with the matrix, its determinant.