are matrices. In this book, we always write matrices with square brackets “\(\,[\,\cdots\,]\,\)”. We have already used matrices in two ways: to keep track of the coefficients when solving a system of linear equations and to describe vectors (as a column of numbers). In both of these cases, matrices were a notational tool used to keep related numbers together. But, as we will soon see, matrices are also mathematical objects that you can do arithmetic with.
Section C.1 Matrix Notation
A matrix can be described by its shape 1
Other terms for the shape of a matrix include the “size of a matrix” and the “dimensions of a matrix”. But, be careful not to confuse this usage of “dimension” with the term “dimension” in the context of subspaces.
(the number of rows and columns in the matrix) and its entries (the numbers inside the matrix). Traditionally, matrices are labeled with capital letters and their entries are labeled with lower-case letters.
Consider the matrix \(A\) with \(m\) rows and \(n\) columns:
Figure C.1.1.
We call \(A\) an “\(m\) by \(n\) matrix”, or in notation, an “\(m\times n\) matrix” 2
In this context, “\(\,\times\,\)” is read as “by”
. The entries of \(A\) are indexed by their \((\text{\emph{row}},\text{\emph{column}})\) coordinates. So, the \((1,1)\) entry of \(A\) is \(a_{11}\text{,}\) the \((2,1)\) entry of \(A\) is \(a_{21}\text{,}\) and so on. When subscripting a matrix entry, it is traditional to omit a separator between the row and column indices. That is, we write \(a_{ij}\) instead of \(a_{i,j}\) or \(a_{(i,j)}\) 3
There’s nothing wrong with including a separator. It’s just not common practice.
.
Example C.1.2.
Let \(A=\mat{1&2&3\\4&5&6}\text{.}\) Find the shape of \(A\) as well as the \((1,3)\) entry of \(A\text{.}\)
Solution.
\(A\) has two rows and three columns, so \(A\) is a \(2\times 3\) matrix. The \((1,3)\) entry of \(A\) is the number in the first row and third column of \(A\text{,}\) which is \(3\text{.}\)
Since a matrix is completely determined by its shape and entries, we can define a matrix via a formula. For example, define \(B\) to be the \(2\times 3\) matrix whose \((i,j)\) entry, \(b_{ij}\text{,}\) satisfies the formula \(b_{ij}=i+j\text{.}\) In this case
The shorthand \(B=[b_{ij}]\) means that “\(B\) is a matrix whose \((i,j)\) entry is \(b_{ij}\)”. Using this shorthand, we could alternatively say \(B=[b_{ij}]\) is a \(2\times 3\) matrix satisfying \(b_{ij}=i+j\text{.}\)
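The formula-based definition can be tried out with NumPy (an illustration, not part of the text), using the book's 1-based row and column indices:

```python
import numpy as np

# Build the 2x3 matrix B with b_ij = i + j, using 1-based indices
# i = 1, 2 and j = 1, 2, 3 as in the text (NumPy itself indexes from 0).
B = np.array([[i + j for j in range(1, 4)] for i in range(1, 3)])

print(B.shape)  # (2, 3)
print(B)
# [[2 3 4]
#  [3 4 5]]
```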
Example C.1.3.
Let \(C=[c_{ij}]\) be a \(3\times 3\) matrix satisfying \(c_{ij}=i-j\text{.}\) Write down \(C\text{.}\)
A matrix has three special parts: the diagonal, the upper triangle, and the lower triangle.
Figure C.2.1.
Formally, we define the diagonal and upper/lower triangle of a matrix in terms of the row and column coordinates.
Definition C.2.2. Diagonal.
The diagonal of an \(m\times n\) matrix \(A=[a_{ij}]\) consists of the entries \(a_{ij}\) satisfying \(i=j\text{.}\)
Definition C.2.3. Upper & Lower Triangle.
Let \(A=[a_{ij}]\) be an \(m\times n\) matrix. The upper triangle of \(A\) consists of the entries \(a_{ij}\) satisfying \(j\geq i\text{.}\) The lower triangle of \(A\) consists of the entries \(a_{ij}\) satisfying \(j\leq i\text{.}\)
Section C.3 Special Matrices
There are several special matrices that come up often.
Definition C.3.1. Triangular Matrices.
A matrix is called upper triangular if all of its non-zero entries lie in the upper triangle of the matrix, and lower triangular if all of its non-zero entries lie in the lower triangle. A matrix is called triangular if it is either upper or lower triangular.
Definition C.3.2. Square Matrix.
A matrix is called square if it has the same number of rows as columns.
Definition C.3.3. Diagonal Matrix.
A square matrix is called diagonal if the only non-zero entries in the matrix appear on the diagonal.
Definition C.3.4. Symmetric Matrix.
The square matrix \(A=[a_{ij}]\) is called symmetric if its entries satisfy \(a_{ij}=a_{ji}\text{.}\)
Alternatively, if the entries of \(A\) satisfy \(a_{ij}=-a_{ji}\text{,}\) then \(A\) is called skew-symmetric or anti-symmetric.
Definition C.3.5. Zero Matrix.
A matrix is called a zero matrix if all its entries are zero.
Definition C.3.6. Identity Matrix.
An identity matrix is a square matrix with ones on the diagonal and zeros everywhere else. The \(n\times n\) identity matrix is denoted \(I_{n\times n}\text{,}\) or just \(I\) when its size is implied.
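The special matrices defined above have direct NumPy constructors (a sketch, not part of the text); note that a matrix is symmetric exactly when it equals its own transpose.

```python
import numpy as np

I = np.eye(3)              # identity: ones on the diagonal, zeros elsewhere
Z = np.zeros((2, 3))       # zero matrix: any shape is allowed
D = np.diag([1, 2, 3])     # diagonal matrix built from its diagonal entries
S = np.array([[0, 1],
              [1, 2]])     # symmetric: s_ij = s_ji

print(np.array_equal(S, S.T))  # True
```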
Example C.3.7.
Identify the diagonal of \(A=\mat{-2&3\\5&6\\7&7}\text{.}\)
Solution.
The diagonal of \(A\) consists of the entries in \(A\) whose row coordinate is equal to the column coordinate. So, the diagonal of \(A\) consists of \(-2\) and \(6\text{.}\)
Example C.3.8.
Apply a single row operation to \(B=\mat{-2&3\\0&6\\0&12}\) to make it an upper triangular matrix.
Solution.
To make \(B\) an upper triangular matrix, we need all entries below the diagonal to be zero. More specifically, we need to get rid of the \(12\) in the lower-right corner of \(B\text{.}\) By applying the row operation \(\text{row}_{3}\mapsto \text{row}_{3}-2\,\text{row}_{2}\) to \(B\text{,}\) we get the upper triangular matrix
If possible, produce a matrix that is both upper and lower triangular.
Solution.
For a matrix to be upper triangular, all entries below the diagonal must be zero. For a matrix to be lower triangular, all entries above the diagonal need to be zero. Therefore, if a matrix is both upper and lower triangular, the only non-zero entries of the matrix must be on the diagonal. It follows that
In Module 1, we saw that vectors were an extension of numbers that allowed us to describe directions. Similarly, we can view matrices as a more general type of “number”. Matrices can do everything numbers can, everything vectors can, and more!
Subsection C.4.1 Basic Operations
The rules for addition and scalar multiplication of matrices are what you expect: to add two matrices, add the corresponding entries, and to scalar multiply a matrix, distribute the scalar to each entry.
While any matrix can be scalar multiplied by any scalar, matrix addition only makes sense for compatible matrices. That is, you can only add together two matrices of the same shape.
Takeaway C.4.1.
If two matrices are of the same shape, you can add them by adding entries “straight across”; you can multiply a matrix by a scalar by distributing the scalar to each entry of the matrix.
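This takeaway can be checked with NumPy (an illustration, not from the text), where `+` adds matrices entry by entry and scalar `*` distributes the scalar to each entry; adding matrices of different shapes raises an error.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])
C = np.array([[-1, 0, -1],
              [2, 1, 2]])

# Both matrices are 2x3, so addition and scalar multiplication are defined.
print(A + 3 * C)
# [[-2  2  0]
#  [10  8 12]]
```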
Example C.4.2.
Let \(A=\mat{1&2&3\\4&5&6}\text{,}\) let \(B=\mat{1&1\\2&2}\text{,}\) and let \(C=\mat{-1&0&-1\\2&1&2}\text{.}\) Compute \(2A+B\) and \(A+3C\text{,}\) if possible.
Solution.
First, note that \(A\) is a \(2\times 3\) matrix and so it can only be added to another \(2\times 3\) matrix. Since \(B\) is a \(2\times 2\) matrix, \(2A+B\) is not defined. But, \(C\) is a \(2\times 3\) matrix, so \(A+3C\) is defined. Computing,
Matrices and vectors interact via matrix-vector multiplication. There are two equivalent ways to think about matrix-vector multiplication: in terms of columns (the column picture) and in terms of rows (the row picture).
be the vectors corresponding to the columns of \(A\text{.}\) Further, let \(\vec x=\matc{x_1\\x_2\\\vdots\\x_n}\) be a vector.
We define the matrix-vector product \(A\vec x\) to be the linear combination of \(\vec c_{1}\text{,}\) \(\vec c_{2}\text{,}\) …, \(\vec c_{n}\) with coefficients \(x_{1}\text{,}\) \(x_{2}\text{,}\) …, \(x_{n}\text{.}\) That is,
The column picture of matrix-vector multiplication hints that matrix-vector multiplication can be used to encode sophisticated problems involving linear combinations (see Module 7 for details).
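The column picture can be verified numerically (a sketch, not from the text): `A @ x` equals the linear combination of the columns of \(A\) with the coordinates of \(\vec x\) as coefficients.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4],
              [5, 6]])
x = np.array([10, -1])

# Column picture: A @ x = x1 * (column 1) + x2 * (column 2).
combo = x[0] * A[:, 0] + x[1] * A[:, 1]
print(A @ x)                          # [ 8 26 44]
print(np.array_equal(A @ x, combo))   # True
```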
be vectors corresponding to the rows of \(A\text{.}\) Note that we are writing the row vectors of \(A\) in column vector form. Further, let \(\vec x=\matc{x_1\\x_2\\\vdots\\x_n}\) be a vector.
We alternatively define the matrix-vector product \(A\vec x\) as the vector whose coordinates are the dot products of the rows of \(A\) and the vector \(\vec x\text{.}\) That is,
Let \(B=\mat{1&2\\-2&3}\) and let \(\vec v=\mat{4\\3}\text{.}\) Compute \(B\vec v\) using the row picture. Verify that the result matches with what you get from the column picture.
Solution.
The row vectors of \(B\) are \(\mat{1\\2}\) and \(\mat{-2\\3}\text{,}\) so
Figure C.5.3.
This is the same vector we got in the previous example using the column picture!
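We can confirm this computation with NumPy (an illustration, not part of the text): each coordinate of \(B\vec v\) is the dot product of a row of \(B\) with \(\vec v\text{,}\) and the result agrees with `B @ v`.

```python
import numpy as np

B = np.array([[1, 2],
              [-2, 3]])
v = np.array([4, 3])

# Row picture: take the dot product of each row of B with v.
row_picture = np.array([B[0] @ v, B[1] @ v])
print(row_picture)                          # [10  1]
print(np.array_equal(row_picture, B @ v))   # True
```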
Since the row picture of matrix-vector multiplication involves dot products, which in turn relate to angles and geometry, the row picture hints that matrix-vector multiplication can be used to encode sophisticated problems involving the angles between multiple vectors (see Module 7 for more).
Subsection C.5.3 Compatibility
Matrix-vector multiplication is only possible when the shape of the matrix is compatible with the size of the vector. That is, the number of columns of the matrix must match the number of coordinates in the vector. (Try some examples using the row and column pictures to make sure you agree.)
The result of a matrix-vector product is always a vector, but the number of coordinates in the output vector can change. For example, if \(M\) is a \(2\times 3\) matrix, the product \(M\vec v\) is only defined if \(\vec v\in \R^{3}\text{.}\) However, the resulting vector \(\vec w=M\vec v\) is in \(\R^{2}\) (try an example and verify for yourself). This means matrix-vector multiplication can be used to move vectors between different spaces!
Takeaway C.5.4.
Let \(A\) be an \(m\times n\) matrix and \(\vec x\) be a vector. The matrix-vector product \(A\vec x\) is only defined if \(\vec x\) has \(n\) coordinates. In that case, the result is a vector with \(m\) coordinates.
Section C.6 Matrix-Matrix Multiplication
In many circumstances, we can also multiply two matrices with each other. To do so, we repeatedly apply matrix-vector multiplication. Let \(C\) and \(A\) be matrices and let \(\vec a_{1}\text{,}\)\(\vec a_{2}\text{,}\)\(\ldots\text{,}\)\(\vec a_{k}\) be the columns of \(A\text{.}\) Then,
Here, we “distributed” \(C\) into the matrix \(A\text{,}\) creating a new matrix whose columns are \(C\vec a_{1}\text{,}\)\(C\vec a_{2}\text{,}\) …. Using the row picture to expand each \(C\vec a_{i}\text{,}\) we arrive at an explicit formula. Let \(\vec r_{1}\text{,}\)\(\vec r_{2}\text{,}\)\(\ldots\text{,}\)\(\vec r_{m}\) be the rows of \(C\text{.}\) Then,
Let \(X=\mat{1&2&3\\0&-1&0}\) and \(Y=\mat{2&3\\1&1\\1&0}\text{.}\) Compute \(XY\) and \(YX\text{.}\)
Solution.
Computing \(XY\) entry by entry, we get the \((1,1)\) entry is \(\mat{1\\2\\3}\cdot\mat{2\\1\\1}=7\text{,}\) the \((2,1)\) entry is \(\mat{0\\-1\\0}\cdot\mat{2\\1\\1}=-1\text{,}\) and so on. Computing all the entries we get
Computing \(YX\) entry by entry, we get the \((1,1)\) entry is \(\mat{2\\3}\cdot\mat{1\\0}=2\text{,}\) the \((2,1)\) entry is \(\mat{1\\1}\cdot\mat{1\\0}=1\text{,}\) and so on. Computing all the entries we get
From the previous example, we see that multiplying matrices in different orders can produce different results. Formally, we say that matrix multiplication is not commutative (in contrast, scalars can be multiplied in any order). This non-commutativity holds even for square matrices 1
Of course, it is possible that \(AB=BA\) for matrices \(A\) and \(B\text{.}\) It just doesn’t happen very often.
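The example's matrices make the failure of commutativity vivid when checked with NumPy (an illustration, not from the text): \(XY\) and \(YX\) do not even have the same shape.

```python
import numpy as np

X = np.array([[1, 2, 3],
              [0, -1, 0]])   # 2x3
Y = np.array([[2, 3],
              [1, 1],
              [1, 0]])       # 3x2

print((X @ Y).shape)  # (2, 2)
print((Y @ X).shape)  # (3, 3)
```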
Further, for a matrix-matrix multiplication to be possible, the shapes of the two matrices must be compatible. Using our knowledge of matrix-vector multiplication, we can deduce that if the matrix-matrix product \(CA\) makes sense, then the number of columns of \(C\) must match the number of rows of \(A\text{.}\)
Writing the shape of two matrices side-by-side allows for a quick compatibility check.
\begin{equation*}
\text{rows of $C$}\times\underbrace{\text{columns of $C$}\qquad \text{rows of $A$}}_{\text{must be equal for $CA$ to exist}}\times \text{columns of $A$}
\end{equation*}
A successful matrix-matrix multiplication always results in a matrix with the same number of rows as the first matrix and the same number of columns as the second.
\begin{equation*}
\underbrace{ {\text{rows of $C$}}\times\text{columns of $C$}\qquad \text{rows of $A$}\times {\text{columns of $A$}}}_{ \text{successful product will be a ${\text{rows of $C$}}\!\times{\text{columns of $A$}}$ matrix} }
\end{equation*}
Example C.6.2.
Let \(A\) be a \(2\times 3\) matrix, let \(B\) be a \(3\times 4\) matrix, and let \(C\) be a \(1\times 3\) matrix. Determine the shape of the matrices resulting from all possible products of \(A\text{,}\)\(B\text{,}\) and \(C\text{.}\)
Solution.
For the product of two matrices to exist, the number of columns of the first matrix must equal the number of rows in the second. Therefore, the only matrix products that are possible are \(AB\) and \(CB\text{.}\)
\(AB\) is the product of a \(2\times 3\) matrix with a \(3\times 4\) matrix, and so will be a \(2\times 4\) matrix.
\(CB\) is the product of a \(1\times 3\) matrix with a \(3\times 4\) matrix, and so will be a \(1\times 4\) matrix.
Section C.7 Matrix Algebra
Let \(A\text{,}\)\(B\text{,}\) and \(C\) be \(n\times n\) matrices and let \(\alpha\) be a scalar. We can now write algebraic expressions like
\begin{equation*}
A(B+\alpha C).
\end{equation*}
Since the matrices are all \(n\times n\text{,}\) such expressions are always defined and the results are again \(n\times n\) matrices. We can almost treat arithmetic with \(n\times n\) matrices like arithmetic with numbers, save the fact that changing the order of multiplication might change the result. Many familiar properties of arithmetic carry over to matrices. For example, matrix multiplication is both associative and distributive. That is,
We’re already familiar with the special matrices \(I\text{,}\) the identity matrix, and \(\mathbf{0}\text{,}\) the zero matrix. In terms of matrix algebra, these behave like the numbers \(1\) and \(0\text{.}\) That is,
\begin{equation*}
IA=AI=A\qquad \text{and}\qquad \mathbf{0} A = A\mathbf{0}=\mathbf{0}
\end{equation*}
for any compatible square matrix \(A\text{.}\)
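These algebra rules can be spot-checked numerically (a sketch, not from the text); with integer matrices the arithmetic is exact, so equality checks are safe.

```python
import numpy as np

rng = np.random.default_rng(1)
A, B, C = (rng.integers(-5, 6, size=(3, 3)) for _ in range(3))
I = np.eye(3, dtype=int)
Z = np.zeros((3, 3), dtype=int)

assert np.array_equal((A @ B) @ C, A @ (B @ C))               # associativity
assert np.array_equal(A @ (B + 2 * C), A @ B + 2 * (A @ C))   # distributivity
assert np.array_equal(I @ A, A) and np.array_equal(A @ I, A)  # I acts like 1
assert np.array_equal(Z @ A, Z) and np.array_equal(A @ Z, Z)  # 0 acts like 0
print("all identities hold")
```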
To kick it up a level, when working with square matrices, we can define polynomials of matrices. Using familiar exponent notation, \(A^{2}=AA\text{,}\) we can formulate questions like
Does the equation \(A^{2}=-I\) have a \(2\times 2\) matrix solution?
Famously, the equation \(x^{2}=-1\) has no real solutions, but \(A^{2}=-I\) actually does have real \(2\times 2\) matrix solutions (see if you can find one)! In this text we will only scratch the surface of what can be done with matrix algebra, but it’s powerful stuff 1
Galois theory and representation theory both heavily rely on matrix algebra.
.
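One real \(2\times 2\) solution of \(A^{2}=-I\) is the 90-degree rotation matrix, verified below (an illustration; the text leaves finding a solution as a challenge).

```python
import numpy as np

# The rotation-by-90-degrees matrix squares to -I.
A = np.array([[0, -1],
              [1, 0]])

print(A @ A)
# [[-1  0]
#  [ 0 -1]]
```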
Insight C.7.1.
Matrix algebra behaves a lot like regular algebra except that the order of multiplication matters and matrices must always have compatible sizes.
Section C.8 More Notation
Linear algebra has many different products: scalar multiplication, dot products, matrix-vector products, and matrix-matrix products, to name a few. To distinguish between these different products, we use different notations.
For matrix-vector and matrix-matrix products, we use adjacency to represent multiplication. That is, we write
\begin{equation*}
A\vec v\qquad\text{and}\qquad AB
\end{equation*}
to indicate a product. Specifically, we do not use the symbols “\(\,\cdot\,\)” or “\(\,\times\,\)” to represent matrix-vector or matrix-matrix products (these symbols are reserved for the dot product and cross product, respectively).
Exercises C.9 Exercises
1.
For each description below, if possible, create a matrix matching the description. Otherwise, explain why such a matrix doesn’t exist.
A \(2\times 2\) diagonal matrix whose entries sum to \(-1\text{.}\)
A \(2\times 2\) symmetric matrix whose entries sum to \(-1\text{.}\)
A \(4\times 2\) symmetric matrix whose entries sum to \(-1\text{.}\)
A \(3\times 3\) skew-symmetric matrix whose entries sum to \(-1\text{.}\)
A \(1\times 4\) matrix \(A=[a_{ij}]\) whose entries satisfy \(a_{ij}=\sqrt{i+j}\text{.}\)
Solution.
Note: there are many matrices possible.
\(\displaystyle \mat{1&0\\0&-2}\)
\(\mat{1&1\\1&-4}\) or \(\mat{1&0\\0&-2}\)
Impossible. A symmetric matrix must be square.
Impossible. Let \(A=[a_{ij}]\) be a skew-symmetric matrix. By definition, \(a_{ij}=-a_{ji}\text{.}\) In particular, the diagonal entries satisfy \(a_{kk}=-a_{kk}\) and so must be zero. Every off-diagonal entry \(a_{ij}\) with \(i\neq j\) is cancelled in the sum by the corresponding entry \(a_{ji}=-a_{ij}\text{.}\) Therefore the sum of all entries must be zero.
2.
Consider the following matrices \(A=\mat{1 & 0 & 0 \\ 1 & 0 & 2 \\ 1 & 6 & 5}\text{,}\)\(B=\mat{2 & 1 & 1 \\ 0 & 2 & 0 \\ 0 & 0 & 3}\text{,}\)\(C=\mat{1\\-1\\2}\text{,}\) and \(D=\mat{0&-2&1}\text{.}\) For each of the following, (i) determine if the operation is defined, and (ii) compute the result using both the column picture and row picture of multiplication (if applicable).
\(\displaystyle AC\)
\(\displaystyle 2A+B\)
\(\displaystyle A-B\)
\(\displaystyle CA\)
\(\displaystyle AB\)
\(\displaystyle BA\)
\(\displaystyle DC\)
\(\displaystyle CD\)
Solution.
\(\displaystyle \mat{1\\5\\5}\)
\(\displaystyle \mat{4&1&1\\2&2&4\\2&12&13}\)
\(\displaystyle \mat{-1&-1&-1\\1&-2&2\\1&6&2}\)
Not defined
\(\displaystyle \mat{2&1&1\\2&1&7\\2&13&16}\)
\(\displaystyle \mat{4&6&7\\2&0&4\\3&18&15}\)
\(\displaystyle \mat{4}\)
\(\displaystyle \mat{0&-2&1\\0&2&-1\\0&-4&2}\)
3.
In general, matrix multiplication is non-commutative. However, some types of matrices are special.
Let \(A\) and \(B\) be \(2\times 2\) diagonal matrices and let \(X\) and \(Y\) be \(n\times n\) diagonal matrices.
Show by direct computation that \(AB=BA\text{.}\)
Show that both \(XY\) and \(YX\) are also diagonal matrices.
Is it true that \(XY=YX\) for every \(n\text{?}\) Explain.
4.
Classify the following statements as true or false.
A matrix in reduced row echelon form is an upper triangular matrix.
A diagonal matrix is in reduced row echelon form.
Every zero matrix is also square.
A zero matrix is neither upper nor lower triangular.
A matrix that is both upper triangular and lower triangular must be diagonal.
Using row operations, every lower triangular matrix can be converted into an upper triangular matrix.
The product of two lower triangular matrices is a lower triangular matrix (provided the product is defined).
Solution.
True
False. Consider \(\mat{0&0\\0&1}\text{.}\)
False. Zero matrices can be of any size.
False. All zero matrices are both upper and lower triangular.
False. Diagonal matrices must be square; upper/lower triangular matrices can be of any size.
True
True
5.
Let \(R\) be a \(1\times n\) matrix and let \(C\) be an \(n\times 1\) matrix.
Is the product \(RC\) defined? If so, what is its shape?
Is the product \(CR\) defined? If so, what is its shape?
Let \(\vec r\) be the (only) row vector in \(R\) and let \(\vec c\) be the (only) column vector in \(C\text{.}\) Are \(\vec r\cdot \vec c\) and \(RC\) the same? Explain.
Let \(\vec x,\vec y\in \R^{n}\) and let \(R_{\vec x}\) be the \(1\times n\) matrix with \(\vec x\) as a row vector and let \(C_{\vec y}\) be the \(n\times 1\) matrix with \(\vec y\) as a column vector. The inner product of \(\vec x\) and \(\vec y\) is defined to be \(R_{\vec x}C_{\vec y}\text{.}\) The outer product of \(\vec x\) and \(\vec y\) is defined to be \(C_{\vec y}R_{\vec x}\text{.}\)
How does the inner product of \(\vec x\) and \(\vec y\) relate to the dot product of \(\vec x\) and \(\vec y\text{?}\)
Let \(Q\) be the outer product of \(\vec x\) and \(\vec y\text{.}\) What does the reduced row echelon form of \(Q\) look like?
Let \(Q\) be the outer product of \(\vec x\) and \(\vec y\text{.}\) Show that the columns of \(Q\) are always linearly dependent when \(n\geq 2\text{.}\)
Solution.
Yes. \(1\times 1\)
Yes. \(n\times n\)
They are related but not the same. \(\vec r\cdot \vec c\) is a scalar and \(RC\) is a \(1\times 1\) matrix.
The inner product of \(\vec x\) and \(\vec y\) is \([\vec x\cdot \vec y]\text{.}\) That is, it is the \(1\times 1\) matrix with entry \(\vec x\cdot \vec y\text{.}\)
The reduced row echelon form of \(Q\) looks like a (possibly) non-zero row followed by rows of zeros.
Let \(y_{1},\ldots, y_{n}\in \R\) be the entries in \(\vec y\text{.}\) Then the columns of \(Q\) are \(y_{1}\vec x,\ldots, y_{n}\vec x\text{.}\) These columns are all scalar multiples of the same vector, \(\vec x\text{,}\) and so are linearly dependent.
6.
A \(3\times 3\) matrix is called a Heisenberg matrix if it takes the form \(\mat{1&a&c\\0&1&b\\0&0&1}\) for some \(a,b,c\in\R\text{.}\)
Show that if \(A\) and \(B\) are Heisenberg matrices, then so are \(AB\) and \(BA\text{.}\)
If \(A\) and \(B\) are Heisenberg matrices, is it always the case that \(AB=BA\text{?}\) Give a proof or a counter example.
Let \(X=\mat{1&a&c\\0&1&b\\0&0&1}\) and let \(Y=\mat{1&-1&ab-c\\0&1&-b\\0&0&1}\text{.}\) Show that \(XY=I_{3\times 3}\text{.}\)
By multiplying out, we see \(XY=I\) (and \(YX=I\)).
7.
Let \(X\) be a matrix of the form \(\mat{a&-b\\b&a}\) for some \(a,b\in\R\text{.}\)
Show that \(X^{2}\) has the same form as \(X\text{.}\)
Is there a solution to the matrix equation \(X^{2}=I_{2\times 2}\text{?}\) If so, how many?
Is there a solution to the matrix equation \(X^{2}=-I_{2\times 2}\text{?}\) If so, how many?
Let \(Y\) be an arbitrary \(2\times 2\) matrix. How many solutions are there to the equation \(Y^{2}=I_{2\times 2}\text{?}\)
Do you agree with the statement “every positive real number has exactly two square roots”? Do you agree with the statement “every diagonal matrix with positive entries on the diagonal has exactly two square roots”? Explain.
The equation \(2ab=0\) implies that \(a=0\) or \(b=0\text{.}\) If \(a=0\text{,}\) then \(-b^{2}=1\text{,}\) which is impossible. Therefore \(b=0\text{.}\) This means \(a^{2}=1\text{,}\) which has solutions \(a=\pm 1\text{.}\) Therefore there are exactly two solutions to \(X^{2}=I\text{.}\)
The equation \(2ab=0\) implies that \(a=0\) or \(b=0\text{.}\) If \(b=0\text{,}\) then \(a^{2}=-1\text{,}\) which is impossible. Therefore \(a=0\text{.}\) This means \(-b^{2}=-1\text{,}\) which has solutions \(b=\pm 1\text{.}\) Therefore there are exactly two solutions to \(X^{2}=-I\text{.}\)
There are infinitely many solutions to \(Y^{2}=I\text{.}\) For example \(\mat{0&t\\1/t&0}^{2}=I\) for any non-zero \(t\text{.}\)
Yes to the first, no to the second. Matrices are more general than numbers!
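The family of solutions claimed in the answer to the previous part can be spot-checked numerically (an illustration, not from the text): \(\mat{0&t\\1/t&0}\) squares to the identity for any non-zero \(t\text{.}\)

```python
import numpy as np

# Check that [[0, t], [1/t, 0]] squares to I for several nonzero t.
for t in (1.0, 2.0, -0.5):
    Y = np.array([[0.0, t],
                  [1.0 / t, 0.0]])
    assert np.allclose(Y @ Y, np.eye(2))
print("verified for several t")
```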