How to write a linear transformation in multiple bases.
Given a basis \(\mathcal{A}\) for \(\R^{n}\text{,}\) every vector \(\vec x\in\R^{n}\) uniquely corresponds to the list of numbers \([\vec x]_{\mathcal{A}}\) (its coordinates with respect to \(\mathcal{A}\)), and the operation of writing a vector in a basis is invertible.
If we have two bases, \(\mathcal{A}\) and \(\mathcal{B}\text{,}\) for \(\R^{n}\text{,}\) we have two equally valid ways of representing a vector in coordinates.
Figure 13.0.1.
Not only that, but there must be a function that converts between \([\vec x]_{\mathcal{A}}\) and \([\vec x]_{\mathcal{B}}\text{.}\) The function works as follows: input the list of numbers \([\vec x]_{\mathcal{A}}\text{,}\) use those numbers as coefficients of the \(\mathcal{A}\) basis vectors to get the true vector \(\vec x\text{,}\) and then find the coordinates of that vector with respect to the \(\mathcal{B}\) basis.
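The convert-between-coordinates procedure described above can be sketched in Python with NumPy (code is not part of this text; the bases below are hypothetical, chosen only for illustration):

```python
import numpy as np

# Columns of each matrix are the basis vectors written in standard coordinates.
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])        # hypothetical basis A = {(1,0), (1,1)}
B = np.array([[1.0, 0.0],
              [1.0, 1.0]])        # hypothetical basis B = {(1,1), (0,1)}

coords_A = np.array([3.0, 2.0])   # [x]_A

x = A @ coords_A                  # use [x]_A as coefficients of A's vectors
coords_B = np.linalg.solve(B, x)  # find [x]_B by solving B [x]_B = x
print(x, coords_B)
```

The two steps mirror the prose exactly: multiplying by `A` undoes the coordinate-taking operation, and `np.linalg.solve` performs the (invertible) coordinate-taking with respect to the other basis.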
Example 13.0.2.
Let \(\mathcal{A}=\Set{\vec a_1,\vec a_2}\) where \(\vec a_{1}=\mat{1\\1}_{\mathcal{E}}\) and \(\vec a_{2}=\mat{1\\-1}_{\mathcal{E}}\) and let \(\mathcal{B}=\Set{\vec b_1,\vec b_2}\) where \(\vec b_{1}=\mat{2\\1}_{\mathcal{E}}\) and \(\vec b_{2}=\mat{5\\3}_{\mathcal{E}}\) be bases for \(\R^{2}\text{.}\) Given that \([\vec x]_{\mathcal{A}}=\mat{2\\-3}\text{,}\) find \([\vec x]_{\mathcal{B}}\text{.}\)
We need to rewrite \(\vec x\) as a linear combination of \(\vec b_{1}=2\xhat+\yhat\) and \(\vec b_{2}=5\xhat+3\yhat\text{.}\) That is, we need to solve the equation
\begin{equation*}
c_{1}\vec b_{1}+c_{2}\vec b_{2}=\vec x=2\vec a_{1}-3\vec a_{2}=\mat{-1\\5}_{\mathcal{E}}.
\end{equation*}
Solving, we find \(c_{1}=-28\) and \(c_{2}=11\text{,}\) and so \([\vec x]_{\mathcal{B}}=\mat{-28\\11}\text{.}\)
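The linear system in this example can be checked numerically; a sketch with NumPy (not part of the text):

```python
import numpy as np

# Example 13.0.2: find [x]_B given [x]_A = (2, -3).
a1, a2 = np.array([1.0, 1.0]), np.array([1.0, -1.0])
b1, b2 = np.array([2.0, 1.0]), np.array([5.0, 3.0])

x = 2 * a1 - 3 * a2                    # the true vector, x = (-1, 5)
Bmat = np.column_stack([b1, b2])       # columns are b1, b2
coords_B = np.linalg.solve(Bmat, x)    # solve c1*b1 + c2*b2 = x
print(coords_B)                        # -> [-28.  11.]
```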
For a basis \(\mathcal{A}\text{,}\) the invertible function that takes a vector \(\vec x\) and generates the coordinates \([\vec x]_{\mathcal{A}}\) is a linear function. Therefore, for bases \(\mathcal{A}\) and \(\mathcal{B}\text{,}\) the function that converts \([\vec x]_{\mathcal{A}}\) to \([\vec x]_{\mathcal{B}}\) must have a matrix. This matrix is called the change of basis matrix.
Definition 13.0.3. Change of Basis Matrix.
Let \(\mathcal{A}\) and \(\mathcal{B}\) be bases for \(\R^{n}\text{.}\) The matrix \(M\) is called a change of basis matrix (which converts from \(\mathcal{A}\) to \(\mathcal{B}\)) if for all \(\vec x\in \R^{n}\)
\begin{equation*}
M[\vec x]_{\mathcal{A}}=[\vec x]_{\mathcal{B}}.
\end{equation*}
Notationally, \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) stands for the change of basis matrix converting from \(\mathcal{A}\) to \(\mathcal{B}\text{,}\) and we may write \(M=\BasisChange{\mathcal{A}}{\mathcal{B}}\text{.}\)
Example 13.0.4.
Let \(\mathcal{A}=\Set{\vec a_1,\vec a_2}\) where \(\vec a_{1}=\mat{1\\1}_{\mathcal{E}}\) and \(\vec a_{2}=\mat{1\\-1}_{\mathcal{E}}\) and let \(\mathcal{B}=\Set{\vec b_1,\vec b_2}\) where \(\vec b_{1}=\mat{2\\1}_{\mathcal{E}}\) and \(\vec b_{2}=\mat{5\\3}_{\mathcal{E}}\) be bases for \(\R^{2}\text{.}\) Find the change of basis matrix \(\BasisChange{\mathcal{A}}{\mathcal{B}}\text{.}\)
Solution.
We know \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) will be a \(2\times 2\) matrix and that
\begin{equation*}
\BasisChange{\mathcal{A}}{\mathcal{B}}[\vec a_{1}]_{\mathcal{A}}=[\vec a_{1}]_{\mathcal{B}}\qquad\text{and}\qquad \BasisChange{\mathcal{A}}{\mathcal{B}}[\vec a_{2}]_{\mathcal{A}}=[\vec a_{2}]_{\mathcal{B}}.
\end{equation*}
Since \([\vec a_{1}]_{\mathcal{A}}=\mat{1\\0}\) and \([\vec a_{2}]_{\mathcal{A}}=\mat{0\\1}\text{,}\) the columns of \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) must be \([\vec a_{1}]_{\mathcal{B}}\) and \([\vec a_{2}]_{\mathcal{B}}\text{.}\)
Therefore, we need to compute \([\vec a_{1}]_{\mathcal{B}}\) and \([\vec a_{2}]_{\mathcal{B}}\text{.}\) Repeating the procedure from the previous example, we find
\begin{equation*}
[\vec a_{1}]_{\mathcal{B}}=\mat{-2\\1}\qquad\text{and}\qquad [\vec a_{2}]_{\mathcal{B}}=\mat{8\\-3},
\end{equation*}
and so
\begin{equation*}
\BasisChange{\mathcal{A}}{\mathcal{B}}=\mat{-2&8\\1&-3}.
\end{equation*}
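These coordinates can be verified numerically. In standard coordinates the columns of \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) solve \(B\,[\vec a_{i}]_{\mathcal{B}}=\vec a_{i}\text{,}\) so the whole matrix is computable in one call (a NumPy sketch, not part of the text):

```python
import numpy as np

# Example 13.0.4: columns of the A->B change of basis matrix are
# [a1]_B and [a2]_B, i.e. the solution of B M = A in standard coords.
Amat = np.array([[1.0, 1.0],
                 [1.0, -1.0]])    # columns a1, a2
Bmat = np.array([[2.0, 5.0],
                 [1.0, 3.0]])     # columns b1, b2

M = np.linalg.solve(Bmat, Amat)   # the A->B change of basis matrix
print(M)                          # -> [[-2.  8.]
                                  #     [ 1. -3.]]

# Sanity check with [x]_A = (2, -3) from Example 13.0.2:
print(M @ np.array([2.0, -3.0]))  # -> [-28.  11.]
```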
The notation \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) for the matrix that changes from the \(\mathcal{A}\) basis to the \(\mathcal{B}\) basis is suggestive. Suppose we have another basis \(\mathcal{C}\text{.}\) We can obtain \(\BasisChange{\mathcal{A}}{\mathcal{C}}\) by multiplying \(\BasisChange{\mathcal{A}}{\mathcal{B}}\) on the left by \(\BasisChange{\mathcal{B}}{\mathcal{C}}\text{.}\) That is,
\begin{equation*}
\BasisChange{\mathcal{A}}{\mathcal{C}}=\BasisChange{\mathcal{B}}{\mathcal{C}}\,\BasisChange{\mathcal{A}}{\mathcal{B}}.
\end{equation*}
The backwards arrow “\(\leftarrow\)” in the change-of-basis matrix notation comes because when we multiply a vector and a matrix, the matrix is always to the left of the vector. So,
\begin{equation*}
[\vec x]_{\mathcal{B}}=\BasisChange{\mathcal{A}}{\mathcal{B}}[\vec x]_{\mathcal{A}}.
\end{equation*}
As such, the notation for the change of basis matrix chains, allowing you to figure out what’s going on without too much trouble.
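The chaining rule can be sketched numerically (NumPy, not part of the text), taking the standard basis \(\mathcal{E}\) as the hypothetical third basis \(\mathcal{C}\):

```python
import numpy as np

# Chaining [E <- B][B <- A] should equal [E <- A].
Amat = np.array([[1.0, 1.0], [1.0, -1.0]])   # basis A (columns a1, a2)
Bmat = np.array([[2.0, 5.0], [1.0, 3.0]])    # basis B (columns b1, b2)

A_to_B = np.linalg.solve(Bmat, Amat)   # [B <- A]
B_to_E = Bmat                          # [E <- B]: B-coordinates to standard
A_to_E = B_to_E @ A_to_B               # the chained matrix [E <- A]

# Directly, [E <- A] is the matrix whose columns are a1 and a2.
assert np.allclose(A_to_E, Amat)
```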
Section 13.1 Change of Basis Matrix in Detail
Let \(\mathcal{A}\) and \(\mathcal{B}\) be bases for \(\R^{n}\) and \(M=\BasisChange{\mathcal{A}}{\mathcal{B}}\) be the matrix that changes from the \(\mathcal{A}\) to the \(\mathcal{B}\) basis. Since we can change vectors back from \(\mathcal{B}\) to \(\mathcal{A}\text{,}\) we know \(M\) is invertible and
\begin{equation*}
M^{-1}M=\BasisChange{\mathcal{B}}{\mathcal{A}}\BasisChange{\mathcal{A}}{\mathcal{B}}=\BasisChange{\mathcal{A}}{\mathcal{A}}= I\qquad MM^{-1}=\BasisChange{\mathcal{A}}{\mathcal{B}}\BasisChange{\mathcal{B}}{\mathcal{A}}=\BasisChange{\mathcal{B}}{\mathcal{B}}= I,
\end{equation*}
which makes sense. The matrices \(\BasisChange{\mathcal{A}}{\mathcal{A}}\) and \(\BasisChange{\mathcal{B}}{\mathcal{B}}\) take vectors and rewrite them in the same basis, which is to say, they do nothing to the vectors.
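The inverse relationship can be checked numerically with the two bases from the earlier examples (a NumPy sketch, not part of the text):

```python
import numpy as np

Amat = np.array([[1.0, 1.0], [1.0, -1.0]])   # basis A (columns a1, a2)
Bmat = np.array([[2.0, 5.0], [1.0, 3.0]])    # basis B (columns b1, b2)

A_to_B = np.linalg.solve(Bmat, Amat)   # [B <- A]
B_to_A = np.linalg.solve(Amat, Bmat)   # [A <- B]

# Composing in either order does nothing: both products are the identity.
assert np.allclose(A_to_B @ B_to_A, np.eye(2))
assert np.allclose(B_to_A @ A_to_B, np.eye(2))
```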
The argument above shows that every change of basis matrix is invertible. The converse is also true.
Theorem 13.1.1.
An \(n\times n\) matrix is invertible if and only if it is a change of basis matrix.
Proof.
Suppose \(M=\BasisChange{\mathcal{A}}{\mathcal{B}}\) is a change-of-basis matrix. Then, as argued above, \(\BasisChange{\mathcal{B}}{\mathcal{A}}M=I\) and \(M\BasisChange{\mathcal{B}}{\mathcal{A}}=I\text{,}\) so \(M\) is invertible with \(M^{-1}=\BasisChange{\mathcal{B}}{\mathcal{A}}\text{.}\)
Alternatively, suppose \(M=[C_{1}|C_{2}|\cdots|C_{n}]\) is an invertible \(n\times n\) matrix with columns \(C_{1}\text{,}\) …, \(C_{n}\text{.}\) Let \(\vec c_{i}=[C_{i}]_{\mathcal{E}}\text{.}\) That is, \(\vec c_{i}\) is the vector which comes from interpreting \(C_{i}\) as coordinates with respect to the standard basis.
Since \(M\) is invertible, \(\Rref(M)=I\text{,}\) and so \(\Set{\vec c_1,\ldots,\vec c_n}\) is a linearly independent set of \(n\) vectors. Therefore \(\mathcal{C}=\Set{\vec c_1,\ldots,\vec c_n}\) is a basis for \(\R^{n}\text{.}\) Now, observe
\begin{equation*}
M[\vec c_{i}]_{\mathcal{C}}=C_{i}=[\vec c_{i}]_{\mathcal{E}}
\end{equation*}
for \(i=1,\ldots, n\text{,}\) and so \(M=\BasisChange{\mathcal{C}}{\mathcal{E}}\) is a change-of-basis matrix.
The proof of the above theorem highlights something interesting. Let \(\mathcal{A}=\Set{\vec a_1,\ldots, \vec a_n}\) be a basis for \(\R^{n}\text{.}\) It is always the case that the coordinate vector
\begin{equation*}
[\vec a_{i}]_{\mathcal{A}}
\end{equation*}
has a \(1\) in the \(i\)th position and zeros elsewhere. Now, let \(\mathcal{B}=\Set{\vec b_1,\ldots,\vec b_n}\) be another basis for \(\R^{n}\) and define the matrix \(M=\mat{[\vec a_1]_{\mathcal{B}}&[\vec a_2]_{\mathcal{B}}&\cdots&[\vec a_n]_{\mathcal{B}}}\) to be the matrix with columns \([\vec a_{1}]_{\mathcal{B}},\ldots, [\vec a_{n}]_{\mathcal{B}}\text{.}\) Since multiplying a matrix by \([\vec a_{i}]_{\mathcal{A}}\) will pick out the \(i\)th column, we have that
\begin{equation*}
M[\vec a_{i}]_{\mathcal{A}}=[\vec a_{i}]_{\mathcal{B}}
\end{equation*}
for each \(i\text{,}\) and so \(M=\BasisChange{\mathcal{A}}{\mathcal{B}}\text{.}\)
Up to this point, the matrix for a linear transformation has always been written with respect to the standard basis \(\mathcal{E}\text{.}\) But, what if we swapped out \(\mathcal{E}\) for a different basis?
Definition 13.2.1. Linear Transformation in a Basis.
Let \(\mathcal{T}:\R^{n}\to\R^{n}\) be a linear transformation and let \(\mathcal{B}\) be a basis for \(\R^{n}\text{.}\) The matrix for \(\mathcal{T}\) with respect to \(\mathcal{B}\), notated \([\mathcal{T}]_{\mathcal{B}}\text{,}\) is the \(n\times n\) matrix satisfying
\begin{equation*}
[\mathcal{T}]_{\mathcal{B}}[\vec x]_{\mathcal{B}}=[\mathcal{T}\vec x]_{\mathcal{B}}
\end{equation*}
for all \(\vec x\in\R^{n}\text{.}\)
In this case, we say the matrix \([\mathcal{T}]_{\mathcal{B}}\) is the representation of \(\mathcal{T}\) in the \(\mathcal{B}\) basis.
Just like there are many ways to write down coordinates for a vector—one per choice of basis—there are many ways to write down a matrix for a linear transformation. Up to this point, when we’ve said “\(M\) is a matrix for \(\mathcal{T}\)”, what we meant is “\(M=[\mathcal{T}]_{\mathcal{E}}\)”. And, like with vectors, if we talk about a matrix for a linear transformation without specifying the basis, we mean the matrix for the transformation with respect to the standard basis.
Example 13.2.2.
Let \(\mathcal{B}=\Set{\vec b_1,\vec b_2}\) where \(\vec b_{1}=\mat{2\\-3}_{\mathcal{E}}\) and \(\vec b_{2}=\mat{5\\-7}_{\mathcal{E}}\) be a basis for \(\R^{2}\) and let \(\mathcal{T}:\R^{2}\to\R^{2}\) be the transformation that stretches in the \(\vec e_{1}\) direction by a factor of \(2\text{.}\) Find \([\mathcal{T}]_{\mathcal{E}}\) and \([\mathcal{T}]_{\mathcal{B}}\text{.}\)
Solution.
Since \(\mathcal{T}\xhat=2\xhat\) and \(\mathcal{T}\yhat=\yhat\text{,}\) we know
\begin{equation*}
[\mathcal{T}]_{\mathcal{E}}=\mat{2&0\\0&1}.
\end{equation*}
We can find \([\mathcal{T}]_{\mathcal{B}}\) in two ways: directly from the definition, or by using change of basis matrices. First, we will work directly from the definition.
To find \([\mathcal{T}]_{\mathcal{B}}\text{,}\) we need to figure out what \(\mathcal{T}\) does to \(\vec b_{1}\) and \(\vec b_{2}\text{.}\) However, since \(\mathcal{T}\) is described in terms of \(\xhat\) and \(\yhat\text{,}\) it might be easier to express \(\xhat\) and \(\yhat\) in the \(\mathcal{B}\) basis, and then analyze \(\mathcal{T}\text{.}\) Carrying out this computation (or, equivalently, computing \([\mathcal{T}]_{\mathcal{B}}=\BasisChange{\mathcal{E}}{\mathcal{B}}[\mathcal{T}]_{\mathcal{E}}\BasisChange{\mathcal{B}}{\mathcal{E}}\)) gives
\begin{equation*}
[\mathcal{T}]_{\mathcal{B}}=\mat{-13&-35\\6&16}.
\end{equation*}
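The change-of-basis route can be carried out numerically (a NumPy sketch, not part of the text), using \([\mathcal{T}]_{\mathcal{B}}=X^{-1}[\mathcal{T}]_{\mathcal{E}}X\) where \(X=\BasisChange{\mathcal{B}}{\mathcal{E}}\) has \(\vec b_1,\vec b_2\) as columns:

```python
import numpy as np

T_E = np.array([[2.0, 0.0],
                [0.0, 1.0]])      # stretch by 2 in the e1 direction
X = np.array([[2.0, 5.0],
              [-3.0, -7.0]])      # [E <- B]: columns b1 = (2,-3), b2 = (5,-7)

T_B = np.linalg.solve(X, T_E @ X) # inv(X) @ T_E @ X, i.e. [T]_B
print(T_B)                        # -> [[-13. -35.]
                                  #     [  6.  16.]]
```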
Just like some bases are better than others to represent particular vectors, some bases are better than others to represent a particular linear transformation.
Example 13.3.1.
Let \(\mathcal{B}=\Set{\vec b_1,\vec b_2}\) where \(\vec b_{1}=\mat{2\\-3}_{\mathcal{E}}\) and \(\vec b_{2}=\mat{5\\-7}_{\mathcal{E}}\) be a basis for \(\R^{2}\) and let \(\mathcal{S}:\R^{2}\to\R^{2}\) be the transformation that stretches in the \(\vec b_{1}=2\vec e_{1}-3\vec e_{2}\) direction by a factor of \(2\) and reflects vectors in the \(\vec b_{2}=5\xhat-7\yhat\) direction. Find \([\mathcal{S}]_{\mathcal{E}}\) and \([\mathcal{S}]_{\mathcal{B}}\text{.}\)
Solution.
In this example, \(\mathcal{S}\) is described in terms of the \(\mathcal{B}\) basis. We know \(\mathcal{S}\vec b_{1}=2\vec b_{1}\) and \(\mathcal{S}\vec b_{2}=-\vec b_{2}\text{,}\) and so
\begin{equation*}
[\mathcal{S}]_{\mathcal{B}}=\mat{2&0\\0&-1}.
\end{equation*}
In the example above, \([\mathcal{S}]_{\mathcal{B}}\) is a much nicer matrix than \([\mathcal{S}]_{\mathcal{E}}\text{.}\) However, the two matrices relate to each other. After all,
\begin{equation*}
[\mathcal{S}]_{\mathcal{E}}=\BasisChange{\mathcal{B}}{\mathcal{E}}[\mathcal{S}]_{\mathcal{B}}\BasisChange{\mathcal{E}}{\mathcal{B}}.
\end{equation*}
The matrices \(A\) and \(B\) are called similar matrices, denoted \(A\sim B\), if \(A\) and \(B\) represent the same linear transformation but in possibly different bases. Equivalently, \(A\sim B\) if there is an invertible matrix \(X\) so that
\begin{equation*}
A=XBX^{-1}.
\end{equation*}
The \(X\) in the definition of similar matrices is always a change-of-basis matrix.
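Similarity for Example 13.3.1 can be checked numerically (a NumPy sketch, not part of the text), with \(X=\BasisChange{\mathcal{B}}{\mathcal{E}}\):

```python
import numpy as np

S_B = np.array([[2.0, 0.0],
                [0.0, -1.0]])     # stretch b1 by 2, flip b2
X = np.array([[2.0, 5.0],
              [-3.0, -7.0]])      # [E <- B]: columns b1, b2

S_E = X @ S_B @ np.linalg.inv(X)  # the same transformation in the E basis
print(S_E)                        # -> [[-43. -30.]
                                  #     [ 63.  44.]]

# Whichever basis it is written in, S doubles b1 and negates b2.
b1, b2 = X[:, 0], X[:, 1]
assert np.allclose(S_E @ b1, 2 * b1)
assert np.allclose(S_E @ b2, -b2)
```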
When studying a linear transformation, you can pick any basis to represent it in and study the resulting matrix. Different choices of basis will give you different perspectives on the linear transformation. In what’s to follow, we will work to find the “best” basis in which to study a given linear transformation. (If you cannot wait: the “best” basis will turn out to be the eigenbasis, provided it exists.)
be bases for \(\R^{2}\text{.}\) Define \(\vec x\in \R^{2}\) by \([\vec x]_{\mathcal{A}}=\mat{1\\-1}\text{.}\)
Find \([\vec x]_{\mathcal{E}}\) and \([\vec x]_{\mathcal{B}}\text{.}\)
Find the change of basis matrices \(\BasisChange{\mathcal{A}}{\mathcal{E}}\text{,}\) \(\BasisChange{\mathcal{E}}{\mathcal{A}}\text{,}\) \(\BasisChange{\mathcal{A}}{\mathcal{B}}\text{,}\) and \(\BasisChange{\mathcal{B}}{\mathcal{A}}\text{.}\)
It thus follows that \([\vec x]_{\mathcal{E}}=\mat{1\\3}\text{.}\)
To find \([\vec x]_{\mathcal{B}}\text{,}\) we must first express the two elements of \(\mathcal{A}\) as linear combinations of the two elements of \(\mathcal{B}\text{.}\) This involves solving two systems of linear equations:
To find \(\BasisChange{\mathcal{E}}{\mathcal{A}}\text{,}\) we simply need to compute \(M^{-1}\text{.}\) Following the explicit inverse formula for \(2\times2\) matrices, we see that
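The explicit \(2\times2\) inverse formula can be sketched in code (a NumPy illustration, not part of the text; the matrix used below is the \(\mathcal{A}\) basis from the earlier worked examples, since this exercise's own matrix is not restated here):

```python
import numpy as np

def inv2(m):
    """Explicit 2x2 inverse: inv([[a,b],[c,d]]) = [[d,-b],[-c,a]] / (ad - bc)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return np.array([[d, -b], [-c, a]]) / det

M = np.array([[1.0, 1.0],
              [1.0, -1.0]])   # columns of the A basis from earlier examples
assert np.allclose(inv2(M), np.linalg.inv(M))
assert np.allclose(inv2(M) @ M, np.eye(2))
```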
Find the representation of \(\vec b_{1}\text{,}\) \(\vec b_{2}\text{,}\) and \(\vec b_{3}\) in the standard basis.
Find the change of basis matrices \(\BasisChange{\mathcal{E}}{\mathcal{A}}\) and \(\BasisChange{\mathcal{B}}{\mathcal{E}}\text{.}\)
Use \(\BasisChange{\mathcal{E}}{\mathcal{A}}\) and \(\BasisChange{\mathcal{B}}{\mathcal{E}}\) to compute \(\BasisChange{\mathcal{B}}{\mathcal{A}}\text{.}\)
Let \(\mathcal{B}=\Set{\mat{1\\0}_{\mathcal{E}},\mat{1\\1}_{\mathcal{E}}}\text{.}\) For each linear transformation \(\mathcal{T}:\R^{2}\to\R^{2}\) defined below, compute \([\mathcal{T}]_{\mathcal{E}}\) and \([\mathcal{T}]_{\mathcal{B}}\text{.}\)
Let \(\mathcal{T}\) be the transformation that rotates every vector counterclockwise by \(90^{\circ}\text{.}\)
Let \(\mathcal{T}\) be the transformation that projects every vector onto the \(y\)-axis.
Let \(\mathcal{T}\) be the transformation that doubles every vector.
Let \(\mathcal{T}\) be the transformation that reflects every vector over the line \(y=x\text{.}\)
Solution.
\([\mathcal{T}]_{\mathcal{E}}=\mat{0&-1\\1&0}\) and \([\mathcal{T}]_{\mathcal{B}}=\mat{-1&-2\\1&1}\)
Note that \(\mathcal{T}\vec e_{1}=\vec e_{2}\) and \(\mathcal{T}\vec e_{2}=\vec e_{1}\text{,}\) so \([\mathcal{T}]_{\mathcal{E}}=\mat{0&1\\1&0}\text{.}\) To compute \([\mathcal{T}]_{\mathcal{B}}\text{,}\) we simply observe that \(\vec b_{1}=\vec e_{1}\) and \(\vec b_{2}=\vec e_{1}+\vec e_{2}\text{.}\) Thus, \(\mathcal{T}\vec b_{1}=\vec e_{2}=-\vec b_{1}+\vec b_{2}\) and \(\mathcal{T}\vec b_{2}=\vec e_{1}+\vec e_{2}=\vec b_{2}\text{.}\) Therefore, \([\mathcal{T}]_{\mathcal{B}}=\mat{-1&0\\1&1}\text{.}\)
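Both worked answers can be double-checked via \([\mathcal{T}]_{\mathcal{B}}=X^{-1}[\mathcal{T}]_{\mathcal{E}}X\) with \(X=\BasisChange{\mathcal{B}}{\mathcal{E}}\) (a NumPy sketch, not part of the text):

```python
import numpy as np

X = np.array([[1.0, 1.0],
              [0.0, 1.0]])               # [E <- B]: columns (1,0) and (1,1)

rot90 = np.array([[0.0, -1.0],
                  [1.0, 0.0]])           # rotation by 90 degrees
flip = np.array([[0.0, 1.0],
                 [1.0, 0.0]])            # reflection over y = x

assert np.allclose(np.linalg.solve(X, rot90 @ X),
                   np.array([[-1.0, -2.0], [1.0, 1.0]]))
assert np.allclose(np.linalg.solve(X, flip @ X),
                   np.array([[-1.0, 0.0], [1.0, 1.0]]))
```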
4.
For each statement below, determine whether it is true or false. Justify your answer.
Any invertible \(n \times n\) matrix can be viewed as a change of basis matrix.
Any \(n \times n\) matrix is similar to itself.
Let \(A\) be an \(m \times n\) matrix. If \(m \neq n\text{,}\) then there is no matrix that is similar to \(A\text{.}\)
Any invertible \(n \times n\) matrix \(A\) is similar to \(A^{-1}\) since \(AA^{-1}=I\text{.}\)
Solution.
True. A square matrix \(M\) is invertible if and only if it is a change of basis matrix.
True. Any square matrix \(M\) satisfies \(M=IMI^{-1}\text{.}\)
True. The definition of similarity requires the existence of an invertible matrix \(P\text{,}\) i.e. a square matrix, such that \(B=PAP^{-1}\text{.}\) For \(PA\) to be defined, \(P\) must be \(m\times m\text{;}\) for \((PA)P^{-1}\) to be defined, \(P^{-1}\) must be \(n\times n\text{.}\) Since \(m\neq n\text{,}\) no such matrix \(P\) can exist.
False. For example, \(A=\mat{2&0\\0&-1}\) is not similar to its inverse \(A^{-1}=\mat{1/2&0\\0&-1}\text{:}\) similar matrices always have equal determinants, but \(\det A=-2\) while \(\det A^{-1}=-\tfrac{1}{2}\text{.}\)
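The determinant argument for this counterexample can be sketched numerically (NumPy, not part of the text); since \(\det(XBX^{-1})=\det B\text{,}\) differing determinants certify non-similarity:

```python
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, -1.0]])
A_inv = np.linalg.inv(A)

det_A = np.linalg.det(A)          # -2
det_A_inv = np.linalg.det(A_inv)  # -1/2
print(det_A, det_A_inv)           # determinants differ -> A is not similar to A_inv
assert not np.isclose(det_A, det_A_inv)
```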