but that second equation, \(0x+0y=0\text{,}\) is funny. It is always true, no matter the choice of \(x\) and \(y\text{.}\) It adds no new information! In retrospect, it might be obvious that both equations from System (B.0.7) contain the same information, making one equation redundant.
System (B.0.7) is an example of an underdetermined system of equations, meaning there is not enough information to uniquely determine the value of each variable. Its solution set is a line, which we can find by graphing.
Figure B.0.29.
From this picture, we could describe the complete solution to System (B.0.7) in vector form by
\begin{equation*}
\vec x = t\mat{2\\1}.
\end{equation*}
But, what about a more complicated system? The system
but this is much harder to find by graphing. Fortunately, we won’t have to graph. Row reduction, combined with the notion of free variables, will provide a solution.
Section B.1 Reduced Row Echelon Form
Before we tackle complete solutions for underdetermined systems, we need to talk about reduced row echelon form 1
Reduced row echelon form is alternatively called row reduced echelon form; whether you say “reduced row” or “row reduced” makes no difference to the math!
, which is abbreviated rref. The reduced row echelon form of a matrix is the simplest (in terms of reading off solutions) form a matrix can be turned into via elementary row operations.
Definition B.1.1. Reduced Row Echelon Form (RREF).
A matrix \(X\) is in reduced row echelon form if the following conditions hold:
The first non-zero entry in every row is a \(1\text{;}\) these entries are called pivots or leading ones.
Above and below each leading one are zeros.
The leading ones form an echelon (staircase) pattern. That is, if row \(i\) has a leading one, every leading one appearing in row \(j>i\) also appears to the right of the leading one in row \(i\text{.}\)
All rows of zeros occur at the bottom of \(X\text{.}\)
Columns of a reduced row echelon form matrix that contain pivots are called pivot columns 2
If a matrix is augmented, we usually do not refer to the augmented column as a pivot column, even if it contains a pivot.
.
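The four conditions in Definition B.1.1 can be checked mechanically. Here is a minimal sketch in Python; the helper `is_rref` is ours for illustration, not part of the text:

```python
def is_rref(M):
    """Check the four RREF conditions from Definition B.1.1
    on a matrix given as a list of rows."""
    last_pivot = -1
    seen_zero_row = False
    for row in M:
        nz = [j for j, entry in enumerate(row) if entry != 0]
        if not nz:
            seen_zero_row = True          # condition 4: zero rows must sit at the bottom
            continue
        if seen_zero_row:
            return False                  # a non-zero row appeared below a zero row
        j = nz[0]
        if row[j] != 1:
            return False                  # condition 1: first non-zero entry must be 1
        if j <= last_pivot:
            return False                  # condition 3: pivots must step to the right
        if any(r[j] != 0 for r in M if r is not row):
            return False                  # condition 2: zeros above and below each pivot
        last_pivot = j
    return True

# The first matrix is in RREF; swapping its rows breaks the echelon pattern.
print(is_rref([[1, 0, 3], [0, 1, -2]]))   # True
print(is_rref([[0, 1, -2], [1, 0, 3]]))   # False
```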
Example B.1.2.
Which of the following matrices are in reduced row echelon form? For those that are, identify which columns are pivot columns. For those that are not, which condition(s) fail?
\(B\) is not in reduced row echelon form for two reasons: (i) the first non-zero entry in the third row is not a \(1\text{,}\) and (ii) the entry below the leading one in the second row is not zero.
\(D\) is not in reduced row echelon form for two reasons: (i) the entries above the leading one in the third row are not all zeros, and (ii) the leading one in the second row appears to the left of the leading one in the first row.
which was suitable for solving the system. However, \(X'\) is not in reduced row echelon form (the leading non-zero entries must all be ones). We can further row reduce \(X'\) to
\(X''\) is the reduced row echelon form of \(X\text{,}\) and reading off the solution to the original system from \(X''\) is as simple as can be!
Every matrix, \(M\text{,}\) has a unique reduced row echelon form, written \(\Rref(M)\text{,}\) which can be obtained from \(M\) by applying elementary row operations. There are many ways to compute the reduced row echelon form of a matrix, but the following algorithm always works.
Definition B.1.3. Row Reduction Algorithm.
Let \(M\) be a matrix.
If \(M\) takes the form \(M=[\vec 0|M']\) (that is, its first column is all zeros), apply the algorithm to \(M'\text{.}\)
If not, perform a row-swap (if needed) so the upper-left entry of \(M\) is non-zero.
Let \(\alpha\) be the upper-left entry of \(M\text{.}\) Perform the row operation \(\text{row}_{1}\mapsto \tfrac{1}{\alpha}\text{row}_{1}\text{.}\) The upper-left entry of \(M\) is now \(1\) and is called a pivot.
Use row operations of the form \(\text{row}_{i}\mapsto \text{row}_{i}+\beta\,\text{row}_{1}\) to zero every entry below the pivot.
where \(M'\) is a submatrix of \(M\text{.}\) Apply the algorithm to \(M'\text{.}\)
The resulting matrix is in pre-reduced row echelon form. To put the matrix in reduced row echelon form, additionally apply step 6.
6.
Use row operations of the form \(\text{row}_{i}\mapsto \text{row}_{i}+\beta\,\text{row}_{j}\) to zero the entries above each pivot.
Though there might be more efficient ways, and you might encounter ugly fractions, the row reduction algorithm will always convert a matrix to its reduced row echelon form.
All matrices, whether augmented or not, have a reduced row echelon form. Correctly applying the row reduction algorithm takes practice, but being able to row reduce a matrix is the analogue of “knowing your multiplication tables” for linear algebra.
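As a sketch of the algorithm, here is an exact-arithmetic implementation using Python’s `fractions` module; the code is ours, not the text’s, and the example matrix is a hypothetical redundant pair in the spirit of the opening discussion:

```python
from fractions import Fraction

def rref(M):
    """Row-reduce M following the algorithm in Definition B.1.3.
    Exact Fraction arithmetic keeps the 'ugly fractions' exact."""
    A = [[Fraction(x) for x in row] for row in M]
    rows, cols = len(A), len(A[0])
    pivot_row = 0
    for col in range(cols):
        # Steps 1-2: find a row at or below pivot_row with a non-zero
        # entry in this column; skip the column if there is none.
        swap = next((r for r in range(pivot_row, rows) if A[r][col] != 0), None)
        if swap is None:
            continue
        A[pivot_row], A[swap] = A[swap], A[pivot_row]      # row swap
        alpha = A[pivot_row][col]
        A[pivot_row] = [x / alpha for x in A[pivot_row]]   # step 3: scale pivot to 1
        for r in range(rows):                              # steps 4 and 6: zero the
            if r != pivot_row and A[r][col] != 0:          # entries below AND above
                beta = A[r][col]
                A[r] = [x - beta * p for x, p in zip(A[r], A[pivot_row])]
        pivot_row += 1
    return A

# Augmented matrix for  x + 3y = 2  and  2x + 6y = 4  (a redundant pair).
print(rref([[1, 3, 2], [2, 6, 4]]))   # [[1, 3, 2], [0, 0, 0]]
```

For clarity this version zeros entries above each pivot as it goes, so it produces reduced row echelon form directly rather than stopping at the pre-reduced form first.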
From the reduced row echelon form, we’re left with the equation \(x+3y=2\text{,}\) which isn’t exactly a solution. Effectively, the original system had only one equation’s worth of information, so we cannot solve for both \(x\) and \(y\) based on the original system. To get ourselves out of this pickle, we will use a notational trick: introduce the arbitrary equation \(y=t\) 1
This equation is called arbitrary because it introduces no new information about the original variables. The restrictions on \(x\) and \(y\) aren’t changed by introducing the fact \(y=t\text{.}\)
. Now, because we’ve already done row-reduction, we see
Notice that \(t\) here stands for an arbitrary real number. Any choice of \(t\) produces a valid solution to the original system (go ahead, pick some values for \(t\) and see what happens). We call \(t\) a parameter and \(y\) a free variable 2
We call \(y\) free because we may pick it to be anything we want and still produce a solution to the system.
Though you can usually make many choices about which variables are free variables, one choice always works: pick all variables corresponding to non-pivot columns to be free variables. For this reason, we refer to non-pivot non-augmented columns of a row-reduced matrix as free variable columns.
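This rule can be applied mechanically. A minimal sketch, assuming the row-reduced augmented matrix of \(x+3y=2\) from the discussion above:

```python
# Row-reduced augmented matrix [A | b] for  x + 3y = 2.
R = [
    [1, 3, 2],   # x + 3y = 2
    [0, 0, 0],   # 0 = 0
]

num_cols = len(R[0])
pivot_cols = set()
for row in R:
    for j, entry in enumerate(row):
        if entry != 0:
            pivot_cols.add(j)   # the first non-zero entry of a row is its pivot
            break

# The last column is the augmented column, so it is never a free variable column.
free_cols = [j for j in range(num_cols - 1) if j not in pivot_cols]
print(free_cols)   # [1], i.e. the column of the variable y
```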
ExampleB.2.1.
Use row reduction to find the complete solution to \(\left\{\begin{array}{rcrcrl}x&+&y&+&z&=1\\&&y&-&z&=2\end{array}\right.\)
Solution.
The corresponding augmented matrix for the system is
The third column of \(\Rref(A)\) is a free variable column, so we introduce the arbitrary equation \(z=t\) and solve the system in terms of \(t\text{:}\)
which is already in reduced row echelon form. Its third column is the only pivot column, making columns \(1\) and \(2\) free variable columns (remember, we don’t count augmented columns as free variable columns). Thus, we introduce two arbitrary equations, \(x=t\) and \(y=s\text{,}\) and solve the new system
Using row reduction and free variables, we can find complete solutions to very complicated systems. The process is straightforward enough that even a computer can do it! 3
Computers usually don’t follow the algorithm outlined here because they have to deal with rounding error. But, there is a modification of the row reduction algorithm called row reduction with partial pivoting which solves some issues with rounding error.
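A sketch of the pivot-selection rule the footnote describes; the helper name is ours, for illustration only:

```python
def pivot_choice_partial(col_entries):
    """Partial pivoting: among the candidate rows, pick the one whose entry
    in the pivot column has the largest absolute value, which limits the
    growth of rounding error in floating-point arithmetic."""
    return max(range(len(col_entries)), key=lambda r: abs(col_entries[r]))

# With entries [1e-10, 2.0, -5.0] in the pivot column, plain row reduction
# would divide by 1e-10; partial pivoting swaps row 2 (entry -5.0) up instead.
print(pivot_choice_partial([1e-10, 2.0, -5.0]))   # 2
```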
ExampleB.2.2.
Consider the system of equations in the variables \(x\text{,}\) \(y\text{,}\) \(z\text{,}\) and \(w\text{:}\)
\begin{equation*}
\Set{\vec x\in\R^4\given \vec x = t\mat{1\\0\\0\\0}+s\mat{0\\-2\\1\\0}+\mat{0\\-1\\0\\1} \text{ for some }t,s\in\R}.
\end{equation*}
Section B.3 Free Variables & Inconsistent Systems
If you need a free variable/parameter to describe the complete solution to a system of linear equations, the system necessarily has an infinite number of solutions—one coming from every choice of value for your free variable/parameter. However, one still needs to be careful when deciding from an augmented matrix whether a system of linear equations has an infinite number of solutions.
Consider the augmented matrices \(A\) and \(B\text{,}\) which are given in reduced row echelon form.
Both matrices lack a pivot in their second column. However, \(A\) corresponds to a system with an infinite solution set, while \(B\) corresponds to an inconsistent system with an empty solution set. We can debate whether it is appropriate to say that \(B\) has a free variable column 1
On the one hand, the second column fits the description. On the other hand, you cannot make any choices when picking a solution, since there are no solutions.
, but one thing is clear: when evaluating the number of solutions to a system, you must pay attention to whether or not the system is consistent.
Putting everything together, we can fully classify the number of solutions to a system of linear equations based on pivots/free variables.
Consistent   Pivots                                       Number of Solutions
False        At least one column doesn’t have a pivot     0
True         Every column has a pivot                     1
True         At least one column doesn’t have a pivot     Infinitely many
This information is so important, we will also codify it in a theorem.
Theorem B.3.1.
A system of linear equations has either \(0\) solutions, \(1\) solution, or infinitely many solutions.
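The table and theorem above can be sketched as a small classifier for a row-reduced augmented matrix. This is a hypothetical helper, not from the text; `float('inf')` stands in for “infinitely many”:

```python
def count_solutions(R):
    """Classify a row-reduced augmented matrix [A | b] as having
    0, 1, or infinitely many solutions (Theorem B.3.1)."""
    n_vars = len(R[0]) - 1           # the last column is the augmented column
    pivot_cols = set()
    for row in R:
        nz = [j for j, e in enumerate(row) if e != 0]
        if nz:
            pivot_cols.add(nz[0])    # first non-zero entry of the row is its pivot
    if n_vars in pivot_cols:         # pivot in the augmented column: a row reads 0 = 1
        return 0
    if len(pivot_cols) == n_vars:    # every variable column has a pivot
        return 1
    return float('inf')              # some variable column is free

print(count_solutions([[1, 0, 2], [0, 1, 1]]))   # 1
print(count_solutions([[1, 3, 2], [0, 0, 0]]))   # inf
print(count_solutions([[1, 3, 0], [0, 0, 1]]))   # 0
```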
The only values of \(x\) and \(y\) that satisfy both equations are given by \((x,y)=(2,1)\text{.}\) However, each row, viewed in isolation, specifies a line in \(\R^{2}\text{.}\) Call the line coming from the first row \(\ell_{1}\) and the line coming from the second row \(\ell_{2}\text{.}\)
Figure B.4.1.
These two lines intersect exactly at the point \(\mat{2\\1}\text{.}\) And of course they should: by definition, a solution to a system of equations satisfies all equations, so a solution to System (B.4.1) is a point that lies in both \(\ell_{1}\) and \(\ell_{2}\text{.}\) In other words, solutions lie in \(\ell_{1}\cap \ell_{2}\text{.}\)
Takeaway B.4.2.
Geometrically, a solution to a system of equations is the intersection of objects specified by the individual equations.
This perspective sheds some light on inconsistent systems. The system
is inconsistent. And, when we graph the lines specified by the rows, we see that they are parallel and never intersect. Thus, the solution set is empty.
Section B.5 Planes & Hyperplanes
Consider the solution set to a single linear equation viewed in isolation. For example, in the three-variable case, we might consider
\begin{equation*}
x+2y-z=3.
\end{equation*}
The solution set to this equation is a plane. Why? For starters, writing down the complete solution involves picking two free variables. Suppose we pick \(y=t\) and \(z=s\text{.}\) Then, before we even do a calculation, we know the complete solution will be described in vector form by
where \(\vec d_{1}\text{,}\) \(\vec d_{2}\text{,}\) and \(\vec p\) come from doing the usual computations. But that is the vector form of a plane!
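As a concrete check for the equation \(x+2y-z=3\) above: picking \(y=t\) and \(z=s\) forces \(x=3-2t+s\text{,}\) so
\begin{equation*}
\vec x=\mat{3-2t+s\\t\\s}=t\mat{-2\\1\\0}+s\mat{1\\0\\1}+\mat{3\\0\\0}\text{,}
\end{equation*}
which identifies \(\vec d_{1}=\mat{-2\\1\\0}\text{,}\) \(\vec d_{2}=\mat{1\\0\\1}\text{,}\) and \(\vec p=\mat{3\\0\\0}\text{.}\) Substituting back, \((3-2t+s)+2t-s=3\) for every choice of \(t\) and \(s\text{.}\)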
In general, a single equation in \(n\) variables requires \(n-1\) free variables to describe its complete solution. The only exception is the trivial equation, \(0x_{1}+\cdots +0x_{n}=0\text{,}\) which requires \(n\) free variables. For the sake of brevity, from now on we will assume that a linear equation in \(n\) variables means a non-trivial linear equation in \(n\) variables.
Applying this knowledge, we can construct a table for systems consisting of a single linear equation.
Number of Variables   Number of Free Variables   Complete Solution
2                     1                          Line in \(\R^{2}\)
3                     2                          Plane in \(\R^{3}\)
4                     3                          Volume in \(\R^{4}\)
Notice that the dimension of the solution set (a line being one dimensional, a plane being two dimensional, and a volume being three dimensional) is always one less than the dimension of the ambient space (\(\R^{2}\text{,}\)\(\R^{3}\text{,}\)\(\R^{4}\)) 1
Another way to describe these sets would be to say that they have co-dimension 1.
. Such sets are called hyperplanes because they are flat and plane-like. However, unlike a plane, the dimension of a hyperplane need not be two.
With our newfound geometric intuition, we can understand solutions to systems of linear equations in a different way. The solution set to a system of linear equations of two variables must be the result of intersecting lines. Therefore, the only options are: a point, a line, or the empty set. The solution to a system of linear equations of three variables is similarly restricted. It can only be: a point, a line, a plane, or the empty set.
Figure B.5.1.
Figure B.5.2.
In higher dimensions, the story is the same: solution sets are formed by intersecting hyperplanes and we can use algebra to precisely describe these sets of intersection.
Exercises B.6
1.
Find the complete solution to the following systems.
The third and fourth column of \(\Rref(X)\) are free variable columns, so we introduce the arbitrary equations \(z=t\) and \(w=s\) and solve the following system in terms of \(t\) and \(s\text{:}\)
The fourth column of \(\Rref(X)\) is a free variable column, so we introduce the arbitrary equation \(w=t\) and solve the following system in terms of \(t\text{:}\)
The third column of \(\Rref(X)\) is a free variable column, so we introduce the arbitrary equation \(z=t\) and solve the following system in terms of \(t\text{:}\)
The third column of \(\Rref(X)\) is a free variable column, so we introduce the arbitrary equation \(z=t\) and solve the following system in terms of \(t\text{:}\)
For each system of linear equations given below: (i) write down its augmented matrix, (ii) use the row reduction algorithm to determine whether it is consistent, (iii) for each consistent system, give the complete solution.
Let \(\vec v_{1}=\mat{1\\1\\-2\\4}\text{,}\)\(\vec v_{2}=\mat{1\\4\\0\\2}\) and \(\vec v_{3}=\mat{-2\\-2\\4\\-8}\text{.}\)
Set up and solve a system of linear equations whose solution will determine if the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\) and \(\vec v_{3}\) are linearly independent.
Let \(\vec v_{1}=\mat{1\\2\\3}\text{,}\)\(\vec v_{2}=\mat{-2\\1\\0}\) and \(\vec v_{3}=\mat{2\\7\\1}\text{.}\)
Set up and solve a system of linear equations whose solution will determine if the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\) and \(\vec v_{3}\) span \(\R^{3}\text{.}\)
Let \(\ell_{1}\) and \(\ell_{2}\) be described in vector form by
In particular, \((x, y, z)=(2, 0, 1)\) is a non-trivial solution to this system, so the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\) and \(\vec v_{3}\) are linearly dependent.
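As a quick numeric check of the claimed solution (our own verification, not part of the exercise):

```python
# Verify that (x, y, z) = (2, 0, 1) gives 2*v1 + 0*v2 + 1*v3 = 0
# for the vectors in the exercise.
v1 = [1, 1, -2, 4]
v2 = [1, 4, 0, 2]
v3 = [-2, -2, 4, -8]

combo = [2 * a + 0 * b + 1 * c for a, b, c in zip(v1, v2, v3)]
print(combo)   # [0, 0, 0, 0]
```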
By definition, the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\text{,}\) and \(\vec v_{3}\) span \(\R^{3}\) if every vector can be written as a linear combination of \(\vec v_{1}\text{,}\)\(\vec v_{2}\text{,}\) and \(\vec v_{3}\text{.}\) In other words, the equation
Row reducing, we notice that every column is a pivot column and so the system is always consistent. Therefore, \(\Span\Set{\vec v_1,\vec v_2,\vec v_3}=\R^{3}\text{.}\)
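That pivot count can be checked numerically. A sketch using forward elimination with exact fractions (our own check, not part of the solution):

```python
from fractions import Fraction

# Rows of the coefficient matrix whose columns are v1, v2, v3 from the exercise.
A = [[Fraction(x) for x in row] for row in
     [[1, -2, 2], [2, 1, 7], [3, 0, 1]]]

# Forward elimination, counting pivots: three pivots means every column is a
# pivot column, so the system [A | b] is consistent for every b in R^3.
pivots = 0
for col in range(3):
    swap = next((r for r in range(pivots, 3) if A[r][col] != 0), None)
    if swap is None:
        continue                     # no pivot in this column
    A[pivots], A[swap] = A[swap], A[pivots]
    for r in range(pivots + 1, 3):   # zero the entries below the pivot
        beta = A[r][col] / A[pivots][col]
        A[r] = [x - beta * p for x, p in zip(A[r], A[pivots])]
    pivots += 1

print(pivots)   # 3, so v1, v2, v3 span R^3
```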
The lines \(\ell_{1}\) and \(\ell_{2}\) intersect when their \(x\)- and \(y\)-coordinates are equal. We first set the parameter variable of \(\ell_{1}\) to \(t\) and the parameter variable of \(\ell_{2}\) to \(s\text{.}\) Then, equating the coordinates gives the system of linear equations
Since \(\vec x=\mat{9/5\\17/5}\) when \(t=4/5\) (or \(s=-3/5\)), the intersection of \(\ell_{1}\) and \(\ell_{2}\) is the point \(\mat{9/5\\17/5}\text{.}\)
The planes \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\) intersect when their coordinates are equal. Relabeling the parameter variables for \(\mathcal{P}_{2}\) as \(q\) and \(r\) and equating both vector forms, we get the following system of linear equations:
Thus there are an infinite number of points in \(\mathcal{P}_{1}\cap \mathcal{P}_{2}\text{.}\)
To find these points, we substitute \(q=0\) and \(r=u\) into the vector form of \(\mathcal{P}_{2}\text{.}\) This shows us that \(\mathcal{P}_{1}\cap \mathcal{P}_{2}\) can be expressed in vector form by
Presented below are some students’ arguments for question B.6.3. Evaluate whether their reasoning is totally correct, mostly correct, or incorrect. If their reasoning is not totally correct, point out what mistake(s) they made and how they might be fixed.
Since \((x, y, z)=(0, 0, 0)\) is a solution to the equation, the equation has the trivial solution. Therefore, the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\) and \(\vec v_{3}\) are linearly independent.
Notice that \((x, y, z)=(-2, 0, -1)\) is a solution to the equation. Since \(y=0\) in this solution, it is a trivial solution and therefore the vectors \(\vec v_{1}\text{,}\)\(\vec v_{2}\) and \(\vec v_{3}\) are linearly independent.
(c) i.
The lines \(\ell_{1}\) and \(\ell_{2}\) intersect when their \(x\) and \(y\)-coordinates are equal. Equating \(x\) and \(y\)-coordinates gives
So, \(\vec x=\mat{1/2\\-1/2\\0}\) is a point on \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\text{.}\) Therefore the planes \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\) intersect.
(d) ii.
Notice that \(\vec x=\mat{1\\0\\0}\) is a point in \(\mathcal{P}_{2}\text{,}\) but this point is not in \(\mathcal{P}_{1}\text{.}\) Therefore the planes do not intersect.
Solution.
The reasoning is incorrect. The solution \((x, y, z)=(0, 0, 0)\) is the trivial solution to the vector equation, and it is always a solution to the homogeneous equation
no matter what \(\vec v_{1}, \dots, \vec v_{k}\) are.
To determine if a set of vectors is linearly independent, we need to check that the trivial solution is the only solution to the vector equation; that is, that no non-trivial solution exists.
The reasoning is incorrect. A trivial solution is the solution where all the variables equal zero, so the solution \((x, y, z)=(-2, 0, -1)\) is not a trivial solution.
The reasoning is incorrect. When equating coordinates of two different vector forms, the parameter variables need to be given different letters.
Here we have renamed the parameter \(t\) in the vector form of \(\ell_{2}\) to \(s\text{.}\)
The reasoning is incorrect. The solution \(\mat{4/5\\-3/5}\) to the system of linear equations gives the value of \(t\) and \(s\) at the intersection. To find the intersection of \(\ell_{1}\) and \(\ell_{2}\text{,}\) the value \(t=4/5\) or \(s=-3/5\) needs to be plugged into the vector form of \(\ell_{1}\) or \(\ell_{2}\text{.}\)
The reasoning is correct. Since \(\vec x=\mat{1/2\\-1/2\\0}\) is a point on both \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\text{,}\) it is in the intersection of \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\text{,}\) so the planes \(\mathcal{P}_{1}\) and \(\mathcal{P}_{2}\) intersect. However, we cannot determine whether the intersection is a line or a plane based on only one point, so we need to set up and solve an appropriate system of linear equations.
The reasoning is incorrect. Finding one point that is in \(\mathcal{P}_{2}\) but not in \(\mathcal{P}_{1}\) shows that \(\mathcal{P}_{1}\) does not intersect \(\mathcal{P}_{2}\) at that point, but does not rule out the possibility that \(\mathcal{P}_{1}\) intersects \(\mathcal{P}_{2}\) at a different point.