What the solution set to a system of equations is, and what it means for a system of equations to be consistent or inconsistent.
How augmented matrices can be used to solve systems of linear equations.
How to apply row reduction to find a unique solution to a system of linear equations and to determine if a system of linear equations is consistent or inconsistent.
An equation encodes a relationship between quantities. For example, writing
\begin{equation*}
\underbrace{\text{Slices of cake}}_{C} = \underbrace{\text{Slices you ate}}_{M} + \underbrace{\text{Slices your brother ate}}_{B}
\end{equation*}
specifies a precise relationship between the quantities \(C\text{,}\)\(M\text{,}\) and \(B\text{.}\) Without more information, \(C\text{,}\)\(M\text{,}\) and \(B\) could be almost anything. As such, we call \(C\text{,}\)\(M\text{,}\) and \(B\)variables or unknowns. However, the relationship between them is precisely defined.
Additional relationships give rise to additional equations, which we express concisely as a system of equations, that is, a list of equations. For example, suppose you know the cake had six pieces and your brother ate twice as many pieces as you. We might now write the system
\begin{align*}
C&= M + B \\
B&= 2M \\
C&= 6
\end{align*}
which should be interpreted as: “the relationship \(C=M+B\) holds and the relationship \(B=2M\) holds and the relationship \(C=6\) holds.” All this information, taken together, is enough to deduce the unknowns: \(M=2\text{,}\)\(B=4\text{,}\) and \(C=6\text{.}\)
Systems of equations naturally appear in linear algebra through vector equations. Suppose \(\vec u=\mat{1\\2}\text{,}\)\(\vec v=\mat{2\\3}\text{,}\) and \(\vec w=\mat{1\\1}\text{.}\) You might wonder if \(\vec w\) was a linear combination of \(\vec u\) and \(\vec v\text{.}\) The answer is yes if and only if the vector equation
\begin{equation*}
\vec w=a\vec u+b\vec v
\end{equation*}
has a solution for some \(a\) and \(b\text{.}\) Written in coordinates, this equation is equivalent to
Every vector equation, by way of coordinates, corresponds to a system of equations. And, fortunately for us, there is an algorithm to find all solutions to these systems 5
Saying there is an algorithm for “\(X\)” means that there is a specific set of (non-random) rules that always accomplishes “\(X\)”. As a consequence, doing “\(X\)” never requires special insight. For example, there is an algorithm for multiplying numbers, but there is not an algorithm for factoring polynomials of degree 5 or greater.
.
SectionA.1Systems of Linear Equations
There’s no guarantee that a general equation, like \(x^{4}-e^{x}+7=0\text{,}\) has a solution, and it might be impossible to decide if an arbitrary equation has a solution, let alone what the solutions are! 1
Fermat’s Last Theorem famously claimed that \(a^{n}+b^{n}=c^{n}\) has no positive integer solutions for \(n\geq 3\text{.}\) However, it took 350 years of human ingenuity before anyone was able rigorously back up the claim.
However, for linear equations and systems of linear equations we can always tell whether there is a solution and what the solution(s) are.
DefinitionA.1.1.Linear Equation.
A linear equation in the variables \(x_{1},\ldots,x_{n}\) is one that can be expressed as
for constants \(a_{1},\ldots, a_{n}\) and \(c\text{.}\) A system of linear equations is a system of equations consisting of one or more linear equations.
Every vector equation corresponds to an equivalent system of linear equations and vice versa, where equivalent means “expresses the same relationships between variables”.
ExampleA.1.2.
Write down the vector equation corresponding to the system of linear equations \(\left\{\begin{array}{rcrcrl}2x&+&3y&+&z&=2\\&&y&-&z&=-1\end{array}\right.\) and the system of linear equations corresponding to the vector equation \(t\vec w+\vec u=r\vec v\) where \(\vec w=\mat{1\\-1}\text{,}\)\(\vec u=\mat{2\\3}\text{,}\) and \(\vec v=\mat{4\\4}\text{.}\)
Solution.
The system \(\left\{\begin{array}{rcrcrl}2x&+&3y&+&z&=2\\0x&+&y&-&z&=-1\end{array}\right.\) corresponds to the vector equation
As for the vector equation \(t\vec w+\vec u=r\vec v\text{,}\) rewriting each vector in coordinates gives us a corresponding system of linear equations:
\begin{align*}
t\vec w + \vec u = r\vec v \rightarrow t\mat{1\\-1}+ \mat{2\\3}&= r\mat{4\\4}\\
\mat{t+2\\-t+3}&= \mat{4r\\4r}\rightarrow \left\{
\begin{array}{crcrl}
-&4r&+&t&=-2\\-&4r&-&t&=-3
\end{array}\right..
\end{align*}
TakeawayA.1.3.
Every vector equation corresponds to a system of linear equations and every system of linear equations corresponds to a vector equation.
SectionA.2Solution Sets
Before looking at how to solve systems of linear equations, let’s agree on some terminology.
A solution to an equation is a particular choice of values for the variables that satisfy (i.e. make true) the equation. For example
\begin{equation}
x+y=4\tag{A.2.1}
\end{equation}
has a solution \(x=y=2\text{.}\) However, \(x=y=2\) is just one of many possible solutions; we also have \(x=4\) and \(y=0\) or \(x=-2\) and \(y=6\text{.}\) The solution set, also called the complete solution, to an equation (or system of equations) is the set of all possible solutions. For example, the solution set to Equation (A.2.1) is \(S=\Set{(x,y)\given y=4-x}\text{.}\) In this case, \(S\) contains infinitely many solutions, including \((x,y)=(2,2)\text{,}\) the first solution we found.
Solution sets look a lot like sets of vectors: the set \(S=\Set{(x,y)\given y=4-x}\) could be thought of as a subset of \(\R^{2}\) where we identify a solution \(x=a\) and \(y=b\) with the column vector \(\mat{a\\b}\text{.}\) Drawing \(S\) as a subset of \(\R^{2}\text{,}\) we see a familiar picture.
FigureA.2.1.
It’s the graph of the line given in \(y=mx+b\) form by \(y=-x+4\text{.}\) In other words, via solution sets, equations and systems of equations can represent geometric objects.
SectionA.3Consistent & Inconsistent Systems
Consider the following equations (as separate equations, not as a system):
\(S_{x}=\Set{0}\) consists of a single number. \(S_{y}=\Set{2,-2}\) consists of two numbers, and \(S_{z}=\Set{}\) consists of no numbers 1
We’re not allowing complex numbers at the moment.
. In this case, we would call the first two equations consistent and the third equation inconsistent.
DefinitionA.3.1.Consistent & Inconsistent.
An equation or system of equations is called consistent if it has at least one solution. That is, its solution set is non-empty. Otherwise, an equation or system of equations is called inconsistent.
Why the word consistent? This comes from the term logically consistent which means “able to be true”. An equation is an assertion that the left hand side equals the right hand side. If that can never happen, the assertion is not logically consistent.
This terminology becomes more clear with systems. Consider the system
From the first equation, we deduce \(y=x\text{.}\) From the second equation, we deduce \(x=1+y\text{.}\) Since \(x=x\text{,}\) we know that \(y=x=1+y\) and therefore \(y=1+y\text{.}\) However, this is never true! There is a logical inconsistency between the two equations. In isolation they’re fine, but taken together they’re not.
SectionA.4Equivalent Systems
Two systems of equations are logically equivalent if they express the same relationships between their variables. For example, the equations \(x=2y\) and \(\tfrac{1}{2}x=y\) express the exact same relationship between the variables \(x\) and \(y\text{.}\) This can be formalized using solution sets.
DefinitionA.4.1.Equivalent Systems.
Two equations or systems of equations are called equivalent if they have the same solution sets.
Again, \(x=2y\) and \(\tfrac{1}{2}x=y\) both have the same solution set (a line through the origin of slope \(\tfrac{1}{2}\)), and so they are equivalent.
Philosophical note: the process of “doing algebra” can be viewed as the process of manipulating equations/systems into easier to understand equivalent equations/systems. When you’re asked to algebraically solve \(x^{2}-4=0\text{.}\) You might first factor to get the equivalent equation \((x-2)(x+2)=0\text{.}\) Then, since non-zero numbers cannot multiply to give zero, we know \(x-2=0\) or \(x+2=0\text{,}\) which in turn is equivalent to \(x=\pm2\text{.}\) It’s always been about equivalent systems! 1
Technically, up to this point we’ve been talking about conjunctive systems, which means that a solution must hold for all equations of a system. The system \(x=\pm 2\) is a disjunctive system, which means a solution only needs to hold for one of the equations (\(x=2\) or \(x=-2\)), but the idea is the same.
The most general way to solve any system is by substitution. For System (A.5.1), we could solve the first equation for \(t\text{,}\) substitute the result in the remaining equations, solve the next equation for \(s\text{,}\) etc.. However, because System (A.5.1) is a linear system, there’s an alternate method: row reduction 1
Row reduction is sometimes referred to as Gaussian elimination, Gauss-Jordan elimination, or just elimination; though there are subtle differences between Gaussian and Gauss-Jordan elimination, they aren’t important, and we’ll refer to all similar methods as row reduction.
.
Study the following manipulations of System (A.5.1) and convince yourself that each operation produces a system equivalent to the one it came from.
From the final system, System (A.5.4), it’s easy to see that \(r=3\text{.}\) From there, we can substitute \(r=3\) into the second row of System (A.5.4) to find \(s=-4\) and we can substitute both \(r\) and \(s\) into the first row of System (A.5.4) to find \(t=-1\text{.}\)
By adding and subtracting rows, we “reduced” the number of variables from some equations until they were easy to solve. As an added benefit, every system along the way to System (A.5.4) was nicely organized and formatted. In fact, the systems are so well organized that we can save time by not writing the variables and keeping track of the numbers in an augmented matrix 2
A matrix is a box of numbers. An augmented matrix is a matrix with extra information associated with it.
We call the matrix an augmented matrix to stress that it contains two types of information: the coefficients of the variables \(t\text{,}\)\(s\text{,}\) and \(r\) and the constants on the right hand side of the equations. An (optional) vertical line separates the two types of numbers.
Now, the process of row reduction looks like this:
The operations are identical, but we write augmented matrices instead of equations.
TakeawayA.5.1.
Augmented matrices are a notational tool that makes the process of doing row reduction more efficient.
SectionA.6The Rules of Row Reduction
So far, we’ve been able to row reduce systems by adding a multiple of one row to another 1
Technically, we subtracted, but that’s just adding a negative.
, but to fully solve any system, we need additional operations 2
If you’re clever, you can think up alternatives to the elementary row operations that work just as well, but there’s good reason to favor the three elementary row operations. We’ll see them when discussing matrix decompositions.
.
DefinitionA.6.1.Elementary Row Operations.
The three elementary row operations, which can be performed on a matrix or system of equations, are
swapping two rows (written \(\text{row}_{i}\leftrightarrow \text{row}_{j}\)),
multiplying a row by a non-zero scalar (written \(\text{row}_{i}\mapsto k\,\text{row}_{i}\)), and
adding a multiple of one row to another (written \(\text{row}_{i}\mapsto \text{row}_{i}+ k\,\text{row}_{j}\)).
Notice that each elementary row operation can be undone. For example, if you perform \(\text{row}_{i}\mapsto k\,\text{row}_{i}\text{,}\) you can undo it with \(\text{row}_{i}\mapsto \tfrac{1}{k}\,\text{row}_{i}\text{.}\) Therefore, applying an elementary row operation to a system is guaranteed to produce an equivalent system.
The strategy for solving a system is now summarized as:
Rewrite the system as an augmented matrix.
Use elementary row operations to zero-out the lower triangle of the augmented matrix.
Convert the matrix back to a system of equations.
Read off the solution (substituting when necessary).
The third row reveals that \(c=4\text{;}\) substituting into the second row, we find \(b=-4\text{.}\) Now we can substitute \(b=-4\) and \(c=4\) into the first row and we obtain \(a=5\text{.}\)
Thus, the solution is \(\mat{a\\b\\c}=\mat{5\\-4\\4}\text{.}\) Since this is the only solution to the system, the solution set is \(\Set{\mat{5\\-4\\4}}\text{.}\)
Our equivalent system reveals \(r=-2\text{,}\) which we can substitute back into the first and second rows to find that \(t=11\) and \(s=-9\text{.}\)
As a vector, the solution is \(\mat{t\\s\\r}=\mat{11\\-9\\-2}\) and so the solution set is \(\Set{\mat{11\\-9\\-2}}\text{.}\)
In the examples so far, we’ve stopped row reducing when the equations became simple enough to solve by inspection. However, we could continue row reducing until the system is as simple as possible.
Notice that we solved this system using a combination of row reduction and substitution in a previous example. This time, let us use only row reduction.
But, the last equation says \(0x+0y=3\text{,}\) which is not true for any choice of \(x\) and \(y\text{!}\) Thus, we see applying row reduction to an inconsistent system reveals its inconsistency.
The equation \(0x + 0y + 0z = 14\) is never true and so the system is inconsistent. Since there are no values of \(x\text{,}\)\(y\text{,}\) and \(z\) that satisfy the system, the solution set is \(\Set{}\text{,}\) the empty set.
ExercisesA.7Exercises
1.
For each equation given below, determine if it is a linear equation. If not, explain what makes it nonlinear.
Not a linear equation because of the \(\cos(y)\) term.
Not a linear equation because of the \(3xy\) term.
Linear equation.
Not a linear equation because of the \(\frac{x}{y}\) term. Note that it is almost equivalent to the equation \(x=y\text{,}\) but they are not equivalent because \(x = 0,y = 0\) is a solution to the latter equation but not the former.
2.
Convert each vector equation given below to a system of linear equations.
There are vectors \(\vec{b}\) that makes the system consistent. For instance, any vector \(\vec b = \vec b=\mat{t \\ 2t}\) where \(t\in\R\) makes the system consistent. Since there are infinitely many real numbers, we conclude that there are infinitely many vectors \(\vec b\) that makes the system consistent.
If \(\vec b=\mat{5 \\ 14}\text{,}\) then the vector equation becomes
There are vectors \(\vec b\) that makes the system inconsistent. For instance, \(\mat{10 \\ 24}\) is is such a vector. In general, any vector \(\vec b\) with \(\vec b=\mat{5t \\ 12t}\) where \(t\in\R\) (\(t\ne 0\)) makes the system inconsistent. Since there are infinitely many real numbers, we conclude that there are infinitely many vectors \(\vec b\) that makes the system inconsistent.
5.
On Kokoro’s farm, there is a cage with \(35\) animals, some of which are chickens and some of which are rabbits. Kokoro counted the total number of legs in the cage and found that there were \(94\) legs in all (notably, each chicken has exactly two legs and each rabbit has four legs). Kokoro decides to use this information to figure out how many chickens and how many rabbits there are 3
This problem based on a classical Chinese problem from the ancient Chinese treatise Mathematical Classic of Master Sun (or Sunzi Suanjing) written during 3rd to 5th centuries A.D.
.
Set up a system of linear equations that you could solve to answer Kokoro’s question.
Is the system consistent? If so, answer Kokoro’s question.
Kokoro wants to set up three other cages. For each described cage below, explain using complete English sentences, whether such a configuration is possible. Justify your answers using linear algebra.
Kokoro wants to set up a cage with cats and dogs (notably, each cat has exactly four legs and each dog has four legs) so that there are \(35\) animals in total, and the total number of legs is \(94\text{.}\)
Kokoro wants to set up a cage with cats and dogs so that there are \(35\) animals in total, and the total number of legs is \(140\text{.}\)
Kokoro wants to set up a cage with chickens and rabbits so that there are \(42\) animals in total, and the total number of legs is \(77\text{.}\)
Solution.
Let \(x\) be the number of chickens, and let \(y\) be the number of rabbits. Using the information given in the problem, we have
This shows that the system is consistent. The solution to this system is \(x=23, y=12\text{.}\) Thus, there are 23 chickens and 12 rabbits in the farm.
Before discussing each configuration, we point out that a configuration is possible if there exists a natural number solution to the system of linear equations associated with the configuration.
For the first configuration, let \(x\) be the number of cats, and let \(y\) be the number of dogs. Using the information given in the problem, we have
This system is inconsistent, which means there’s no solution to this system. Therefore, Kokoro’s first configuration is not possible.
For the second configuration, let \(x\) be the number of cats, and let \(y\) be the number of dogs. Using the information given in the problem, we have
Take \(t=1\text{,}\) and we get a natural number solution \(x=34,y=1\text{.}\) (In fact, there is more than one natural number solution.) Therefore, Kokoro’s second configuration is possible.
For the third configuration, let \(x\) be the number of chickens, and let \(y\) be the number of rabbits. Using the information given in the problem, we have
This system is consistent and the unique solution is \(x = \frac{91}{2}, y = -\frac{7}{2}\) However, there cannot be 91/2 of a chicken, so Kokoro’s third configuration is not possible.
6.
For each statement below, determine whether it is true or false. Justify your answer.
A system of linear equations of 4 variables with 3 equations is always consistent.
Any system of linear equation with \(0x_{1}+0x_{2}+\cdots+0x_{n}=0\) being one of the equations must be consistent.
There are \(m,c\in \R\) so that the \(y\)-axis is the solution set to the equation \(y=mx+c\text{.}\)
There are \(m,c\in \R\) so that the \(x\)-axis is the solution set to the equation \(y=mx+c\text{.}\)
There are \(m_{1},m_{2},c\in \R\) so that the \(x\)-axis (in \(\R^{3}\)) is the solution set to the equation \(z=m_{1}x+m_{2}y+c\text{.}\)
A system of exactly one equation can have an empty solution set.
False. Assume the \(y\)-axis can be represented as the complete solution to \(y=mx+c\) for some \(m,c\text{.}\) Since \((x,y)=(0 ,0)\) and \((x,y)=(0,1)\) are both on the \(y\) axis, we know \(0=0m+c\) and \(1=0m+c\text{.}\) This gives \(0=1\text{,}\) which is false. Therefore, there’s no \(m,c\in \R\) so that the \(y\)-axis is the solution set to the equation \(y = mx + c\text{.}\)
True. Take \(m=0\text{,}\)\(c=0\text{.}\) The equation then becomes \(y=0\text{.}\) A complete solution to this equation is given by \(\mat{t \\ 0}\) (\(t\in \R\)), which is exactly the \(x\)-axis.
False. The \(x\)-axis in \(\R^{3}\) can be described as \(\Set{\mat{x \\ 0 \\ 0}\in\R^3: x\in\R}\text{.}\) Assume the \(x\)-axis can be represented as the complete solution to \(z=m_{1}x+m_{2}y+c\) for some \(m_{1},m_{2},c\text{.}\) Since \((x,y,z)=(0,0,0)\) is on the \(x\) axis, we know \(c=0\text{.}\) Since \((x,y,z)=(1,0,0)\) is on the \(x\) axis, we know that \(m_{1}=0\text{.}\) The equation then becomes \(z=m_{2}y\text{.}\) However, for each choice of \(m_{2}\text{,}\)\(x=0 , y=1, z=m_{2}\) is a solution to the system which does not lie in the \(x\)-axis. Therefore, there’s no there’s no \(m_{1}, m_{2},c\in \R\) so that the \(x\)-axis is the solution set to the equation \(z = m_{1}x + m_{2}y+ c\text{.}\)