Open main menu
Home
Random
Recent changes
Special pages
Community portal
Preferences
About Wikipedia
Disclaimers
Incubator escapee wiki
Search
User menu
Talk
Dark mode
Contributions
Create account
Log in
Editing
System of linear equations
(section)
Warning:
You are not logged in. Your IP address will be publicly visible if you make any edits. If you
log in
or
create an account
, your edits will be attributed to your username, along with other benefits.
Anti-spam check. Do
not
fill this in!
==Solving a linear system== There are several [[algorithm]]s for [[equation solving|solving]] a system of linear equations. ===Describing the solution=== When the solution set is finite, it is reduced to a single element. In this case, the unique solution is described by a sequence of equations whose left-hand sides are the names of the unknowns and right-hand sides are the corresponding values, for example <math>(x=3, \;y=-2,\; z=6)</math>. When an order on the unknowns has been fixed, for example the [[alphabetical order]] the solution may be described as a [[vector space|vector]] of values, like <math>(3, \,-2,\, 6)</math> for the previous example. To describe a set with an infinite number of solutions, typically some of the variables are designated as '''free''' (or '''independent''', or as '''parameters'''), meaning that they are allowed to take any value, while the remaining variables are '''dependent''' on the values of the free variables. For example, consider the following system: :<math>\begin{alignat}{7} x &&\; + \;&& 3y &&\; - \;&& 2z &&\; = \;&& 5 & \\ 3x &&\; + \;&& 5y &&\; + \;&& 6z &&\; = \;&& 7 & \end{alignat}</math> The solution set to this system can be described by the following equations: :<math>x=-7z-1\;\;\;\;\text{and}\;\;\;\;y=3z+2\text{.}</math> Here ''z'' is the free variable, while ''x'' and ''y'' are dependent on ''z''. Any point in the solution set can be obtained by first choosing a value for ''z'', and then computing the corresponding values for ''x'' and ''y''. Each free variable gives the solution space one [[degree of freedom]], the number of which is equal to the [[dimension]] of the solution set. For example, the solution set for the above equation is a line, since a point in the solution set can be chosen by specifying the value of the parameter ''z''. An infinite solution of higher order may describe a plane, or higher-dimensional set. Different choices for the free variables may lead to different descriptions of the same solution set. For example, the solution to the above equations can alternatively be described as follows: :<math>y=-\frac{3}{7}x + \frac{11}{7}\;\;\;\;\text{and}\;\;\;\;z=-\frac{1}{7}x-\frac{1}{7}\text{.}</math> Here ''x'' is the free variable, and ''y'' and ''z'' are dependent. ===Elimination of variables=== The simplest method for solving a system of linear equations is to repeatedly eliminate variables. This method can be described as follows: # In the first equation, solve for one of the variables in terms of the others. # Substitute this expression into the remaining equations. This yields a system of equations with one fewer equation and unknown. # Repeat steps 1 and 2 until the system is reduced to a single linear equation. # Solve this equation, and then back-substitute until the entire solution is found. For example, consider the following system: : <math>\begin{cases} x+3y-2z=5\\ 3x+5y+6z=7\\ 2x+4y+3z=8 \end{cases}</math> Solving the first equation for ''x'' gives <math>x=5+2z-3y</math>, and plugging this into the second and third equation yields : <math>\begin{cases} y=3z+2\\ y=\tfrac{7}{2}z+1 \end{cases}</math> Since the LHS of both of these equations equal ''y'', equating the RHS of the equations. We now have: : <math>\begin{align} 3z+2=\tfrac{7}{2}z+1\\ \Rightarrow z=2 \end{align}</math> Substituting ''z'' = 2 into the second or third equation gives ''y'' = 8, and the values of ''y'' and ''z'' into the first equation yields ''x'' = −15. Therefore, the solution set is the ordered triple <math>(x,y,z)=(-15,8,2) </math>. ===Row reduction=== {{Main|Gaussian elimination}} In '''row reduction''' (also known as '''Gaussian elimination'''), the linear system is represented as an [[augmented matrix]]{{sfnp|Beauregard|Fraleigh|1973|p=68}} :<math>\left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 3 & 5 & 6 & 7 \\ 2 & 4 & 3 & 8 \end{array}\right]\text{.} </math> This matrix is then modified using [[elementary row operations]] until it reaches [[reduced row echelon form]]. There are three types of elementary row operations:{{sfnp|Beauregard|Fraleigh|1973|p=68}} :'''Type 1''': Swap the positions of two rows. :'''Type 2''': Multiply a row by a nonzero [[scalar (mathematics)|scalar]]. :'''Type 3''': Add to one row a scalar multiple of another. Because these operations are reversible, the augmented matrix produced always represents a linear system that is equivalent to the original. There are several specific algorithms to row-reduce an augmented matrix, the simplest of which are [[Gaussian elimination]] and [[Gauss–Jordan elimination]]. The following computation shows Gauss–Jordan elimination applied to the matrix above: :<math>\begin{align}\left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 3 & 5 & 6 & 7 \\ 2 & 4 & 3 & 8 \end{array}\right]&\sim \left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & -4 & 12 & -8 \\ 2 & 4 & 3 & 8 \end{array}\right]\sim \left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & -4 & 12 & -8 \\ 0 & -2 & 7 & -2 \end{array}\right]\sim \left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 1 & -3 & 2 \\ 0 & -2 & 7 & -2 \end{array}\right] \\ &\sim \left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 1 & -3 & 2 \\ 0 & 0 & 1 & 2 \end{array}\right]\sim \left[\begin{array}{rrr|r} 1 & 3 & -2 & 5 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 2 \end{array}\right]\sim \left[\begin{array}{rrr|r} 1 & 3 & 0 & 9 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 2 \end{array}\right]\sim \left[\begin{array}{rrr|r} 1 & 0 & 0 & -15 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & 2 \end{array}\right].\end{align}</math> The last matrix is in reduced row echelon form, and represents the system {{nowrap|''x'' {{=}} −15}}, {{nowrap|''y'' {{=}} 8}}, {{nowrap|''z'' {{=}} 2}}. A comparison with the example in the previous section on the algebraic elimination of variables shows that these two methods are in fact the same; the difference lies in how the computations are written down. ===Cramer's rule=== {{Main|Cramer's rule}} '''Cramer's rule''' is an explicit formula for the solution of a system of linear equations, with each variable given by a quotient of two [[determinant]]s.{{sfnp|Sterling|2009|p=[https://books.google.com/books?id=PsNJ1alC-bsC&pg=PA235 235]}} For example, the solution to the system :<math>\begin{alignat}{7} x &\; + &\; 3y &\; - &\; 2z &\; = &\; 5 \\ 3x &\; + &\; 5y &\; + &\; 6z &\; = &\; 7 \\ 2x &\; + &\; 4y &\; + &\; 3z &\; = &\; 8 \end{alignat}</math> is given by :<math> x=\frac {\, \begin{vmatrix}5&3&-2\\7&5&6\\8&4&3\end{vmatrix} \,} {\, \begin{vmatrix}1&3&-2\\3&5&6\\2&4&3\end{vmatrix} \,} ,\;\;\;\; y=\frac {\, \begin{vmatrix}1&5&-2\\3&7&6\\2&8&3\end{vmatrix} \,} {\, \begin{vmatrix}1&3&-2\\3&5&6\\2&4&3\end{vmatrix} \,} ,\;\;\;\; z=\frac {\, \begin{vmatrix}1&3&5\\3&5&7\\2&4&8\end{vmatrix} \,} {\, \begin{vmatrix}1&3&-2\\3&5&6\\2&4&3\end{vmatrix} \,}. </math> For each variable, the denominator is the determinant of the [[matrix of coefficients]], while the numerator is the determinant of a matrix in which one column has been replaced by the vector of constant terms. Though Cramer's rule is important theoretically, it has little practical value for large matrices, since the computation of large determinants is somewhat cumbersome. (Indeed, large determinants are most easily computed using row reduction.) Further, Cramer's rule has very poor numerical properties, making it unsuitable for solving even small systems reliably, unless the operations are performed in rational arithmetic with unbounded precision.{{Citation needed|date=March 2017}} ===Matrix solution=== If the equation system is expressed in the matrix form <math>A\mathbf{x}=\mathbf{b}</math>, the entire solution set can also be expressed in matrix form. If the matrix ''A'' is square (has ''m'' rows and ''n''=''m'' columns) and has full rank (all ''m'' rows are independent), then the system has a unique solution given by : <math>\mathbf{x}=A^{-1}\mathbf{b}</math> where <math>A^{-1}</math> is the [[matrix inverse|inverse]] of ''A''. More generally, regardless of whether ''m''=''n'' or not and regardless of the rank of ''A'', all solutions (if any exist) are given using the [[Moore–Penrose inverse]] of ''A'', denoted <math>A^+</math>, as follows: : <math>\mathbf{x}=A^+ \mathbf{b} + \left(I - A^+ A\right)\mathbf{w}</math> where <math>\mathbf{w}</math> is a vector of free parameters that ranges over all possible ''n''×1 vectors. A necessary and sufficient condition for any solution(s) to exist is that the potential solution obtained using <math>\mathbf{w}=\mathbf{0}</math> satisfy <math>A\mathbf{x}=\mathbf{b}</math> — that is, that <math>AA^+ \mathbf{b}=\mathbf{b}.</math> If this condition does not hold, the equation system is inconsistent and has no solution. If the condition holds, the system is consistent and at least one solution exists. For example, in the above-mentioned case in which ''A'' is square and of full rank, <math>A^+</math> simply equals <math>A^{-1}</math> and the general solution equation simplifies to : <math>\mathbf{x}=A^{-1}\mathbf{b} + \left(I - A^{-1}A\right)\mathbf{w} = A^{-1}\mathbf{b} + \left(I-I\right)\mathbf{w} = A^{-1}\mathbf{b}</math> as previously stated, where <math>\mathbf{w}</math> has completely dropped out of the solution, leaving only a single solution. In other cases, though, <math>\mathbf{w}</math> remains and hence an infinitude of potential values of the free parameter vector <math>\mathbf{w}</math> give an infinitude of solutions of the equation. ===Other methods=== {{Further|Numerical solution of linear systems}} While systems of three or four equations can be readily solved by hand (see [[Cracovian]]), computers are often used for larger systems. The standard algorithm for solving a system of linear equations is based on Gaussian elimination with some modifications. Firstly, it is essential to avoid division by small numbers, which may lead to inaccurate results. This can be done by reordering the equations if necessary, a process known as [[Pivot element|''pivoting'']]. Secondly, the algorithm does not exactly do Gaussian elimination, but it computes the [[LU decomposition]] of the matrix ''A''. This is mostly an organizational tool, but it is much quicker if one has to solve several systems with the same matrix ''A'' but different vectors '''b'''. If the matrix ''A'' has some special structure, this can be exploited to obtain faster or more accurate algorithms. For instance, systems with a [[symmetric matrix|symmetric]] [[positive-definite matrix|positive definite]] matrix can be solved twice as fast with the [[Cholesky decomposition]]. [[Levinson recursion]] is a fast method for [[Toeplitz matrix|Toeplitz matrices]]. Special methods exist also for matrices with many zero elements (so-called [[sparse matrix|sparse matrices]]), which appear often in applications. A completely different approach is often taken for very large systems, which would otherwise take too much time or memory. The idea is to start with an initial approximation to the solution (which does not have to be accurate at all), and to change this approximation in several steps to bring it closer to the true solution. Once the approximation is sufficiently accurate, this is taken to be the solution to the system. This leads to the class of [[iterative method]]s. For some sparse matrices, the introduction of randomness improves the speed of the iterative methods.<ref>{{cite news |last1=Hartnett |first1=Kevin |title=New Algorithm Breaks Speed Limit for Solving Linear Equations |url=https://www.quantamagazine.org/new-algorithm-breaks-speed-limit-for-solving-linear-equations-20210308/ |access-date=March 9, 2021 |work=[[Quanta Magazine]] |date=March 8, 2021}}</ref> One example of an iterative method is the [[Jacobi method]], where the matrix <math>A</math> is split into its diagonal component <math>D</math> and its non-diagonal component <math>L+U</math>. An initial guess <math>{\bold x}^{(0)}</math> is used at the start of the algorithm. Each subsequent guess is computed using the iterative equation: : <math>{\bold x}^{(k+1)} = D^{-1}({\bold b} - (L+U){\bold x}^{(k)})</math> When the difference between guesses <math>{\bold x}^{(k)}</math> and <math>{\bold x}^{(k+1)}</math> is sufficiently small, the algorithm is said to have ''converged'' on the solution.<ref>{{cite web |url=https://mathworld.wolfram.com/JacobiMethod.html |title=Jacobi Method }}</ref> There is also a [[quantum algorithm for linear systems of equations]].{{sfnp|Harrow|Hassidim|Lloyd|2009}}
Edit summary
(Briefly describe your changes)
By publishing changes, you agree to the
Terms of Use
, and you irrevocably agree to release your contribution under the
CC BY-SA 4.0 License
and the
GFDL
. You agree that a hyperlink or URL is sufficient attribution under the Creative Commons license.
Cancel
Editing help
(opens in new window)