A linear equation is an equation that may be put in the form:
$$
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n = b_1
$$
where $x_1, x_2, x_3, \ldots, x_n$ are variables, the $a$'s are coefficients, and $b_1$ is a constant.
A system of linear equations (or linear system) is a collection of linear equations involving the same set of variables.
A general system of $n$ linear equations with $n$ unknowns can be written as:
$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + \cdots + a_{3n}x_n &= b_3 \\
&\;\vdots \\
a_{n1}x_1 + a_{n2}x_2 + a_{n3}x_3 + \cdots + a_{nn}x_n &= b_n
\end{aligned}
$$
It can also be written in matrix form:
$$
A \vec{X} = \vec{B}
$$
where $A$ is called the coefficient matrix, $\vec{X}$ is the variable (unknown) vector, and $\vec{B}$ is the constant vector.
$$
\begin{pmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
a_{21} & a_{22} & a_{23} & \cdots & a_{2n} \\
a_{31} & a_{32} & a_{33} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & a_{n3} & \cdots & a_{nn}
\end{pmatrix}
\begin{pmatrix}
x_1 \\ x_2 \\ x_3 \\ \vdots \\ x_n
\end{pmatrix}
=
\begin{pmatrix}
b_1 \\ b_2 \\ b_3 \\ \vdots \\ b_n
\end{pmatrix}
$$
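As a concrete illustration (a hypothetical $3 \times 3$ system chosen here, not taken from the text), the matrix form can be written down and solved directly with NumPy before turning to the iterative techniques below:

```python
import numpy as np

# A hypothetical 3x3 system, used only for illustration.
A = np.array([[4.0, 1.0, 2.0],
              [3.0, 5.0, 1.0],
              [1.0, 1.0, 3.0]])   # coefficient matrix
B = np.array([4.0, 7.0, 3.0])     # constant vector

# Direct solution of A X = B, for later comparison with the iterative methods.
X = np.linalg.solve(A, B)
print(X)
```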
Iterative Techniques
An initial guess is made for the solution, which is then iteratively improved.
Jacobi Method
A slow method for solving systems of linear equations iteratively. An initial guess is made for the solution vector $\vec{X}^{(0)}$. Subsequent approximations are then generated using the equation below, obtained by making $x_i$ the subject of the $i$-th equation:
$$
x_i = \left(\sum_{j=1,\, j \neq i}^{n} -\frac{a_{ij}x_j}{a_{ii}} \right) + \frac{b_i}{a_{ii}}, \quad i = 1, 2, \ldots, n
$$
For each $k \geq 1$, generate the components $x_i^{(k)}$ of $\vec{X}^{(k)}$ from the components of the previous iteration $\vec{X}^{(k-1)}$, for each $i = 1, 2, \ldots, n$, using:
$$
x_i^{(k)} = \frac{1}{a_{ii}} \left[ -\sum_{j=1,\, j \neq i}^{n} a_{ij}x_j^{(k-1)} + b_i \right]
$$
The method cannot be used for a singular matrix, and the update requires every diagonal entry $a_{ii} \neq 0$.
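A minimal sketch of the Jacobi method in Python/NumPy; the function name, tolerance, iteration cap, and stopping test are illustrative choices, not from the text:

```python
import numpy as np

def jacobi(A, b, x0, tol=1e-8, max_iter=100):
    """Jacobi iteration: every x_i^(k) is built from the previous iterate x^(k-1) only."""
    A = np.asarray(A, dtype=float)
    x_prev = np.array(x0, dtype=float)
    n = len(b)
    for _ in range(max_iter):
        x = np.empty(n)
        for i in range(n):
            # Sum a_ij * x_j over all j != i, using only the previous iterate.
            s = sum(A[i, j] * x_prev[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i, i]
        # Stop when the relative change (infinity norm) is small enough.
        if np.linalg.norm(x - x_prev, np.inf) / np.linalg.norm(x, np.inf) < tol:
            return x
        x_prev = x
    return x_prev
```

For instance, `jacobi(A, B, np.zeros(3))` with the illustrative `A` and `B` from the earlier snippet converges, since that matrix is strictly diagonally dominant (see below).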
Diagonally dominant matrix
$A_{n\times n}$ is diagonally dominant when $\forall i \in \{1, 2, \ldots, n\}$:
$$
\lvert a_{ii} \rvert \ge \sum_{j=1,\, j\neq i}^{n} \lvert a_{ij} \rvert
$$
$A_{n\times n}$ is strictly diagonally dominant when $\forall i \in \{1, 2, \ldots, n\}$:
$$
\lvert a_{ii} \rvert > \sum_{j=1,\, j\neq i}^{n} \lvert a_{ij} \rvert
$$
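A small helper for checking (strict) diagonal dominance row by row; the function name and the `strict` flag are illustrative:

```python
import numpy as np

def is_diagonally_dominant(A, strict=False):
    """Check |a_ii| >= (or >, when strict=True) the sum of |a_ij|, j != i, for every row."""
    A = np.asarray(A, dtype=float)
    diag = np.abs(np.diag(A))
    off = np.sum(np.abs(A), axis=1) - diag   # row sums of |a_ij| excluding the diagonal
    return bool(np.all(diag > off)) if strict else bool(np.all(diag >= off))
```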
Relative error is given by:
$$
\text{Relative error} =
\frac
{\lVert \vec{X}^{(k)} - \vec{X}^{(k - 1)} \rVert}
{\lVert \vec{X}^{(k)} \rVert}
$$
Any convenient norm can be used here; usually the infinity norm is used. For a vector it is the largest absolute component, and for an $n \times n$ matrix it is the maximum absolute row sum:
$$
l_\infty =
\max
\bigg\{
\sum_{j=1}^{n}
{\lvert a_{ij} \rvert}
\;;\;
i \in [1,n]
\bigg\}
$$
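A sketch of this stopping test using the infinity norm (the helper name `relative_error` is illustrative):

```python
import numpy as np

def relative_error(x_new, x_old):
    """Relative change between successive iterates, measured in the infinity norm."""
    # For a vector, np.linalg.norm(v, np.inf) is the largest absolute component;
    # for a matrix, it is the maximum absolute row sum.
    return np.linalg.norm(x_new - x_old, np.inf) / np.linalg.norm(x_new, np.inf)
```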
Gauss-Seidel Method
An improvement over the Jacobi method. The major difference from the Jacobi method is that instead of using only values from the previous iteration, the newest available approximations for $x_i$ are used.
When computing $x_i^{(k)}$, the components $x_1^{(k)}, \ldots, x_{i-1}^{(k)}$ have already been computed. They are used to compute $x_i^{(k)}$ as follows, for each $i = 1, 2, \ldots, n$:
$$
x_i^{(k)} = \frac{1}{a_{ii}} \left[ b_i - \sum_{j=1}^{i-1} a_{ij}x_j^{(k)} - \sum_{j=i+1}^{n} a_{ij}x_j^{(k-1)} \right]
$$
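A minimal Gauss-Seidel sketch in the same style as the Jacobi function above (names and defaults are again illustrative):

```python
import numpy as np

def gauss_seidel(A, b, x0, tol=1e-8, max_iter=100):
    """Gauss-Seidel iteration: x_i^(k) uses the newest values x_1^(k) .. x_{i-1}^(k)."""
    A = np.asarray(A, dtype=float)
    x = np.array(x0, dtype=float)
    n = len(b)
    for _ in range(max_iter):
        x_old = x.copy()
        for i in range(n):
            # Already-updated components (j < i) plus not-yet-updated ones (j > i).
            s1 = sum(A[i, j] * x[j] for j in range(i))
            s2 = sum(A[i, j] * x_old[j] for j in range(i + 1, n))
            x[i] = (b[i] - s1 - s2) / A[i, i]
        if np.linalg.norm(x - x_old, np.inf) / np.linalg.norm(x, np.inf) < tol:
            return x
    return x
```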
Note
The equation $A\vec{X} = \vec{B}$ can be converted to the form $\vec{X} = T\vec{X} + \vec{C}$. For the Gauss-Seidel method this gives the iteration:
$$
\vec{X}^{(k)} = (D - L)^{-1} U\vec{X}^{(k-1)} + (D - L)^{-1}\vec{B},
\quad k = 1, 2, 3, \ldots
$$
For $D - L$ to be non-singular, $a_{ii} \neq 0$ is required $\forall i \in \{1,2,\dots,n\}$.
Here:
$D$ is the diagonal matrix whose entries are the diagonal entries of $A$.
$-L$ is the strictly lower-triangular part of $A$.
$-U$ is the strictly upper-triangular part of $A$.
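A minimal sketch of this splitting in NumPy (the helper name `split_DLU` is an illustrative choice, not from the text):

```python
import numpy as np

def split_DLU(A):
    """Split A into D, L, U with A = D - L - U (L, U strictly triangular, as defined above)."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.diag(A))    # diagonal part of A
    L = -np.tril(A, k=-1)      # -L is the strictly lower-triangular part of A
    U = -np.triu(A, k=1)       # -U is the strictly upper-triangular part of A
    return D, L, U
```

With this splitting, the Gauss-Seidel iteration above uses $T = (D - L)^{-1}U$ and $\vec{C} = (D - L)^{-1}\vec{B}$.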
Convergence of iterative methods
Theorem 1
Suppose $\vec{X}^{(0)} \in \mathbb{R}^n$ and $\vec{X}^{(k)} = T \vec{X}^{(k-1)} + \vec{C}$ for each $k \geq 1$.
The sequence $\{\vec{X}^{(k)}\}_{k=0}^\infty$ converges to the unique solution of $\vec{X} = T\vec{X} + \vec{C}$ iff $\rho(T) < 1$. Here $\rho$ denotes the spectral radius.
If $\lVert T \rVert < 1$ for any natural matrix norm and $\vec{C}$ is a given vector, then for any $\vec{X}^{(0)} \in \mathbb{R}^n$, the sequence converges to a vector $\vec{X} \in \mathbb{R}^n$ with $\vec{X} = T\vec{X} + \vec{C}$, and the following error bounds hold:
$$
\lVert\vec{X} - \vec{X}^{(k)}\rVert \leq \lVert T\rVert^k \cdot \lVert\vec{X} - \vec{X}^{(0)}\rVert
$$
$$
\lVert\vec{X} - \vec{X}^{(k)}\rVert \leq \frac{\lVert T\rVert^k}{1 - \lVert T\rVert} \lVert\vec{X}^{(1)} - \vec{X}^{(0)}\rVert
$$
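The condition of Theorem 1 can be checked numerically. The sketch below uses the Jacobi iteration matrix $T_j = D^{-1}(L + U)$; that formula for $T_j$ is a standard result assumed here, not stated above:

```python
import numpy as np

def jacobi_spectral_radius(A):
    """Spectral radius of the Jacobi iteration matrix T = D^{-1}(L + U), where A = D - L - U."""
    A = np.asarray(A, dtype=float)
    D = np.diag(np.diag(A))
    T = np.linalg.inv(D) @ (D - A)   # L + U = D - A, so T = D^{-1}(L + U)
    return float(max(abs(np.linalg.eigvals(T))))
```

If the returned value is less than 1, Theorem 1 guarantees the Jacobi iteration converges for any starting vector.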
Theorem 2
If $A$ is strictly diagonally dominant, then for any choice of $\vec{X}^{(0)}$, both the Jacobi and Gauss-Seidel methods give sequences $\{\vec{X}^{(k)}\}_{k=0}^\infty$ that converge to the unique solution of $A\vec{X} = \vec{B}$.