Solving Systems of Linear Equations - HMC Calculus Tutorial

Solving Systems of Linear Equations; Row Reduction

Systems of linear equations arise in all sorts of applications in many different fields of study. The method reviewed here can be implemented to solve a linear system

a₁₁x₁

a₁₂x₂

…

a_1nx_n

b₁

a₂₁x₁

a₂₂x₂

…

a_2nx_n

b₂

^··_·

a_m1z₁

a_m2x₂

…

a_mnx_n

b_m

of any size. We write this system in matrix form as

/
|
|
|
|
|
|
\

a₁₁

a₁₂

…

a_1n

a₂₁

a₂₂

…

a_2n

^··_·

a_m1

a_m2

…

a_mn

\
|
|
|
|
|
|
/

/
|
|
|
|
|
|
\

x₁

x₂

x_n

\
|
|
|
|
|
|
/

/
|
|
|
|
|
|
\

b₁

b₂

b_m

\
|
|
|
|
|
|
/

That is, AX = B.

We can capture all the information contained in the system in the single augmented matrix (matrice estesa, indicabile con A|B)

/
|
|
|
|
|
|
\

a₁₁

a₁₂

…

a_1n

b₁

a₂₁

a₂₂

…

a_2n

b₂

^··_·

a_m1

a_m2

…

a_mn

b_m

\
|
|
|
|
|
|
/

We will solve the original system of linear equations by performing a sequence of the following elementary row operations on the augmented matrix:

Elementary Row Operations

Interchange two rows.

Multiply one row by a nonzero number.

Add a multiple of one row to a different row.

Do you see how we are manipulating the system of linear equations by applying each of these operations?

When a sequence of elementary row operations is performed on an augmented matrix, the linear system that corresponds to the resulting augmented matrix is equivalent to the original system. That is, the resulting system has the same solution set as the original system. Our strategy in solving linear systems, therefore, is to take an augmented matrix for a system and carry it by means of elementary row operations to an equivalent augmented matrix from which the solutions of the system are easily obtained. In particular, we bring the augmented matrix to Row-Echelon Form:

Row-Echelon Form (normed upper triangular matrix - matrice con righe a scalini, o matrice triangolare alta, normalizzata; se si toglie la condizione che gli elementi in testa alle righe siano 1 si toglie l'aggettivo "normalizzata")

A matrix is said to be in row-echelon form if

All rows consisting entirely of zeros are at the bottom.
In each row, the first non-zero entry form the left (i.e. the pivot) is a 1, called the leading 1.
The leading 1 in each row is to the right of all leading 1's in the rows above it.

If, in addition, each leading 1 is the only non-zero entry in its column, then the matrix is in reduced row-echelon form (rref). (se si toglie la condizione che gli elementi in testa alle righe siano 1, si hanno le matrici in forma diagonale).

It can be proven that every matrix can be brought to row-echelon form (and even to reduced row-echelon form) by the use of elementary row operations. At that point, the solutions of the system are easily obtained.

In the following example, suppose that each of the matrices was the result of carrying an augmented matrix to reduced row-echelon form by means of a sequence of row operations.

Example

The augmented matrix

A₁ =

/
|
|
|
|
\

-4

\
|
|
|
|
/

in reduced row-echelon form corresponds to the system

x₁

x₂

x₃

-4

which is already fully solved!

The augmented matrix

A₂ =

/
|
|
|
|
\

-3

-5

\
|
|
|
|
/

also in reduced row-echelon form, corresponds to the system

x₁

-3x₃

-5

x₂

+2x₃

Letting x₃ = t, we find that x₂ = -2t+4 and x₁ = 3t-5. Thus, the system has infinitely many solutions, parametrized for all t as

/
|
|
|
|
\

x₁

x₂

x₃

\
|
|
|
|
/

/
|
|
|
|
\

3t-5

-2t+4

\
|
|
|
|
/

Finally, the augmented matrix

A₃ =

/
|
|
|
|
\

\
|
|
|
|
/

again in reduced row-echelon form, corresponds to the system

x₁

x₂

which clearly has no solution. The system is inconsistent.

Notes

If a matrix A is carried to row-echelon form by means of elementary row operations, the number of leading 1's in the resulting matrix is called the rank ρ(A) of the original matrix.
Ilrango è uguale al massimo ordine dei minori diversi da zero, ed è anche uguale al numero massimo delle righe (ovvero delle colonne) lineramente indipendenti, sia che la matrice sia quadrata (n.equaz. = m = n = n.incognite) sia che non lo sia.
[un minore di A è il determinante di una quadrata ottenuta da A elminando alcune (eventualmente zero) righe e alcune (eventualmente zero) colonne].
Suppose that a system of linear equations in n unknowns has a solution. Then the set of solutions has n-r parameters, where r is the rank of the augmented matrix.
Ovviamente, ρ(A|B) ≥ ρ(A). Come si può capire riflettendo sull'esempio precedente, il sistema ammette soluzioni sse ρ(A|B) = ρ(A) (teorema di Rouché-Capelli); quindi i parametri dell'insieme delle soluzioni sono n–ρ(A), e si ha un'unica n-upla soluzione quando n = ρ(A).
Suppose that A is an n×n invertible matrix. Then the system AX = B has a unique solution given by X = A^-1B. That is, the reduced row-echelon augmented matrix will be of the form

[ I | A^-1B ].

The following algorithm takes augmented matrices to row-echelon form:
[l'idea è semplice: eliminare via via la prima incognita nella righe successive, in modo da avere alla fine un sistema in cui, passando da una riga alla seguente, diminuisca (o non aumenti) il numero delle incognite; e per eliminare tale incognita sostituire ciascuna delle righe successive con un'opportuna combinazione lineare di essa con la riga attuale]

Gaussian Elimination

If the matrix is already in row-echelon form, then stop.
Otherwise, find the first column from the left with a non-zero entry a and move the row containing that entry to the top of the rows being worked on.
Multiply that row by 1/a to create a leading 1.
Subtract multiples of that row from the rows below it to make each entry below the leading 1 zero. We are now done working on that row.
Repeat steps 1-4 on the rows still being worked on.

Notes

In practice, you have some flexibility in the application of the algorithm. For instance, in Step 2 you often have a choice of rows to move to the top.
A more computationally-intensive algorithm that takes a matrix to reduced row-echelon form is given by the Gauss-Jordan Reduction.
Una matrice quadrata del tipo seguente viene detta matrice di Jordan:
a  1  0  0 ... 0  0
0  a  1  0 ... 0  0
0  0  a  1 ... 0  0
...
0  0  0  0 ... a  1
0  0  0  0 ... 0  a

Example

We will use Gaussian Elimination to solve the linear system

x₁

2x₂

3x₃

2x₁

x₂

x₃

3x₁

x₃

The augmented matrix is

/
|
|
|
|
\

-1

\
|
|
|
|
/

The Gaussian Elimination algorithm proceeds as follows:

/
|
|
|
|
\

1
2
3
9

2
-1
1
8

3
0
-1
3
\
|
|
|
|
/

(Row 1)
(Row 2)
(Row 3)
→ /
|
|
|
|
\

1
2
3
9

0
-5
-5
-10

0
-6
-10
-24
\
|
|
|
|
/

(Row 1)
(Row 2-2·Row 1)
(Row 3-3·Row 1)

→ /
|
|
|
|
\

1
2
3
9

0
1
1
2

0
-6
-10
-24
\
|
|
|
|
/

(Row 1)
(-1/5·Row 2)
(Row 3)

→ /
|
|
|
|
\

1
2
3
9

0
1
1
2

0
0
-4
-12
\
|
|
|
|
/

(Row 1)
(Row 2)
(Row 3+6·Row 2)

→ /
|
|
|
|
\

1
2
3
9

0
1
1
2

0
0
1
3
\
|
|
|
|
/

(Row 1)
(Row 2)
(-1/4·Row 3)

We have brought the matrix to row-echelon form. The corresponding system

x₁

2x₂

3x₃

x₂

x₃

is easily solved from the bottom up:

x₃ = 3

x₂+3 = 2 → x₂ = -1

x₁ +2(-1)+3(3) = 9 → x₁ = 2.

Thus, the solution of the original system is x₁ = 2, x₂ = -1, x₃ = 3.

Nota. La matrice
/ 1 2 3 \ | 2 -1 1 | \ 3 0 -1 /
con la "eliminazione di G." assume la forma triangolare alta:
/ 1 2 3 \ | 0 1 1 | \ 0 0 1 /

• Le operazioni che si fanno sulle righe possono invertire il segno del determinante (se ci sono scambi) o moltiplicarlo per una costante (se una riga viene moltiplicata per essa); la somma a una riga di un'altra riga, anche se moltiplicata per una costante, non modifica il determinante (cambia solo se si moltiplica la riga su cui si opera).
• Una matrice triangolare alta ha come determinante (come è facile verificare) il prodotto degli elementi della diagonale.
Quindi l'eleminazione di G. può essere usata per determinare il segno del determinante o il suo stesso valore:
• il determinante di una matrice è nullo sse la matrice ottenuta da essa con l'eliminazione di G. ha un elemento nullo sulla diagonale;
• e il suo determinante è il prodotto degli elementi della diagonale della nuova matrice modificato sulla base delle operazioni fatte sulle righe.
Nel nostro esempio durante il procedimento vi sono state una moltiplicazione per -1/4 e una per -1/5, la matrice finale ha determinante 1 e, quindi, quella inziale ha determinante 1·20 = 20.
Se la matrice finale avesse avuto, ad esempio, esattamente 2 elementi della diagonale diversi da zero, avremmo potuto concludere che il suo rango è 2.
• Se ci interessa solo trovare il determinante o il rango di una matrice non è necessario trasformare in 1 gli elementi in testa alle righe (ossia "normalizzare"):

vediamolo sulla stessa matrice
/ 1 2 3 \ | 2 -1 1 | \ 3 0 -1 /
Riga2 = Riga2 - 2Riga1 →
Riga3 = Riga3 - 3Riga1
/ 1 2 3 \ | 0 -5 -5 | \ 0 -6 -10 /

Riga3 = 5Riga3 - 6Riga2 →
/ 1 2 3 \ | 0 -5 -5 | \ 0 0 -20 /
Il rango è 3. Il determinante è 1·(-5)(-20)/5 = 20.
[ho diviso per 5 in quanto operando sulla riga 3 l'ho moltiplicata per 5]

Altri esempi
• Nel caso del sistema di equazioni a fianco, che contiene coefficienti dipendenti dalla costante c, triangolarizzando troviamo che il determinante è nullo per c pari a 0, 1 o -1.
Per c diverso da questi valori ρ(A) = 3, per cui il sistema ha un'unica soluzione:
cz = 0 & (c+1)y+cz = 0 & (c-1)x+(c+1)y+cz = c
z = 0 & (c+1)y = 0 & (c-1)x+(c+1)y = c
z = 0 & y = 0 & (c-1)x = c
x = c/(c-1) & y = 0 & z = 0
La soluzione è (c/(c-1), 0, 0).
Fissato c = 0, ρ(A) = ρ(A|B) = 2; il sistema ha infinite soluzioni al variare di 3-2 = 1 parametro:
-x + y = 0 & y = 0
x = 0 & y = 0
Sono soluzioni (0, 0, z) qualunque sia z.
Fissato c = -1, ρ(A) = ρ(A|B) = 2; il sistema ha infinite soluzioni al variare di 3-2 = 1 parametro:
-2x - z = -1 & -y = 0
x = & y = 0
Sono soluzioni (0, 0, z) qualunque sia z.
Fissato c = 1, ρ(A) = 2 < 3 = ρ(A|B); il sistema non ammette soluzioni; e, infatti, avremmo:
2y + z = -1 & 2y + z = 0 ( & z = 1)
-1 = 0
(c-1)x + (c+1)y + cz = c (c+1)y + cz = 0 (c-1)x + cz = c /c-1 c+1 c | c\ /c-1 c+1 c |c\ | 0 c+1 c | 0| -> | 0 c+1 c |0| \c-1 0 c | c/ \ 0 -c-1 0 |0/ /c-1 c+1 c |c\ -> | 0 c+1 c |0| det(A) = \ 0 0 c |0/ c(c-1)(c+1) c=0 /-1 1 0 | 0\ | 0 1 0 | 0| ρ(A)=2=ρ(A|B) \ 0 0 0 | 0/ c=1 /0 2 1 | 1\ |0 2 1 | 0| ρ(A)=2 \0 0 1 | 1/ ρ(A|B)=3 c=-1 /-2 0 -1 |-1\ | 0 0 -1 | 0| ρ(A)=2=ρ(A|B) \-2 0 -1 |-1/

cx - y - z = c x + cy = 0 cx + (c+2)y = 0 (c+2)x + cz = 0 / c -1 -1 | c\ | 1 c 0 | 0| | c c+2 0 | 0| \c+2 0 c | 0/ / 1 c 0\ | c c+2 0| \c+2 0 c/ det(A|B)=-c*c*(c+2-c^2)
• Il sistema di equazioni a fianco ha 4 equazioni. Se penso c come costante e x, y e z come incognite, si tratta di un sistema lineare. Quando ha soluzioni?
La matrice estesa A|B è quadrata, 4×4. Se avesse determinante diverso da 0 avremmo ρ(A|B)=4 e non potremmo avere soluzioni (in quanto sicuramente ρ(A)<4 avendo A solo 3 colonne).
Calcoliamolo. Ci conviene procedere senza triangolarizzare: è pari a -c (elemento di posto (1,4) cambiato di segno in quanto ci arrivo con un numero dispari di spostamenti) per il minore che si ottiene togliendo riga 1 e colonna 4, che a sua volta è uguale a c (elemento di posto (3,3) della sottomatrice) per il rimanente minore 2×2: vedi a lato.
det(A|B) = c²(c²-c-2) = c²(c+1)(c-2).
Se c è diverso da 0, da -1 o da 2 sicuramente il sistema non ha soluzioni. Verifichiamo questi casi: ci saranno soluzione se ρ(A|B) = ρ(A).

Per c = 0 A ha le prime 3 righe indipendenti, e quindi rango 3; lo stesso accade per la matrice estesa. Quindi ci sono soluzioni, anzi una soluzione (essendo il rango uguale al numero delle icognite). E si tratta evidentemente della soluzione (0,0,0).
Per c = -1 le prime 2 e la 4 riga sono indipendeti, e quindi anche in questo caso si ha una sola soluzione: si ha facilmente che è (1/3,1/3,1/3).
Analogamente per c = 2 si ha una sola soluzione, (4/9,-2/9,-8/9).

• Nel caso del sistema di equazioni a fianco, che contiene coefficienti dipendenti dalla costante k, la matrice A dei coefficienti ha l'ultima colonna (i coefficienti di z) che non dipendono da k; conviene allora scambiare le colonne C1 e C3 in modo da semplificare i calcoli. Ciò corrisponde a scambiare le incognite x e z, e dovrò ricordarmelo se dovessi risolvere il sistema (e non solo stabilire quando è risolubile).
Fatto lo scambio C1 ↔ C3, faccio lo scambio R1 ↔ R2 in modo che la prima riga non inizi con 0.
Poi procedo con R3 ← R3 – R1 e, infine, con R3 ← R3 – (2–k)R2.
det(A) = k²-k-2 = (k+1)(k-2) = 0 per k = -1 e per k = 2.
Per k ≠ -1 e per k ≠ 2 il sistema ha un'unica soluzione (x = 3/(k²-k-2), y = ..., z = ...).
Per k = 2 l'ultima riga è (0 0 0 | 3), a cui corrisponderebbe 0·x = 3, e quindi il sistema no ha soluzioni (lo potrei giustificare anche con il teorema di Rouché-Capelli: ρ(A)=2, ρ(A|B)=3).
Per k = -1 l'ultima riga è (0 0 0 | 0), a cui corrisponde 0·x = 0, e quindi il sistema ha infinite soluzioni (y = 1+x, z = 1-2x, x qualunque).
kx + y = 1 2x + (k+1)y + z = 1 kx + 3y + z = 4 /k 1 0 | 1\ /0 1 k | 1\ |2 k+1 1 | 1| -> |1 k+1 2 | 1| \k 3 1 | 4/ \1 3 k | 4/ /1 k+1 2 | 1\ -> |0 1 k | 1| \1 3 k | 4/ /1 k+1 2 | 1\ -> |0 1 k | 1| \0 2-k k-2| 3/ /1 k+1 2 | 1 \ -> |0 1 k | 1 | \0 0 k^2-k-2|k+1/

• L'eliminazione di Gauss fornisce anche un algoritmo per invertire le matrici. Un esercizio come esempio:
data la matrice A dipendente dal parametro k scritta a destra,
(1) dire per quali valori di k essa è invertibile e
(2) per k = 0 determinare esplicitamente la matrice inversa A^-1.
(1) Ci conviene calcolare il determinante lungo C3 (terza colonna):
det(A) = 3·det([ [k,2]^t | [1,1]^t ]) = 3(k-2). Quindi la matrice è invertibile per k ≠ 2.
(2) Devo trovare B tale che B×A = I.
Se B1, B2 e B3 sono le colonne, incognite, di B, ossia se B = [B1|B2|B3],
devo risolvere i tre sistemi A×Bi scritti a lato, che, applicando l'algoritmo di Gauss, vengono ricondotti alle matrici:
[I | B1], [I | B2], [I | B3]. I tre sistemi hanno la stessa matrice dei coefficienti, quindi risolverli equivale a
ottenere [I | B] a partire da [A | I].
A lato è sviluppato il procedimento.
/0 1 3\ A = |k 1 0| \2 1 0/ /1\ /0\ /0\ A×B1 = |0| A×B2 = |1| A×B3 = |0| \0/ \0/ \1/ A I /0 1 3 | 1 0 0\ |0 1 0 | 0 1 0| R1 <-> R3 -> \2 0 1 | 0 0 1/ /2 0 1 | 0 0 1\ |0 1 0 | 0 1 0| R3 <- R3-R2 -> \0 1 3 | 0 0 0/ /2 0 1 | 0 0 1\ |0 1 0 | 0 1 0| R3 <- (1/3)R3 -> \0 0 3 | 0 -1 0/ /2 0 1 | 0 0 1\ |0 1 0 | 0 1 0| R1 <- R1-R3 -> \0 0 1 |1/3 -1/3 0/ /2 0 0 |-1/3 1/3 1\ |0 1 0 | 0 1 0| R1 <- (1/2)R1 -> \0 0 1 |1/3 -1/3 0/ /1 0 0 |-1/6 1/6 1/2\ |0 1 0 | 0 1 0 | R1 <- (1/2)R1 -> \0 0 1 | 1/3 -1/3 0 / I B = A^-1

Key Concepts [index]

To solve a system of linear equations, reduce the corresponding augmented matrix to row-echelon form using the Elementary Row Operations:

Interchange two rows.
Multiply one row by a nonzero number.
Add a multiple of one row to a different row.

Gaussian Elimination is one algorithm that reduces matrices to row-echelon form.