Introduction to Matrices
Definition
A matrix \( \mathbf{A} \) is a rectangular array of elements, either real or complex numbers, arranged in horizontal rows and vertical columns. Formally, an \( m \times n \) matrix is represented as:
\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
where \( a_{ij} \) denotes the element located in the \( i \)-th row and \( j \)-th column. Here, \( m \) is the number of rows (horizontal), and \( n \) is the number of columns (vertical), determining the matrix's dimension as \( m \times n \).
Key Points:
- Rows and Columns: The rows of a matrix are the horizontal lines of elements, while the columns are the vertical lines. For instance, in a \( 2 \times 3 \) matrix:
\[ \mathbf{B} = \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ b_{21} & b_{22} & b_{23} \end{bmatrix} \]\( \mathbf{B} \) has 2 rows and 3 columns.
- Dimensions: The dimension \( m \times n \) of a matrix specifies its structure. For example:
  - A \( 1 \times 3 \) matrix:
    \[ \mathbf{C} = \begin{bmatrix} c_{11} & c_{12} & c_{13} \end{bmatrix} \]
    is a single row with three elements.
  - A \( 3 \times 1 \) matrix:
    \[ \mathbf{D} = \begin{bmatrix} d_{11} \\ d_{21} \\ d_{31} \end{bmatrix} \]
    is a single column with three elements.
  - A \( 2 \times 2 \) matrix:
    \[ \mathbf{E} = \begin{bmatrix} e_{11} & e_{12} \\ e_{21} & e_{22} \end{bmatrix} \]
    has 2 rows and 2 columns.
- No Vacant Slots: Every position within a matrix must be filled with an element \( a_{ij} \); a matrix with a vacant slot is simply not defined. This requirement distinguishes matrices from sets, as a matrix cannot be empty.
- Minimum Size: The smallest possible matrix is a \( 1 \times 1 \) matrix, such as:
\[ \mathbf{F} = \begin{bmatrix} f_{11} \end{bmatrix} \]
which contains a single element.
In summary, a matrix is a well-defined array with every slot filled by an element, and it must have at least one row and one column. The arrangement of elements into rows and columns gives the matrix its dimension, which is a fundamental characteristic of its structure.
Consider a \( 3 \times 4 \) matrix \( \mathbf{G} \), where the matrix has 3 rows and 4 columns. Each element in the matrix is distinct and denoted by specific numerical values. The matrix is represented as:
\[ \mathbf{G} = \begin{bmatrix} 7 & 13 & 5 & 9 \\ 2 & 8 & 14 & 11 \\ 4 & 1 & 12 & 6 \end{bmatrix} \]
In this matrix:
- The first row consists of the elements \( 7, 13, 5, \) and \( 9 \).
- The second row contains the elements \( 2, 8, 14, \) and \( 11 \).
- The third row includes the elements \( 4, 1, 12, \) and \( 6 \).
Here, the element \( 8 \) is located in the second row and second column, denoted as \( g_{22} = 8 \). Similarly, \( g_{23} = 14 \) refers to the element in the second row and third column.
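This row–column indexing carries over directly to array libraries, with the caveat that most of them index from 0 rather than 1. A minimal NumPy sketch using the matrix \( \mathbf{G} \) from above (NumPy is an assumption of this illustration, not part of the text):

```python
import numpy as np

# The 3x4 matrix G from the example above.
G = np.array([[7, 13, 5, 9],
              [2, 8, 14, 11],
              [4, 1, 12, 6]])

print(G.shape)   # (3, 4): 3 rows, 4 columns
# Mathematical g_22 (row 2, column 2) is G[1, 1] with 0-based indexing.
print(G[1, 1])   # 8
# g_23 (row 2, column 3) is G[1, 2].
print(G[1, 2])   # 14
```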
A matrix \( \mathbf{A} \) of dimension \( m \times n \) can be compactly represented using the notation:
\[ \mathbf{A} = [a_{ij}]_{m \times n} \]
Here, \( a_{ij} \) denotes the element in the \( i \)-th row and \( j \)-th column of the matrix, where the indices \( i \) and \( j \) satisfy:
\[ 1 \leq i \leq m, \quad 1 \leq j \leq n \]
This notation indicates that \( \mathbf{A} \) is an \( m \times n \) matrix, with \( m \) rows and \( n \) columns. Each element \( a_{ij} \) is uniquely identified by the pair of indices \( i \) and \( j \), where \( i \) is the row index and \( j \) is the column index.
Constructing a Matrix Using a Given Rule
Construct the matrix \( \mathbf{C} = \left[ c_{ij} \right]_{4 \times 3} \) using the rule \( c_{ij} = i^2 + j^2 \), where \( i \) is the row index and \( j \) is the column index.
Solution:
The matrix \( \mathbf{C} \) is a \( 4 \times 3 \) matrix, meaning it has 4 rows and 3 columns. To determine each element \( c_{ij} \), we apply the given rule \( c_{ij} = i^2 + j^2 \).
Step 1: Calculate the elements of the first row (\( i = 1 \)):
\[ c_{11} = 1^2 + 1^2 = 2, \quad c_{12} = 1^2 + 2^2 = 5, \quad c_{13} = 1^2 + 3^2 = 10 \]
Step 2: Calculate the elements of the second row (\( i = 2 \)):
\[ c_{21} = 2^2 + 1^2 = 5, \quad c_{22} = 2^2 + 2^2 = 8, \quad c_{23} = 2^2 + 3^2 = 13 \]
Step 3: Calculate the elements of the third row (\( i = 3 \)):
\[ c_{31} = 3^2 + 1^2 = 10, \quad c_{32} = 3^2 + 2^2 = 13, \quad c_{33} = 3^2 + 3^2 = 18 \]
Step 4: Calculate the elements of the fourth row (\( i = 4 \)):
\[ c_{41} = 4^2 + 1^2 = 17, \quad c_{42} = 4^2 + 2^2 = 20, \quad c_{43} = 4^2 + 3^2 = 25 \]
Step 5: Write the matrix \( \mathbf{C} \) with the calculated elements. Thus, the matrix \( \mathbf{C} \) constructed using the rule \( c_{ij} = i^2 + j^2 \) is:
\[ \mathbf{C} = \begin{bmatrix} 2 & 5 & 10 \\ 5 & 8 & 13 \\ 10 & 13 & 18 \\ 17 & 20 & 25 \end{bmatrix} \]
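As a cross-check, the same construction is one line in code. A small NumPy sketch of the 1-based rule \( c_{ij} = i^2 + j^2 \) (the `+ 1` offsets convert NumPy's 0-based indices to the 1-based \( i, j \) of the text):

```python
import numpy as np

# Build C = [c_ij] with c_ij = i^2 + j^2 for 1-based i, j.
C = np.fromfunction(lambda i, j: (i + 1) ** 2 + (j + 1) ** 2,
                    (4, 3), dtype=int)
print(C)
# [[ 2  5 10]
#  [ 5  8 13]
#  [10 13 18]
#  [17 20 25]]
```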
Different Types of Matrices
1. Null Matrix
A Null Matrix (or Zero Matrix) is a matrix in which all elements are zero. For any matrix \( \mathbf{O} = [o_{ij}]_{m \times n} \):
\[ o_{ij} = 0 \quad \text{for all } i, j \]
Example:
\[ \mathbf{O} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}_{2 \times 3} \]
2. Row Matrix
A Row Matrix is a matrix that consists of a single row. For a row matrix \( \mathbf{R} = [r_{1j}]_{1 \times n} \):
\[ \mathbf{R} = \begin{bmatrix} r_{11} & r_{12} & \cdots & r_{1n} \end{bmatrix} \]
Example:
\[ \mathbf{R} = \begin{bmatrix} 3 & 5 & 7 \end{bmatrix}_{1 \times 3} \]
3. Column Matrix
A Column Matrix is a matrix that consists of a single column. For a column matrix \( \mathbf{C} = [c_{i1}]_{m \times 1} \):
\[ \mathbf{C} = \begin{bmatrix} c_{11} \\ c_{21} \\ \vdots \\ c_{m1} \end{bmatrix} \]
Example:
\[ \mathbf{C} = \begin{bmatrix} 4 \\ 6 \\ 8 \end{bmatrix}_{3 \times 1} \]
4. Rectangular Matrix
A Rectangular Matrix is a matrix in which the number of rows is not equal to the number of columns, i.e., \( m \neq n \), for \( \mathbf{A} = [a_{ij}]_{m \times n} \).
Example:
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}_{2 \times 3} \]
5. Square Matrix
A Square Matrix is a matrix in which the number of rows equals the number of columns, i.e., \( m = n \). We write a square matrix as \( \mathbf{B} = [b_{ij}]_{n \times n} \).
Main Diagonal: The main diagonal of a square matrix consists of the elements \( b_{ij} \) with \( i = j \). In a square matrix:
- Elements above the main diagonal satisfy \( i < j \).
- Elements below the main diagonal satisfy \( i > j \).
Example:
\[ \mathbf{B} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix} \]
In this example, the main diagonal consists of the elements \( 1, 5, 9 \); the elements above the diagonal are \( 2, 3, 6 \); and the elements below the diagonal are \( 4, 7, 8 \).
Further Categorization of Square Matrices
1. Identity Matrix
An Identity Matrix is a square matrix in which all the elements of the main diagonal are 1, and all other elements are 0. For an identity matrix \( \mathbf{I} = [i_{ij}]_{n \times n} \):
\[ i_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
Example:
\[ \mathbf{I}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
2. Scalar Matrix
A Scalar Matrix is a square matrix in which all the elements of the main diagonal are equal to some constant \( k \), and all other elements are 0. For a scalar matrix \( \mathbf{S} = [s_{ij}]_{n \times n} \):
\[ s_{ij} = \begin{cases} k & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
Example (with \( k = 5 \)):
\[ \mathbf{S} = \begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix} \]
3. Diagonal Matrix
A Diagonal Matrix is a square matrix in which all the elements outside the main diagonal are 0. For a diagonal matrix \( \mathbf{D} = [d_{ij}]_{n \times n} \):
\[ d_{ij} = 0 \quad \text{for all } i \neq j \]
In diag notation, a diagonal matrix can be represented as:
\[ \mathbf{D} = \text{diag}(d_{11}, d_{22}, \ldots, d_{nn}) \]
Example:
\[ \mathbf{D} = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 7 & 0 \\ 0 & 0 & 4 \end{bmatrix} = \text{diag}(2, 7, 4) \]
4. Lower Triangular Matrix
A Lower Triangular Matrix is a square matrix in which all the elements above the main diagonal are 0. For a lower triangular matrix \( \mathbf{L} = [l_{ij}]_{n \times n} \):
\[ l_{ij} = 0 \quad \text{for } i < j \]
Example:
\[ \mathbf{L} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 3 & 0 \\ 4 & 5 & 6 \end{bmatrix} \]
5. Upper Triangular Matrix
An Upper Triangular Matrix is a square matrix in which all the elements below the main diagonal are 0. For an upper triangular matrix \( \mathbf{U} = [u_{ij}]_{n \times n} \):
\[ u_{ij} = 0 \quad \text{for } i > j \]
Example:
\[ \mathbf{U} = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 4 & 5 \\ 0 & 0 & 6 \end{bmatrix} \]
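All of these special forms have ready-made constructors or extractors in NumPy; the sketch below mirrors the examples above (the specific numeric values are arbitrary choices for illustration):

```python
import numpy as np

O = np.zeros((2, 3))        # null (zero) matrix
I3 = np.eye(3)              # identity matrix I_3
S = 5 * np.eye(3)           # scalar matrix with k = 5
D = np.diag([2, 7, 4])      # diagonal matrix diag(2, 7, 4)

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
L = np.tril(A)              # lower triangular part of A (zeros above the diagonal)
U = np.triu(A)              # upper triangular part of A (zeros below the diagonal)
print(L)
print(U)
```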
When comparing matrices, the concept of "greater than" or "less than" does not apply as it does with individual numbers. Instead, matrices are primarily compared based on equality.
Operations on Matrices
Matrices are fundamental mathematical objects, much like real numbers. When we define any mathematical object, we also define the operations that can be performed on them. For real numbers, we have operations like addition, subtraction, multiplication, and division. Similarly, matrices have their own set of operations, such as addition, multiplication, and scalar multiplication.
Matrix Equality
Two matrices \( \mathbf{A} = [a_{ij}]_{m \times n} \) and \( \mathbf{B} = [b_{ij}]_{m \times n} \) are said to be equal if and only if:
- Same Dimensions: The matrices must have the same dimensions, i.e., the same number of rows and the same number of columns. Formally, if \( \mathbf{A} \) is \( m \times n \), then \( \mathbf{B} \) must also be \( m \times n \).
- Element-wise Equality: Each corresponding element of the two matrices must be equal. That is:
\[ a_{ij} = b_{ij} \quad \text{for all } i, j \]
Example of Matrix Equality
Consider two matrices \( \mathbf{A} \) and \( \mathbf{B} \):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 6 & 8 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 6 & 8 \end{bmatrix} \]
Here, both matrices \( \mathbf{A} \) and \( \mathbf{B} \) are \( 2 \times 3 \) matrices, and corresponding elements are equal:
\[ a_{11} = b_{11} = 1, \quad a_{12} = b_{12} = 2, \quad \ldots, \quad a_{23} = b_{23} = 8 \]
Thus, \( \mathbf{A} = \mathbf{B} \).
Non-Equality
If either the dimensions differ or any corresponding elements do not match, the matrices are not equal. For example, if:
\[ \mathbf{C} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 7 & 8 \end{bmatrix} \]
then \( \mathbf{A} \) and \( \mathbf{C} \) are not equal because \( a_{22} = 6 \) and \( c_{22} = 7 \), so \( a_{22} \neq c_{22} \).
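In code, matrix equality is exactly this two-part test: same shape, then element-wise agreement. A minimal NumPy sketch with the matrices from this example:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 6, 8]])
B = np.array([[1, 2, 3], [4, 6, 8]])
C = np.array([[1, 2, 3], [4, 7, 8]])

# np.array_equal checks both the shapes and every corresponding element.
print(np.array_equal(A, B))  # True
print(np.array_equal(A, C))  # False: the (2,2) entries differ (6 vs. 7)
```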
Matrix Addition
Matrix addition is one of the fundamental operations defined for matrices, similar to addition for real numbers. When adding matrices, we combine them element-wise, provided they have the same dimensions.
Definition of Matrix Addition
Given two matrices \( \mathbf{A} = [a_{ij}]_{m \times n} \) and \( \mathbf{B} = [b_{ij}]_{m \times n} \), the sum \( \mathbf{C} = \mathbf{A} + \mathbf{B} \) is a matrix \( \mathbf{C} = [c_{ij}]_{m \times n} \) where each element \( c_{ij} \) is given by:
\[ c_{ij} = a_{ij} + b_{ij} \]
This means that the corresponding elements of \( \mathbf{A} \) and \( \mathbf{B} \) are added together to produce the elements of \( \mathbf{C} \).
Conditions for Matrix Addition
- Same Dimensions: The matrices \( \mathbf{A} \) and \( \mathbf{B} \) must have the same number of rows and columns. Matrix addition is not defined for matrices of different dimensions.
Example of Matrix Addition
Consider the following matrices \( \mathbf{A} \) and \( \mathbf{B} \):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix} \]
To find the sum \( \mathbf{C} = \mathbf{A} + \mathbf{B} \), we add the corresponding elements:
\[ \mathbf{C} = \begin{bmatrix} 1+7 & 2+8 & 3+9 \\ 4+10 & 5+11 & 6+12 \end{bmatrix} = \begin{bmatrix} 8 & 10 & 12 \\ 14 & 16 & 18 \end{bmatrix} \]
Properties of Matrix Addition
Matrix addition possesses several important properties that make it a well-defined operation in linear algebra. These properties are analogous to those of addition for real numbers and are essential for understanding how matrices behave under addition.
- Commutativity: Matrix addition is commutative, which means that the order in which matrices are added does not affect the result. Specifically, for any two matrices \( \mathbf{A} \) and \( \mathbf{B} \) of the same dimensions:
\[ \mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A} \]Example:
Let \( \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \) and \( \mathbf{B} = \begin{bmatrix} 5 & 6 \\ 7 & 8 \end{bmatrix} \). Then:
\[ \mathbf{A} + \mathbf{B} = \begin{bmatrix} 1+5 & 2+6 \\ 3+7 & 4+8 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix} \]\[ \mathbf{B} + \mathbf{A} = \begin{bmatrix} 5+1 & 6+2 \\ 7+3 & 8+4 \end{bmatrix} = \begin{bmatrix} 6 & 8 \\ 10 & 12 \end{bmatrix} \]Since \( \mathbf{A} + \mathbf{B} = \mathbf{B} + \mathbf{A} \), matrix addition is commutative.
- Associativity: Matrix addition is associative, which means that when adding three matrices, the grouping of the matrices does not affect the result. Specifically, for any three matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \) of the same dimensions:
\[ (\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C}) \]Example:
Let \( \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 2 & 3 \end{bmatrix} \), \( \mathbf{B} = \begin{bmatrix} 4 & 5 \\ 6 & 7 \end{bmatrix} \), and \( \mathbf{C} = \begin{bmatrix} 8 & 9 \\ 10 & 11 \end{bmatrix} \). Then:
\[ (\mathbf{A} + \mathbf{B}) + \mathbf{C} = \begin{bmatrix} 1+4 & 0+5 \\ 2+6 & 3+7 \end{bmatrix} + \mathbf{C} = \begin{bmatrix} 5 & 5 \\ 8 & 10 \end{bmatrix} + \begin{bmatrix} 8 & 9 \\ 10 & 11 \end{bmatrix} = \begin{bmatrix} 13 & 14 \\ 18 & 21 \end{bmatrix} \]\[ \mathbf{A} + (\mathbf{B} + \mathbf{C}) = \mathbf{A} + \begin{bmatrix} 4+8 & 5+9 \\ 6+10 & 7+11 \end{bmatrix} = \mathbf{A} + \begin{bmatrix} 12 & 14 \\ 16 & 18 \end{bmatrix} = \begin{bmatrix} 1+12 & 0+14 \\ 2+16 & 3+18 \end{bmatrix} = \begin{bmatrix} 13 & 14 \\ 18 & 21 \end{bmatrix} \]Since \( (\mathbf{A} + \mathbf{B}) + \mathbf{C} = \mathbf{A} + (\mathbf{B} + \mathbf{C}) \), matrix addition is associative.
- Additive Identity: The additive identity in matrix addition is the zero matrix. For any matrix \( \mathbf{A} \) of dimension \( m \times n \), there exists a zero matrix \( \mathbf{O} = [0]_{m \times n} \) such that:
\[ \mathbf{A} + \mathbf{O} = \mathbf{O} + \mathbf{A} = \mathbf{A} \]The zero matrix is a matrix where all elements are zero.
Example:
Let \( \mathbf{A} = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \) and \( \mathbf{O} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \). Then:
\[ \mathbf{A} + \mathbf{O} = \begin{bmatrix} 2+0 & 3+0 \\ 4+0 & 5+0 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \]Since \( \mathbf{A} + \mathbf{O} = \mathbf{A} \), the zero matrix acts as the additive identity.
- Additive Inverse: For any matrix \( \mathbf{A} \), there exists an additive inverse matrix \( -\mathbf{A} \) such that:
\[ \mathbf{A} + (-\mathbf{A}) = (-\mathbf{A}) + \mathbf{A} = \mathbf{O} \]The additive inverse matrix \( -\mathbf{A} \) is obtained by negating each element of \( \mathbf{A} \).
Example:
Let \( \mathbf{A} = \begin{bmatrix} 3 & -2 \\ 4 & 6 \end{bmatrix} \). The additive inverse \( -\mathbf{A} \) is:
\[ -\mathbf{A} = \begin{bmatrix} -3 & 2 \\ -4 & -6 \end{bmatrix} \]Now, adding \( \mathbf{A} \) and \( -\mathbf{A} \):
\[ \mathbf{A} + (-\mathbf{A}) = \begin{bmatrix} 3+(-3) & -2+2 \\ 4+(-4) & 6+(-6) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = \mathbf{O} \]Since \( \mathbf{A} + (-\mathbf{A}) = \mathbf{O} \), \( -\mathbf{A} \) is indeed the additive inverse of \( \mathbf{A} \).
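All four properties are easy to confirm numerically. A short NumPy check (the matrices are arbitrary \( 2 \times 2 \) choices for illustration):

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
C = np.array([[8, 9], [10, 11]])
O = np.zeros((2, 2), dtype=int)

print(np.array_equal(A + B, B + A))              # commutativity
print(np.array_equal((A + B) + C, A + (B + C)))  # associativity
print(np.array_equal(A + O, A))                  # additive identity
print(np.array_equal(A + (-A), O))               # additive inverse
```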
Scalar Multiplication of Matrices
Scalar multiplication is an operation where a matrix is multiplied by a scalar (a single number). This operation is analogous to multiplying each element of a vector or a list by a number, but it is applied to all elements of a matrix.
Definition of Scalar Multiplication
Given a matrix \( \mathbf{A} = [a_{ij}]_{m \times n} \) and a scalar \( k \), the scalar multiplication of \( \mathbf{A} \) by \( k \) results in a new matrix \( \mathbf{B} = k\mathbf{A} = [b_{ij}]_{m \times n} \) where each element \( b_{ij} \) is given by:
\[ b_{ij} = k \cdot a_{ij} \]
This means that the scalar \( k \) multiplies every element of the matrix \( \mathbf{A} \).
Example of Scalar Multiplication
Consider a matrix \( \mathbf{A} \) and a scalar \( k = 3 \):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \]
To find the matrix \( \mathbf{B} = 3\mathbf{A} \), multiply each element of \( \mathbf{A} \) by 3:
\[ \mathbf{B} = 3\mathbf{A} = \begin{bmatrix} 3 \cdot 1 & 3 \cdot 2 \\ 3 \cdot 3 & 3 \cdot 4 \end{bmatrix} \]
Thus, the resulting matrix \( \mathbf{B} \) is:
\[ \mathbf{B} = \begin{bmatrix} 3 & 6 \\ 9 & 12 \end{bmatrix} \]
Properties of Scalar Multiplication
Scalar multiplication of matrices has several important properties:
- Distributivity Over Matrix Addition:
\[ k(\mathbf{A} + \mathbf{B}) = k\mathbf{A} + k\mathbf{B} \]
where \( \mathbf{A} \) and \( \mathbf{B} \) are matrices of the same dimensions and \( k \) is a scalar.
- Distributivity Over Scalar Addition:
\[ (k + l)\mathbf{A} = k\mathbf{A} + l\mathbf{A} \]
where \( k \) and \( l \) are scalars and \( \mathbf{A} \) is a matrix.
- Associativity of Scalar Multiplication:
\[ k(l\mathbf{A}) = (kl)\mathbf{A} \]
where \( k \) and \( l \) are scalars and \( \mathbf{A} \) is a matrix.
- Multiplication by 1:
\[ 1 \cdot \mathbf{A} = \mathbf{A} \]
Multiplying any matrix by the scalar 1 leaves it unchanged.
- Multiplication by 0:
\[ 0 \cdot \mathbf{A} = \mathbf{O} \]
where \( \mathbf{O} \) is the zero matrix of the same dimensions as \( \mathbf{A} \); multiplying any matrix by the scalar 0 yields the zero matrix.
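A quick numerical check of these five rules, with arbitrary values for the matrices and scalars:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
k, l = 3, -2

print(np.array_equal(k * (A + B), k * A + k * B))  # distributivity over matrix addition
print(np.array_equal((k + l) * A, k * A + l * A))  # distributivity over scalar addition
print(np.array_equal(k * (l * A), (k * l) * A))    # associativity of scalar multiplication
print(np.array_equal(1 * A, A))                    # multiplication by 1
print(np.array_equal(0 * A, np.zeros_like(A)))     # multiplication by 0
```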
Matrix Subtraction
Matrix subtraction is defined as the addition of one matrix to the additive inverse (or negative) of another matrix. If you have two matrices \( \mathbf{A} \) and \( \mathbf{B} \), the subtraction \( \mathbf{A} - \mathbf{B} \) is performed by adding \( \mathbf{A} \) to the negative of \( \mathbf{B} \).
Definition of Matrix Subtraction
Given two matrices \( \mathbf{A} = [a_{ij}]_{m \times n} \) and \( \mathbf{B} = [b_{ij}]_{m \times n} \), the subtraction \( \mathbf{C} = \mathbf{A} - \mathbf{B} \) is defined as:
\[ \mathbf{C} = \mathbf{A} + (-\mathbf{B}) \]
Where \( -\mathbf{B} = [-b_{ij}]_{m \times n} \) is the matrix obtained by negating each element of \( \mathbf{B} \). The resulting matrix \( \mathbf{C} = [c_{ij}]_{m \times n} \) has elements:
\[ c_{ij} = a_{ij} - b_{ij} \]
Example of Matrix Subtraction
Consider the matrices \( \mathbf{A} \) and \( \mathbf{B} \):
\[ \mathbf{A} = \begin{bmatrix} 5 & 7 \\ 9 & 3 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 2 & 4 \\ 1 & 6 \end{bmatrix} \]
To find the matrix \( \mathbf{C} = \mathbf{A} - \mathbf{B} \), first determine the negative of \( \mathbf{B} \):
\[ -\mathbf{B} = \begin{bmatrix} -2 & -4 \\ -1 & -6 \end{bmatrix} \]
Now, add \( \mathbf{A} \) and \( -\mathbf{B} \):
\[ \mathbf{C} = \begin{bmatrix} 5 + (-2) & 7 + (-4) \\ 9 + (-1) & 3 + (-6) \end{bmatrix} \]
Perform the element-wise addition:
\[ \mathbf{C} = \begin{bmatrix} 3 & 3 \\ 8 & -3 \end{bmatrix} \]
Thus, the result of the matrix subtraction \( \mathbf{A} - \mathbf{B} \) is:
\[ \mathbf{A} - \mathbf{B} = \begin{bmatrix} 3 & 3 \\ 8 & -3 \end{bmatrix} \]
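The definition "subtraction is addition of the additive inverse" can be verified directly. A minimal sketch with the matrices assumed in the example above:

```python
import numpy as np

A = np.array([[5, 7], [9, 3]])
B = np.array([[2, 4], [1, 6]])

# By definition, A - B is A + (-B).
print(np.array_equal(A - B, A + (-B)))  # True
print(A - B)
# [[ 3  3]
#  [ 8 -3]]
```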
Matrix Multiplication
Matrix multiplication can initially seem complex, but it becomes clearer when we break it down with a concrete example. Let's consider two matrices \( \mathbf{A} \) and \( \mathbf{B} \) with the following dimensions and elements:
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \\ 10 & 11 & 12 \end{bmatrix}_{4 \times 3}, \quad \mathbf{B} = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix}_{3 \times 2} \]
To find the product \( \mathbf{AB} \), we need to perform the following steps:
Step-by-Step Calculation:
- First row of \( \mathbf{A} \) with all columns of \( \mathbf{B} \):
  - First row of \( \mathbf{A} \): \( [1, 2, 3] \)
  - First column of \( \mathbf{B} \): \( [1, 2, 3] \)
    \[ \text{Dot product} = 1 \cdot 1 + 2 \cdot 2 + 3 \cdot 3 = 1 + 4 + 9 = 14 \]
  - Second column of \( \mathbf{B} \): \( [4, 5, 6] \)
    \[ \text{Dot product} = 1 \cdot 4 + 2 \cdot 5 + 3 \cdot 6 = 4 + 10 + 18 = 32 \]
  - Resulting first row of \( \mathbf{AB} \): \( [14, 32] \)
- Second row of \( \mathbf{A} \) with all columns of \( \mathbf{B} \):
  - Second row of \( \mathbf{A} \): \( [4, 5, 6] \)
  - First column of \( \mathbf{B} \): \( [1, 2, 3] \)
    \[ \text{Dot product} = 4 \cdot 1 + 5 \cdot 2 + 6 \cdot 3 = 4 + 10 + 18 = 32 \]
  - Second column of \( \mathbf{B} \): \( [4, 5, 6] \)
    \[ \text{Dot product} = 4 \cdot 4 + 5 \cdot 5 + 6 \cdot 6 = 16 + 25 + 36 = 77 \]
  - Resulting second row of \( \mathbf{AB} \): \( [32, 77] \)
- Third row of \( \mathbf{A} \) with all columns of \( \mathbf{B} \):
  - Third row of \( \mathbf{A} \): \( [7, 8, 9] \)
  - First column of \( \mathbf{B} \): \( [1, 2, 3] \)
    \[ \text{Dot product} = 7 \cdot 1 + 8 \cdot 2 + 9 \cdot 3 = 7 + 16 + 27 = 50 \]
  - Second column of \( \mathbf{B} \): \( [4, 5, 6] \)
    \[ \text{Dot product} = 7 \cdot 4 + 8 \cdot 5 + 9 \cdot 6 = 28 + 40 + 54 = 122 \]
  - Resulting third row of \( \mathbf{AB} \): \( [50, 122] \)
- Fourth row of \( \mathbf{A} \) with all columns of \( \mathbf{B} \):
  - Fourth row of \( \mathbf{A} \): \( [10, 11, 12] \)
  - First column of \( \mathbf{B} \): \( [1, 2, 3] \)
    \[ \text{Dot product} = 10 \cdot 1 + 11 \cdot 2 + 12 \cdot 3 = 10 + 22 + 36 = 68 \]
  - Second column of \( \mathbf{B} \): \( [4, 5, 6] \)
    \[ \text{Dot product} = 10 \cdot 4 + 11 \cdot 5 + 12 \cdot 6 = 40 + 55 + 72 = 167 \]
  - Resulting fourth row of \( \mathbf{AB} \): \( [68, 167] \)
The resulting matrix \( \mathbf{AB} \) is:
\[ \mathbf{AB} = \begin{bmatrix} 14 & 32 \\ 32 & 77 \\ 50 & 122 \\ 68 & 167 \end{bmatrix}_{4 \times 2} \]
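The row-by-column procedure can be written as an explicit triple loop, which makes the dot-product structure visible; NumPy's built-in `@` operator then confirms the result. A sketch:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]])  # 4x3
B = np.array([[1, 4], [2, 5], [3, 6]])                          # 3x2

m, p = A.shape
p2, n = B.shape
assert p == p2, "columns of A must equal rows of B"

# c_ik = sum over j of a_ij * b_jk: one dot product per entry of the result.
C = np.zeros((m, n), dtype=int)
for i in range(m):
    for k in range(n):
        for j in range(p):
            C[i, k] += A[i, j] * B[j, k]

print(C)                          # [[14 32] [32 77] [50 122] [68 167]]
print(np.array_equal(C, A @ B))   # True: matches the built-in matrix product
```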
Understanding the Process
In matrix multiplication, each element of the product matrix \( \mathbf{AB} \) is obtained by taking the dot product of a row vector from \( \mathbf{A} \) with a column vector from \( \mathbf{B} \). This operation is called the inner product.
For matrix multiplication to be valid, the number of columns in \( \mathbf{A} \) must equal the number of rows in \( \mathbf{B} \). If this condition is not met, the matrices are not conformable, meaning they cannot be multiplied.
Defining Conformable Matrices
Matrices \( \mathbf{A} \) and \( \mathbf{B} \) are said to be conformable for multiplication if the number of columns in \( \mathbf{A} \) equals the number of rows in \( \mathbf{B} \). Specifically:
- If \( \mathbf{A} \) has dimensions \( m \times p \) and \( \mathbf{B} \) has dimensions \( p \times n \), then the matrices are conformable.
- The resulting product \( \mathbf{AB} \) will have dimensions \( m \times n \).
Definition of Matrix Multiplication
Let \( \mathbf{A} = [a_{ij}]_{m \times p} \) and \( \mathbf{B} = [b_{jk}]_{p \times n} \) be two conformable matrices. The product \( \mathbf{AB} = [c_{ik}]_{m \times n} \) is defined as:
\[ c_{ik} = \sum_{j=1}^{p} a_{ij} b_{jk} \]
This means that each element \( c_{ik} \) in the resulting matrix \( \mathbf{AB} \) is the sum of the products of corresponding elements from the \( i \)-th row of \( \mathbf{A} \) and the \( k \)-th column of \( \mathbf{B} \). In other words:
\[ c_{ik} = a_{i1} b_{1k} + a_{i2} b_{2k} + \cdots + a_{ip} b_{pk} \]
Properties of Matrix Multiplication
Non-Commutativity of Matrix Multiplication
Matrix multiplication is generally not commutative, meaning that the order in which matrices are multiplied affects the result. There are several reasons why this is the case:
- Non-Conformable Matrices: One key reason for non-commutativity is that the product \( \mathbf{AB} \) might be defined while the reverse product \( \mathbf{BA} \) is not, due to the matrices being non-conformable in that order. For example, if \( \mathbf{A} \) is a \( 3 \times 4 \) matrix and \( \mathbf{B} \) is a \( 4 \times 5 \) matrix, then \( \mathbf{AB} \) is defined and will be a \( 3 \times 5 \) matrix. However, \( \mathbf{BA} \) is not defined because \( \mathbf{B} \) has 5 columns and \( \mathbf{A} \) has 3 rows, making them non-conformable for multiplication.
- Different Dimensions: Even when both \( \mathbf{AB} \) and \( \mathbf{BA} \) are defined, their dimensions may differ, which is another reason they may not be equal. Suppose \( \mathbf{A} \) is an \( m \times n \) matrix and \( \mathbf{B} \) is an \( n \times m \) matrix. The product \( \mathbf{AB} \) will be an \( m \times m \) matrix, while \( \mathbf{BA} \) will be an \( n \times n \) matrix. If \( m \neq n \), the dimensions of \( \mathbf{AB} \) and \( \mathbf{BA} \) do not match, so the two products cannot be equal.
- Square Matrices: Even when both matrices are square (i.e., \( \mathbf{A} \) and \( \mathbf{B} \) are both \( n \times n \) matrices), the products \( \mathbf{AB} \) and \( \mathbf{BA} \) may still not be equal, because the elements of the matrices interact differently depending on the order of multiplication.
Example:
Consider the square matrices \( \mathbf{A} \) and \( \mathbf{B} \) where:
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 3 & 0 \\ 4 & 5 \end{bmatrix} \]First, calculate \( \mathbf{AB} \):
\[ \mathbf{AB} = \begin{bmatrix} 1 \cdot 3 + 2 \cdot 4 & 1 \cdot 0 + 2 \cdot 5 \\ 0 \cdot 3 + 1 \cdot 4 & 0 \cdot 0 + 1 \cdot 5 \end{bmatrix} = \begin{bmatrix} 3 + 8 & 0 + 10 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 11 & 10 \\ 4 & 5 \end{bmatrix} \]Next, calculate \( \mathbf{BA} \):
\[ \mathbf{BA} = \begin{bmatrix} 3 \cdot 1 + 0 \cdot 0 & 3 \cdot 2 + 0 \cdot 1 \\ 4 \cdot 1 + 5 \cdot 0 & 4 \cdot 2 + 5 \cdot 1 \end{bmatrix} = \begin{bmatrix} 3 & 6 \\ 4 & 8 + 5 \end{bmatrix} = \begin{bmatrix} 3 & 6 \\ 4 & 13 \end{bmatrix} \]Here, \( \mathbf{AB} \) and \( \mathbf{BA} \) are both \( 2 \times 2 \) matrices, but:
\[ \mathbf{AB} = \begin{bmatrix} 11 & 10 \\ 4 & 5 \end{bmatrix}, \quad \mathbf{BA} = \begin{bmatrix} 3 & 6 \\ 4 & 13 \end{bmatrix} \]Since \( \mathbf{AB} \neq \mathbf{BA} \), this example illustrates that even when matrices are square and conformable, matrix multiplication is generally not commutative.
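The same pair of matrices makes a convenient executable counterexample:

```python
import numpy as np

A = np.array([[1, 2], [0, 1]])
B = np.array([[3, 0], [4, 5]])

print(A @ B)                          # [[11 10] [ 4  5]]
print(B @ A)                          # [[ 3  6] [ 4 13]]
print(np.array_equal(A @ B, B @ A))   # False: AB != BA
```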
Associativity of Matrix Multiplication
Matrix multiplication is associative, meaning that the grouping of matrices during multiplication does not change the final result. Specifically, for any three conformable matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \), we have:
\[ (\mathbf{A}\mathbf{B})\mathbf{C} = \mathbf{A}(\mathbf{B}\mathbf{C}) \]
Proof Outline:
Let \( \mathbf{A} \) be an \( m \times p \) matrix, \( \mathbf{B} \) a \( p \times q \) matrix, and \( \mathbf{C} \) a \( q \times n \) matrix. The product \( \mathbf{A} \mathbf{B} \) is an \( m \times q \) matrix, and multiplying it by \( \mathbf{C} \) yields an \( m \times n \) matrix. Similarly, the product \( \mathbf{B} \mathbf{C} \) is a \( p \times n \) matrix, and multiplying \( \mathbf{A} \) by it also results in an \( m \times n \) matrix. Since both sides result in matrices of the same dimensions, we proceed to show that their elements are equal.
Consider the element in the \( i \)-th row and \( j \)-th column of the matrix product \( \mathbf{A}(\mathbf{B}\mathbf{C}) \):
\[ [\mathbf{A}(\mathbf{B}\mathbf{C})]_{ij} = \sum_{k=1}^{p} a_{ik} (\mathbf{B}\mathbf{C})_{kj} \]
Next, express \( (\mathbf{B}\mathbf{C})_{kj} \) as the sum of products of elements from \( \mathbf{B} \) and \( \mathbf{C} \):
\[ (\mathbf{B}\mathbf{C})_{kj} = \sum_{r=1}^{q} b_{kr} c_{rj} \]
Substitute this into the original sum:
\[ [\mathbf{A}(\mathbf{B}\mathbf{C})]_{ij} = \sum_{k=1}^{p} a_{ik} \left( \sum_{r=1}^{q} b_{kr} c_{rj} \right) \]
This double sum can be written as:
\[ [\mathbf{A}(\mathbf{B}\mathbf{C})]_{ij} = \sum_{k=1}^{p} \sum_{r=1}^{q} a_{ik} b_{kr} c_{rj} \]
Now, consider the element in the \( i \)-th row and \( j \)-th column of the matrix product \( (\mathbf{A}\mathbf{B})\mathbf{C} \):
\[ [(\mathbf{A}\mathbf{B})\mathbf{C}]_{ij} = \sum_{r=1}^{q} (\mathbf{A}\mathbf{B})_{ir} c_{rj} \]
Express \( (\mathbf{A}\mathbf{B})_{ir} \) as:
\[ (\mathbf{A}\mathbf{B})_{ir} = \sum_{k=1}^{p} a_{ik} b_{kr} \]
Substitute this into the sum:
\[ [(\mathbf{A}\mathbf{B})\mathbf{C}]_{ij} = \sum_{r=1}^{q} \left( \sum_{k=1}^{p} a_{ik} b_{kr} \right) c_{rj} \]
This can also be written as:
\[ [(\mathbf{A}\mathbf{B})\mathbf{C}]_{ij} = \sum_{k=1}^{p} \sum_{r=1}^{q} a_{ik} b_{kr} c_{rj} \]
Since both \( [\mathbf{A}(\mathbf{B}\mathbf{C})]_{ij} \) and \( [(\mathbf{A}\mathbf{B})\mathbf{C}]_{ij} \) are expressed as the same double sum:
\[ [\mathbf{A}(\mathbf{B}\mathbf{C})]_{ij} = [(\mathbf{A}\mathbf{B})\mathbf{C}]_{ij} \quad \text{for all } i, j \]
Thus, the matrix \( \mathbf{A}(\mathbf{B}\mathbf{C}) \) is equal to the matrix \( (\mathbf{A}\mathbf{B})\mathbf{C} \), proving that matrix multiplication is associative.
Distributivity of Matrix Multiplication
Matrix multiplication is distributive over matrix addition, which means that multiplying a matrix by a sum of matrices gives the same result as multiplying each matrix individually and then adding the results. This distributive property can be expressed in two forms: left distributivity and right distributivity.
1. Left Distributivity
For matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \), where \( \mathbf{A} \) is conformable with both \( \mathbf{B} \) and \( \mathbf{C} \), the left distributive property states:
\[ \mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{A}\mathbf{B} + \mathbf{A}\mathbf{C} \]
Proof
Let \( \mathbf{A} \) be an \( m \times p \) matrix, and let \( \mathbf{B} \) and \( \mathbf{C} \) both be \( p \times n \) matrices. Then the sum \( \mathbf{B} + \mathbf{C} \) is also a \( p \times n \) matrix, and \( \mathbf{A}(\mathbf{B} + \mathbf{C}) \) is an \( m \times n \) matrix.
Consider the \( (i,j) \)-th element of \( \mathbf{A}(\mathbf{B} + \mathbf{C}) \):
\[ [\mathbf{A}(\mathbf{B} + \mathbf{C})]_{ij} = \sum_{k=1}^{p} a_{ik} (b_{kj} + c_{kj}) \]
This can be split into two sums:
\[ \sum_{k=1}^{p} a_{ik} (b_{kj} + c_{kj}) = \sum_{k=1}^{p} a_{ik} b_{kj} + \sum_{k=1}^{p} a_{ik} c_{kj} \]
Thus:
\[ [\mathbf{A}(\mathbf{B} + \mathbf{C})]_{ij} = (\mathbf{A}\mathbf{B})_{ij} + (\mathbf{A}\mathbf{C})_{ij} \]
This shows that:
\[ \mathbf{A}(\mathbf{B} + \mathbf{C}) = \mathbf{A}\mathbf{B} + \mathbf{A}\mathbf{C} \]
2. Right Distributivity
For matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \), where \( \mathbf{A} \) and \( \mathbf{B} \) are conformable with \( \mathbf{C} \), the right distributive property states:
\[ (\mathbf{A} + \mathbf{B})\mathbf{C} = \mathbf{A}\mathbf{C} + \mathbf{B}\mathbf{C} \]
Proof
Let \( \mathbf{A} \) and \( \mathbf{B} \) be \( m \times p \) matrices, and let \( \mathbf{C} \) be a \( p \times n \) matrix. Then the sum \( \mathbf{A} + \mathbf{B} \) is an \( m \times p \) matrix, and \( (\mathbf{A} + \mathbf{B})\mathbf{C} \) is an \( m \times n \) matrix.
Consider the \( (i,j) \)-th element of \( (\mathbf{A} + \mathbf{B})\mathbf{C} \):
\[ [(\mathbf{A} + \mathbf{B})\mathbf{C}]_{ij} = \sum_{k=1}^{p} (a_{ik} + b_{ik}) c_{kj} \]
This can be split into two sums:
\[ \sum_{k=1}^{p} (a_{ik} + b_{ik}) c_{kj} = \sum_{k=1}^{p} a_{ik} c_{kj} + \sum_{k=1}^{p} b_{ik} c_{kj} \]
Thus:
\[ [(\mathbf{A} + \mathbf{B})\mathbf{C}]_{ij} = (\mathbf{A}\mathbf{C})_{ij} + (\mathbf{B}\mathbf{C})_{ij} \]
This shows that:
\[ (\mathbf{A} + \mathbf{B})\mathbf{C} = \mathbf{A}\mathbf{C} + \mathbf{B}\mathbf{C} \]
Multiplicative Identity in Matrix Multiplication
In matrix algebra, the multiplicative identity is a matrix that, when multiplied by another matrix, leaves the original matrix unchanged. For a matrix \( \mathbf{A} \) of dimensions \( m \times n \), there are two identity matrices that serve as the left and right identities: \( \mathbf{I}_m \) and \( \mathbf{I}_n \).
Left Identity \( \mathbf{I}_m \) and Right Identity \( \mathbf{I}_n \)
- Left Identity: The matrix \( \mathbf{I}_m \) is the \( m \times m \) identity matrix, which, when multiplied on the left side of \( \mathbf{A} \), results in \( \mathbf{A} \):
\[ \mathbf{I}_m \mathbf{A} = \mathbf{A} \]
- Right Identity: The matrix \( \mathbf{I}_n \) is the \( n \times n \) identity matrix, which, when multiplied on the right side of \( \mathbf{A} \), results in \( \mathbf{A} \):
\[ \mathbf{A} \mathbf{I}_n = \mathbf{A} \]
Example: \( \mathbf{A} \) is a \( 2 \times 3 \) Matrix
Consider a matrix \( \mathbf{A} \) with dimensions \( 2 \times 3 \):
\[ \mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \]
- Left Identity \( \mathbf{I}_2 \):
The left identity matrix \( \mathbf{I}_2 \) is a \( 2 \times 2 \) identity matrix:
\[ \mathbf{I}_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Multiplying \( \mathbf{I}_2 \) on the left of \( \mathbf{A} \):
\[ \mathbf{I}_2 \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \mathbf{A} \]
- Right Identity \( \mathbf{I}_3 \):
The right identity matrix \( \mathbf{I}_3 \) is a \( 3 \times 3 \) identity matrix:
\[ \mathbf{I}_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
Multiplying \( \mathbf{I}_3 \) on the right of \( \mathbf{A} \):
\[ \mathbf{A} \mathbf{I}_3 = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \mathbf{A} \]
Thus, \( \mathbf{I}_2 \mathbf{A} = \mathbf{A} \) and \( \mathbf{A} \mathbf{I}_3 = \mathbf{A} \), confirming the identity property for non-square matrices.
Square Matrix Case
For a square matrix \( \mathbf{A} \) of size \( n \times n \), there is a single identity matrix \( \mathbf{I}_n \) such that:
\[ \mathbf{A} \mathbf{I}_n = \mathbf{I}_n \mathbf{A} = \mathbf{A} \]
Example:
Consider \( \mathbf{A} = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \), a \( 2 \times 2 \) matrix, and the identity matrix:
\[ \mathbf{I}_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Then:
\[ \mathbf{A} \mathbf{I}_2 = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} = \mathbf{A}, \quad \mathbf{I}_2 \mathbf{A} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix} = \mathbf{A} \]
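NumPy builds identity matrices with `np.eye`; a quick check of the left and right identities, including the non-square case:

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])   # 2x3

I2 = np.eye(2, dtype=int)   # left identity for a 2x3 matrix
I3 = np.eye(3, dtype=int)   # right identity for a 2x3 matrix

print(np.array_equal(I2 @ A, A))   # True: I_2 A = A
print(np.array_equal(A @ I3, A))   # True: A I_3 = A
```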
Multiplicative Inverse of a Matrix
The multiplicative inverse of a matrix is a matrix that, when multiplied by the original matrix, results in the identity matrix. The multiplicative inverse exists only for square matrices (matrices with the same number of rows and columns) and only under certain conditions, which we will explore later.
For a square matrix \( \mathbf{A} \), the inverse is denoted by \( \mathbf{A}^{-1} \). The matrix \( \mathbf{A}^{-1} \) is defined by the property:
\[ \mathbf{A} \mathbf{A}^{-1} = \mathbf{A}^{-1} \mathbf{A} = \mathbf{I} \]
where \( \mathbf{I} \) is the identity matrix of the same dimension as \( \mathbf{A} \).
This means that when you multiply \( \mathbf{A} \) by its inverse \( \mathbf{A}^{-1} \), whether on the left or the right, the result is the identity matrix. We will study this in more detail in a later section.
Powers of a Matrix
For a square matrix \( \mathbf{X} \) and a natural number \( n \in \mathbb{N} \), the power \( \mathbf{X}^n \) is defined as the matrix \( \mathbf{X} \) multiplied by itself \( n \) times:
\[ \mathbf{X}^n = \underbrace{\mathbf{X} \mathbf{X} \cdots \mathbf{X}}_{n \text{ times}} \]
This operation is valid only for square matrices, where the number of rows and columns are equal.
Properties of Matrix Powers
- Multiplication of Powers: For any natural numbers \( m \) and \( n \):
\[ \mathbf{X}^m \mathbf{X}^n = \mathbf{X}^{m+n} \]
This property follows from the definition of matrix multiplication: multiplying \( \mathbf{X}^m \) by \( \mathbf{X}^n \) results in a matrix that has \( m + n \) factors of \( \mathbf{X} \).
- Power of a Power: For any natural numbers \( m \) and \( n \):
\[ (\mathbf{X}^m)^n = \mathbf{X}^{mn} \]
This means that taking the \( n \)-th power of \( \mathbf{X}^m \) is equivalent to multiplying the matrix \( \mathbf{X} \) by itself \( mn \) times.
These properties are analogous to the properties of exponents for real numbers, but they hold within the context of matrix multiplication, which requires \( \mathbf{X} \) to be a square matrix.
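Note that a matrix power is repeated matrix multiplication, not element-wise exponentiation, so in NumPy it is `np.linalg.matrix_power` (or repeated `@`), not `**`. A sketch verifying both exponent laws on an arbitrary square matrix:

```python
import numpy as np
from numpy.linalg import matrix_power

X = np.array([[1, 1], [0, 1]])
m, n = 2, 3

print(np.array_equal(matrix_power(X, m) @ matrix_power(X, n),
                     matrix_power(X, m + n)))   # X^m X^n = X^(m+n)
print(np.array_equal(matrix_power(matrix_power(X, m), n),
                     matrix_power(X, m * n)))   # (X^m)^n = X^(mn)
```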
Expanding Powers of Matrix Sums
Given two square matrices \( \mathbf{A} \) and \( \mathbf{B} \), the expansion of powers of their sum or difference follows the rules of matrix multiplication, keeping in mind that, in general, \( \mathbf{A} \mathbf{B} \neq \mathbf{B} \mathbf{A} \). Here’s how these expansions work:
1. Expansion of \( (\mathbf{A} + \mathbf{B})^2 \)
Expanding \( (\mathbf{A} + \mathbf{B})^2 \) gives:
\[ (\mathbf{A} + \mathbf{B})^2 = (\mathbf{A} + \mathbf{B})(\mathbf{A} + \mathbf{B}) \]
Expanding this product using the distributive property:
\[ (\mathbf{A} + \mathbf{B})(\mathbf{A} + \mathbf{B}) = \mathbf{A}^2 + \mathbf{A}\mathbf{B} + \mathbf{B}\mathbf{A} + \mathbf{B}^2 \]
Since \( \mathbf{A} \mathbf{B} \) and \( \mathbf{B} \mathbf{A} \) are generally not equal, we cannot combine or simplify these terms further. The final expression is:
\[ (\mathbf{A} + \mathbf{B})^2 = \mathbf{A}^2 + \mathbf{A}\mathbf{B} + \mathbf{B}\mathbf{A} + \mathbf{B}^2 \]
2. Expansion of \( (\mathbf{A} + \mathbf{B})(\mathbf{A} - \mathbf{B}) \)
Expanding \( (\mathbf{A} + \mathbf{B})(\mathbf{A} - \mathbf{B}) \) gives:
\[ (\mathbf{A} + \mathbf{B})(\mathbf{A} - \mathbf{B}) = \mathbf{A}^2 - \mathbf{A}\mathbf{B} + \mathbf{B}\mathbf{A} - \mathbf{B}^2 \]
Again, because \( \mathbf{A} \mathbf{B} \) and \( \mathbf{B} \mathbf{A} \) are not necessarily equal, this expression cannot be simplified further.
3. Expansion of \( (\mathbf{A} + \mathbf{B})^3 \)
To expand \( (\mathbf{A} + \mathbf{B})^3 \), we proceed similarly:
\[ (\mathbf{A} + \mathbf{B})^3 = (\mathbf{A} + \mathbf{B})^2 (\mathbf{A} + \mathbf{B}) \]
Using the expansion of \( (\mathbf{A} + \mathbf{B})^2 \):
\[ (\mathbf{A} + \mathbf{B})^3 = (\mathbf{A}^2 + \mathbf{A}\mathbf{B} + \mathbf{B}\mathbf{A} + \mathbf{B}^2)(\mathbf{A} + \mathbf{B}) \]
Now distribute \( (\mathbf{A} + \mathbf{B}) \):
\[ (\mathbf{A} + \mathbf{B})^3 = \mathbf{A}^3 + \mathbf{A}^2\mathbf{B} + \mathbf{A}\mathbf{B}\mathbf{A} + \mathbf{A}\mathbf{B}^2 + \mathbf{B}\mathbf{A}^2 + \mathbf{B}\mathbf{A}\mathbf{B} + \mathbf{B}^2\mathbf{A} + \mathbf{B}^3 \]
This expanded form cannot be simplified further due to the non-commutativity of matrix multiplication.
Summary
When working with powers of sums or differences of matrices, the lack of commutativity (i.e., \( \mathbf{A} \mathbf{B} \neq \mathbf{B} \mathbf{A} \)) prevents us from simplifying the expressions as we might with real numbers. Instead, we must expand and maintain all terms, acknowledging that products like \( \mathbf{A} \mathbf{B} \) and \( \mathbf{B} \mathbf{A} \) are distinct and cannot be combined.
Commutativity of Matrices
Two matrices \( \mathbf{A} \) and \( \mathbf{B} \) are said to be commutative if their product does not depend on the order of multiplication, i.e., if:
\[ \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \]
This property is quite special because, in general, matrix multiplication is not commutative.
When Matrices \( \mathbf{A} \) and \( \mathbf{B} \) Are Not Commutative
If \( \mathbf{A} \) and \( \mathbf{B} \) are not commutative (i.e., \( \mathbf{A} \mathbf{B} \neq \mathbf{B} \mathbf{A} \)), then when you square the product \( \mathbf{AB} \), you must explicitly multiply as follows:
\[ (\mathbf{A}\mathbf{B})^2 = (\mathbf{A}\mathbf{B})(\mathbf{A}\mathbf{B}) = \mathbf{A}\mathbf{B}\mathbf{A}\mathbf{B} \]
In this case, you cannot simplify \( \mathbf{A} \mathbf{B} \mathbf{A} \mathbf{B} \) further because \( \mathbf{A} \) and \( \mathbf{B} \) do not commute.
When Matrices \( \mathbf{A} \) and \( \mathbf{B} \) Are Commutative
If \( \mathbf{A} \) and \( \mathbf{B} \) are commutative (i.e., \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \)), the product \( (\mathbf{AB})^2 \) simplifies similarly to the way powers work for real numbers:
\[ (\mathbf{A}\mathbf{B})^2 = \mathbf{A}\mathbf{B}\mathbf{A}\mathbf{B} \]
Given that \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \), you can rewrite this as:
\[ \mathbf{A}\mathbf{B}\mathbf{A}\mathbf{B} = \mathbf{A}(\mathbf{B}\mathbf{A})\mathbf{B} = \mathbf{A}(\mathbf{A}\mathbf{B})\mathbf{B} = \mathbf{A}^2 \mathbf{B}^2 \]
Thus, when \( \mathbf{A} \) and \( \mathbf{B} \) commute, the square of the product \( \mathbf{AB} \) simplifies to \( \mathbf{A}^2 \mathbf{B}^2 \), just like the multiplication of real numbers.
When two matrices \( \mathbf{A} \) and \( \mathbf{B} \) are commutative, i.e., \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \), for any natural number \( n \), the following holds:
\[ \mathbf{A} \mathbf{B}^n = \mathbf{B}^n \mathbf{A} \]
Proof
Given that \( \mathbf{A} \) and \( \mathbf{B} \) are commutative matrices, i.e., \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \), we want to prove that \( \mathbf{A} \mathbf{B}^n = \mathbf{B}^n \mathbf{A} \) for any natural number \( n \).
We start by expanding \( \mathbf{A} \mathbf{B}^n \):
\[ \mathbf{A} \mathbf{B}^n = \mathbf{A} \underbrace{(\mathbf{B} \mathbf{B} \cdots \mathbf{B})}_{n \text{ factors}} \]
We can proceed by moving the matrix \( \mathbf{A} \) through each \( \mathbf{B} \), using the commutative property \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \):
- First, move \( \mathbf{A} \) past the first \( \mathbf{B} \):
\[ \mathbf{A} \mathbf{B}^n = (\mathbf{A} \mathbf{B}) \mathbf{B}^{n-1} = (\mathbf{B} \mathbf{A}) \mathbf{B}^{n-1} = \mathbf{B} \mathbf{A} \mathbf{B}^{n-1} \]
- Next, move \( \mathbf{A} \) past the second \( \mathbf{B} \):
\[ \mathbf{B} \mathbf{A} \mathbf{B}^{n-1} = \mathbf{B} (\mathbf{A} \mathbf{B}) \mathbf{B}^{n-2} = \mathbf{B} (\mathbf{B} \mathbf{A}) \mathbf{B}^{n-2} = \mathbf{B}^2 \mathbf{A} \mathbf{B}^{n-2} \]
- Continue this pattern, moving \( \mathbf{A} \) past each remaining \( \mathbf{B} \):
\[ \mathbf{B}^2 \mathbf{A} \mathbf{B}^{n-2} = \mathbf{B}^3 \mathbf{A} \mathbf{B}^{n-3} = \cdots = \mathbf{B}^{n-1} \mathbf{A} \mathbf{B} = \mathbf{B}^n \mathbf{A} \]
Thus, we have shown that:
\[ \mathbf{A} \mathbf{B}^n = \mathbf{B}^n \mathbf{A} \]
This proves the property without using mathematical induction, simply by following the pattern of moving \( \mathbf{A} \) through each \( \mathbf{B} \).
Proof Using Mathematical Induction:
If you're familiar with mathematical induction, here’s a formal proof using that method:
Base Case:
For \( n = 1 \), we have:
\[ \mathbf{A} \mathbf{B}^1 = \mathbf{A} \mathbf{B} \]
Given that \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \), the base case holds:
\[ \mathbf{A} \mathbf{B}^1 = \mathbf{B} \mathbf{A} = \mathbf{B}^1 \mathbf{A} \]
Inductive Step:
Assume that \( \mathbf{A} \mathbf{B}^k = \mathbf{B}^k \mathbf{A} \) holds for some \( k \geq 1 \). We need to show that:
\[ \mathbf{A} \mathbf{B}^{k+1} = \mathbf{B}^{k+1} \mathbf{A} \]
Starting with the left-hand side:
\[ \mathbf{A} \mathbf{B}^{k+1} = \mathbf{A} (\mathbf{B}^k \mathbf{B}) = (\mathbf{A} \mathbf{B}^k) \mathbf{B} \]
Using the inductive hypothesis \( \mathbf{A} \mathbf{B}^k = \mathbf{B}^k \mathbf{A} \) and the commutative property \( \mathbf{A} \mathbf{B} = \mathbf{B} \mathbf{A} \), we get:
\[ (\mathbf{A} \mathbf{B}^k) \mathbf{B} = (\mathbf{B}^k \mathbf{A}) \mathbf{B} = \mathbf{B}^k (\mathbf{A} \mathbf{B}) = \mathbf{B}^k (\mathbf{B} \mathbf{A}) = \mathbf{B}^{k+1} \mathbf{A} \]
Thus, by the principle of mathematical induction, we conclude that for all \( n \geq 1 \):
\[ \mathbf{A} \mathbf{B}^n = \mathbf{B}^n \mathbf{A} \]
This concludes the proof using mathematical induction.
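Diagonal matrices always commute with one another, which makes them a convenient test case for this result. A minimal NumPy sketch of \( \mathbf{A}\mathbf{B}^n = \mathbf{B}^n\mathbf{A} \):

```python
import numpy as np
from numpy.linalg import matrix_power

# Diagonal matrices commute, so the hypothesis AB = BA holds.
A = np.diag([2, 3])
B = np.diag([5, 7])
assert np.array_equal(A @ B, B @ A)

n = 4
Bn = matrix_power(B, n)
print(np.array_equal(A @ Bn, Bn @ A))   # True: A B^n = B^n A
```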
Further Properties of Commutative Matrices
For commutative matrices \( \mathbf{A} \) and \( \mathbf{B} \):
- \( (\mathbf{A}\mathbf{B})^n = \mathbf{A}^n \mathbf{B}^n \)
- \( (\mathbf{A} + \mathbf{B})^2 = \mathbf{A}^2 + 2\mathbf{A}\mathbf{B} + \mathbf{B}^2 \)
- \( (\mathbf{A} - \mathbf{B})^2 = \mathbf{A}^2 - 2\mathbf{A}\mathbf{B} + \mathbf{B}^2 \)
- \( (\mathbf{A} + \mathbf{B})^3 = \mathbf{A}^3 + 3\mathbf{A}^2\mathbf{B} + 3\mathbf{A}\mathbf{B}^2 + \mathbf{B}^3 \)
- \( (\mathbf{A} + \mathbf{B})(\mathbf{A} - \mathbf{B}) = \mathbf{A}^2 - \mathbf{B}^2 \)
Binomial Theorem for Matrices:
The Binomial Theorem for matrices is an extension of the binomial expansion for real numbers. For any square matrices \( \mathbf{A} \) and \( \mathbf{B} \) that commute:
\[ (\mathbf{A} + \mathbf{B})^n = \sum_{k=0}^{n} \binom{n}{k} \mathbf{A}^{n-k} \mathbf{B}^k \]
Here, \( \binom{n}{k} \) denotes the binomial coefficient, and the theorem expands \( (\mathbf{A} + \mathbf{B})^n \) into a sum of terms, each involving products of powers of \( \mathbf{A} \) and \( \mathbf{B} \).
Example: For \( n = 3 \), the expansion is:
\[ (\mathbf{A} + \mathbf{B})^3 = \binom{3}{0}\mathbf{A}^3 + \binom{3}{1}\mathbf{A}^2\mathbf{B} + \binom{3}{2}\mathbf{A}\mathbf{B}^2 + \binom{3}{3}\mathbf{B}^3 \]
Simplifying with the binomial coefficients:
\[ (\mathbf{A} + \mathbf{B})^3 = \mathbf{A}^3 + 3\mathbf{A}^2\mathbf{B} + 3\mathbf{A}\mathbf{B}^2 + \mathbf{B}^3 \]
Examples of Commutative Matrices
A classic example of commutative matrices involves the identity matrix \( \mathbf{I} \) and any square matrix \( \mathbf{A} \). The identity matrix \( \mathbf{I} \) commutes with any square matrix \( \mathbf{A} \), meaning that \( \mathbf{A} \mathbf{I} = \mathbf{I} \mathbf{A} = \mathbf{A} \). This commutative relationship allows us to apply algebraic identities similar to those used for real numbers.
For instance, consider the expression \( (\mathbf{I} + \mathbf{A})^2 \). We can expand this using the distributive property, keeping in mind that \( \mathbf{A} \) commutes with \( \mathbf{I} \):
\[ (\mathbf{I} + \mathbf{A})^2 = (\mathbf{I} + \mathbf{A})(\mathbf{I} + \mathbf{A}) = \mathbf{I}^2 + \mathbf{I}\mathbf{A} + \mathbf{A}\mathbf{I} + \mathbf{A}^2 \]
Since \( \mathbf{I}^2 = \mathbf{I} \), and \( \mathbf{I}\mathbf{A} = \mathbf{A}\mathbf{I} = \mathbf{A} \), this simplifies to:
\[ (\mathbf{I} + \mathbf{A})^2 = \mathbf{I} + 2\mathbf{A} + \mathbf{A}^2 \]
Similarly, for the cube of the sum, \( (\mathbf{I} + \mathbf{A})^3 \), we expand:
\[ (\mathbf{I} + \mathbf{A})^3 = (\mathbf{I} + \mathbf{A})^2 (\mathbf{I} + \mathbf{A}) \]
Substituting the earlier expansion:
\[ (\mathbf{I} + \mathbf{A})^3 = (\mathbf{I} + 2\mathbf{A} + \mathbf{A}^2)(\mathbf{I} + \mathbf{A}) \]
Expanding further, we get:
\[ (\mathbf{I} + \mathbf{A})^3 = \mathbf{I} + 3\mathbf{A} + 3\mathbf{A}^2 + \mathbf{A}^3 \]
Another useful identity involving commutative matrices is \( \mathbf{I} - \mathbf{A}^2 = (\mathbf{I} + \mathbf{A})(\mathbf{I} - \mathbf{A}) \). This identity holds because:
\[ (\mathbf{I} + \mathbf{A})(\mathbf{I} - \mathbf{A}) = \mathbf{I}^2 - \mathbf{I}\mathbf{A} + \mathbf{A}\mathbf{I} - \mathbf{A}^2 = \mathbf{I} - \mathbf{A}^2 \]
Finally, we can generalize these results using the binomial theorem for matrices. For any natural number \( n \), the binomial expansion of \( (\mathbf{I} + \mathbf{A})^n \) is:
\[ (\mathbf{I} + \mathbf{A})^n = \sum_{k=0}^{n} \binom{n}{k} \mathbf{I}^{n-k} \mathbf{A}^k \]
Given that \( \mathbf{I}^m = \mathbf{I} \) for any positive integer \( m \), this simplifies to:
\[ (\mathbf{I} + \mathbf{A})^n = \sum_{k=0}^{n} \binom{n}{k} \mathbf{A}^k \]
Explicitly, the expansion is:
\[ (\mathbf{I} + \mathbf{A})^n = \mathbf{I} + n\mathbf{A} + \binom{n}{2}\mathbf{A}^2 + \cdots + \mathbf{A}^n \]
We will see many other matrices later on that commute with each other.
Transpose of a Matrix
The transpose of a matrix is an operation that involves interchanging the rows and columns of the matrix. If you have a matrix \( \mathbf{A} \), its transpose is denoted by \( \mathbf{A}^\top \) or \( \mathbf{A}^T \).
Definition:
Given a matrix \( \mathbf{A} = [a_{ij}] \) of dimensions \( m \times n \) (where \( m \) is the number of rows and \( n \) is the number of columns), the transpose of \( \mathbf{A} \), denoted as \( \mathbf{A}^\top \), is an \( n \times m \) matrix obtained by interchanging the rows and columns of \( \mathbf{A} \). The elements of \( \mathbf{A}^\top \) are given by:
\[ (\mathbf{A}^\top)_{ij} = a_{ji} \quad \text{for } 1 \leq i \leq n, \ 1 \leq j \leq m \]
This means that the element in the \( i \)-th row and \( j \)-th column of \( \mathbf{A} \) becomes the element in the \( j \)-th row and \( i \)-th column of \( \mathbf{A}^\top \). In other words, the \( (i,j) \)-th element of \( \mathbf{A}^\top \) corresponds to the \( (j,i) \)-th element of \( \mathbf{A} \).
Example:
Consider the matrix \( \mathbf{A} \):
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix} \]
This is a \( 2 \times 3 \) matrix (2 rows, 3 columns). The transpose of \( \mathbf{A} \), denoted \( \mathbf{A}^\top \), is:
\[ \mathbf{A}^\top = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \]
In this example, the element \( (1,2) \) in \( \mathbf{A} \), which is \( a_{12} = 2 \), becomes the element \( (2,1) \) in \( \mathbf{A}^\top \), and the element \( (2,3) \) in \( \mathbf{A} \), which is \( a_{23} = 6 \), becomes the element \( (3,2) \) in \( \mathbf{A}^\top \). This operation demonstrates how rows and columns are interchanged to produce the transpose of the matrix.
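In NumPy the transpose is simply the `.T` attribute (a view onto the same data, not a copy). A sketch with the matrix from this example:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])    # 2x3

print(A.T)                   # 3x2: rows and columns interchanged
# [[1 4]
#  [2 5]
#  [3 6]]
print(A[0, 1], A.T[1, 0])    # a_12 = 2 becomes the (2,1) entry of A^T
```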
Properties of the Transpose of a Matrix
The transpose of a matrix has several important properties that are frequently used in linear algebra. Below are some of the key properties:
1. Double Transpose:
The double transpose of a matrix returns the original matrix. This means that if you transpose a matrix twice, you end up with the same matrix you started with:
\[ (\mathbf{A}^\top)^\top = \mathbf{A} \]
2. Transpose of a Scalar Multiple:
When a matrix is multiplied by a scalar \( k \), the transpose of this product is equal to the scalar multiplied by the transpose of the matrix:
\[ (k\mathbf{A})^\top = k\mathbf{A}^\top \]
3. Transpose of a Sum:
The transpose of the sum of two matrices is equal to the sum of their transposes. This property ensures that transposition distributes over matrix addition:
\[ (\mathbf{A} + \mathbf{B})^\top = \mathbf{A}^\top + \mathbf{B}^\top \]
4. Transpose of a Product:
The transpose of the product of two matrices reverses the order of multiplication. This means that when transposing a product of matrices, the order of the matrices must be reversed:
\[ (\mathbf{A}\mathbf{B})^\top = \mathbf{B}^\top \mathbf{A}^\top \]
Proof of the Transpose Reversal Law:
Given matrices \( \mathbf{A} \) of dimensions \( m \times p \) and \( \mathbf{B} \) of dimensions \( p \times n \), we need to prove that:
\[ (\mathbf{A}\mathbf{B})^\top = \mathbf{B}^\top \mathbf{A}^\top \]
Step 1: Prove Both Sides Have the Same Dimensions
- The product \( \mathbf{A} \mathbf{B} \) results in a matrix of dimensions \( m \times n \).
- The transpose \( (\mathbf{A} \mathbf{B})^\top \) therefore has dimensions \( n \times m \).
On the right-hand side:
- \( \mathbf{B}^\top \) has dimensions \( n \times p \).
- \( \mathbf{A}^\top \) has dimensions \( p \times m \).
- The product \( \mathbf{B}^\top \mathbf{A}^\top \) has dimensions \( n \times m \).
Thus, both \( (\mathbf{A} \mathbf{B})^\top \) and \( \mathbf{B}^\top \mathbf{A}^\top \) have the same dimensions \( n \times m \).
Step 2: Prove Each Element of the Left Side Equals the Corresponding Element of the Right Side
Now, let's prove that each element of the left side equals the corresponding element of the right side.
Start with the left side \( ((\mathbf{A}\mathbf{B})^\top)_{ij} \). By the definition of the transpose:
\[ ((\mathbf{A}\mathbf{B})^\top)_{ij} = (\mathbf{A}\mathbf{B})_{ji} \]
Next, expand \( (\mathbf{A}\mathbf{B})_{ji} \) using the definition of matrix multiplication:
\[ (\mathbf{A}\mathbf{B})_{ji} = \sum_{k=1}^{p} a_{jk} b_{ki} \]
Now, let's express the terms on the right side in terms of the transposed matrices:
\[ a_{jk} = (\mathbf{A}^\top)_{kj}, \quad b_{ki} = (\mathbf{B}^\top)_{ik} \]
By interchanging the positions of the two factors, we have:
\[ \sum_{k=1}^{p} a_{jk} b_{ki} = \sum_{k=1}^{p} (\mathbf{B}^\top)_{ik} (\mathbf{A}^\top)_{kj} \]
Therefore, by the definition of matrix multiplication:
\[ ((\mathbf{A}\mathbf{B})^\top)_{ij} = (\mathbf{B}^\top \mathbf{A}^\top)_{ij} \]
Since the elements are equal for all \( i \) and \( j \), we conclude that:
\[ (\mathbf{A}\mathbf{B})^\top = \mathbf{B}^\top \mathbf{A}^\top \]
5. Transpose Reversal Law for the Product of Three Matrices
The reversal law for the transpose is applicable to the product of three matrices. If \( \mathbf{A} \) is an \( m \times p \) matrix, \( \mathbf{B} \) is a \( p \times q \) matrix, and \( \mathbf{C} \) is a \( q \times n \) matrix, then:
\[ (\mathbf{A}\mathbf{B}\mathbf{C})^\top = \mathbf{C}^\top \mathbf{B}^\top \mathbf{A}^\top \]
Explanation:
We can prove this step by step. First, group the product and apply the reversal law for two matrices:
\[ (\mathbf{A}\mathbf{B}\mathbf{C})^\top = ((\mathbf{A}\mathbf{B})\mathbf{C})^\top = \mathbf{C}^\top (\mathbf{A}\mathbf{B})^\top \]
Next, apply the reversal law to the product \( (\mathbf{A}\mathbf{B})^\top \):
\[ \mathbf{C}^\top (\mathbf{A}\mathbf{B})^\top = \mathbf{C}^\top \mathbf{B}^\top \mathbf{A}^\top \]
Thus, the transpose of the product of three matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \) is the product of their transposes in reverse order.
6. Transpose Reversal Law for the Product of Multiple Matrices
The reversal law can be generalized to the product of any number of matrices. If \( \mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_n \) are conformable matrices, then:
\[ (\mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_n)^\top = \mathbf{A}_n^\top \cdots \mathbf{A}_2^\top \mathbf{A}_1^\top \]
This property shows that when taking the transpose of a product of multiple matrices, you reverse the order of the matrices and take the transpose of each one.
7. Transpose of a Power of a Matrix
The transpose of a power of a matrix is equal to the power of the transpose of the matrix. Specifically, for a square matrix \( \mathbf{A} \) and a natural number \( n \):
\[ (\mathbf{A}^n)^\top = (\mathbf{A}^\top)^n \]
This means that if you raise a matrix to a power and then transpose the result, it is the same as transposing the matrix first and then raising it to that power. To prove this, set \( \mathbf{A}_1 = \mathbf{A}_2 = \cdots = \mathbf{A}_n = \mathbf{A} \) in the previous property.
Example
Let's consider the matrix expression \( (\mathbf{A} + \mathbf{A}\mathbf{B}^\top)^\top \). We want to find the transpose of this expression.
Starting with the matrix expression \( (\mathbf{A} + \mathbf{A}\mathbf{B}^\top)^\top \), we can expand and simplify as follows:
\[ (\mathbf{A} + \mathbf{A}\mathbf{B}^\top)^\top = \mathbf{A}^\top + (\mathbf{A}\mathbf{B}^\top)^\top = \mathbf{A}^\top + (\mathbf{B}^\top)^\top \mathbf{A}^\top = \mathbf{A}^\top + \mathbf{B}\mathbf{A}^\top \]
Thus:
\[ (\mathbf{A} + \mathbf{A}\mathbf{B}^\top)^\top = \mathbf{A}^\top + \mathbf{B}\mathbf{A}^\top \]
Trace of a Matrix
The trace of a square matrix is defined as the sum of its diagonal elements. If \( \mathbf{A} = [a_{ij}] \) is an \( n \times n \) square matrix, then the trace of \( \mathbf{A} \), denoted by \( \text{trace}(\mathbf{A}) \) or \( \text{tr}(\mathbf{A}) \), is given by:
\[ \text{trace}(\mathbf{A}) = \sum_{i=1}^{n} a_{ii} \]
This means that the trace is the sum of all elements \( a_{ii} \), i.e., the entries whose row index equals their column index.
In other words, for a matrix \( \mathbf{A} = [a_{ij}]_{n \times n} \), the trace is calculated as:
\[ \text{trace}(\mathbf{A}) = a_{11} + a_{22} + \cdots + a_{nn} \]
The trace is only defined for square matrices, as it specifically involves the sum of elements along the main diagonal.
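Computing a trace is one line in NumPy; `np.trace` sums the main diagonal. A sketch:

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

print(np.trace(A))           # 1 + 5 + 9 = 15
print(A.diagonal().sum())    # the same sum, written out explicitly
```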
Properties of the Trace of a Matrix
The trace of a matrix has several important properties:
- Trace of the Identity Matrix:
\[ \text{trace}(\mathbf{I}_n) = n \]
The trace of the identity matrix \( \mathbf{I}_n \) of order \( n \) is equal to \( n \), since all the diagonal elements are 1.
- Trace of a Transpose:
\[ \text{trace}(\mathbf{A}^\top) = \text{trace}(\mathbf{A}) \]
The trace of a matrix is equal to the trace of its transpose, as the diagonal elements remain the same.
- Trace of a Scalar Multiple:
\[ \text{trace}(k\mathbf{A}) = k \, \text{trace}(\mathbf{A}) \]
The trace of a scalar multiple of a matrix is the scalar multiplied by the trace of the matrix.
- Trace of a Sum:
\[ \text{trace}(\mathbf{A} + \mathbf{B}) = \text{trace}(\mathbf{A}) + \text{trace}(\mathbf{B}) \]
The trace of the sum of two matrices is equal to the sum of their traces.
- Trace of a Product: If \( \mathbf{A} \) is \( m \times n \) and \( \mathbf{B} \) is \( n \times m \), then both \( \mathbf{AB} \) and \( \mathbf{BA} \) exist and have dimensions \( m \times m \) and \( n \times n \), respectively. In that case, \( \text{trace}(\mathbf{AB}) = \text{trace}(\mathbf{BA}) \).
Proof:
\[ \text{trace}(\mathbf{AB}) = \sum_{i=1}^{m} (\mathbf{AB})_{ii} = \sum_{i=1}^{m} \sum_{j=1}^{n} (\mathbf{A})_{ij} (\mathbf{B})_{ji} = \sum_{j=1}^{n} \sum_{i=1}^{m} (\mathbf{B})_{ji} (\mathbf{A})_{ij} = \sum_{j=1}^{n} (\mathbf{BA})_{jj} = \text{trace}(\mathbf{BA}) \]
Example
Let's consider an example where \( \mathbf{A} \) is a \( 2 \times 3 \) matrix and \( \mathbf{B} \) is a \( 3 \times 2 \) matrix. We want to demonstrate the property that \( \text{trace}(\mathbf{A} \mathbf{B}) = \text{trace}(\mathbf{B} \mathbf{A}) \).
\[ \mathbf{A} = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}, \quad \mathbf{B} = \begin{bmatrix} 7 & 8 \\ 9 & 10 \\ 11 & 12 \end{bmatrix} \]The product \( \mathbf{A} \mathbf{B} \) will be a \( 2 \times 2 \) matrix:
\[ \mathbf{A} \mathbf{B} = \begin{bmatrix} 1 \cdot 7 + 2 \cdot 9 + 3 \cdot 11 & 1 \cdot 8 + 2 \cdot 10 + 3 \cdot 12 \\ 4 \cdot 7 + 5 \cdot 9 + 6 \cdot 11 & 4 \cdot 8 + 5 \cdot 10 + 6 \cdot 12 \end{bmatrix} \]\[ \mathbf{A} \mathbf{B} = \begin{bmatrix} 58 & 64 \\ 139 & 154 \end{bmatrix} \]The product \( \mathbf{B} \mathbf{A} \) will be a \( 3 \times 3 \) matrix:
\[ \mathbf{B} \mathbf{A} = \begin{bmatrix} 7 \cdot 1 + 8 \cdot 4 & 7 \cdot 2 + 8 \cdot 5 & 7 \cdot 3 + 8 \cdot 6 \\ 9 \cdot 1 + 10 \cdot 4 & 9 \cdot 2 + 10 \cdot 5 & 9 \cdot 3 + 10 \cdot 6 \\ 11 \cdot 1 + 12 \cdot 4 & 11 \cdot 2 + 12 \cdot 5 & 11 \cdot 3 + 12 \cdot 6 \end{bmatrix} \]\[ \mathbf{B} \mathbf{A} = \begin{bmatrix} 39 & 54 & 69 \\ 49 & 68 & 87 \\ 59 & 82 & 105 \end{bmatrix} \]Now let us find the trace of each.
- Trace of \( \mathbf{A} \mathbf{B} \):
\[ \text{trace}(\mathbf{A} \mathbf{B}) = 58 + 154 = 212 \]
- Trace of \( \mathbf{B} \mathbf{A} \):
\[ \text{trace}(\mathbf{B} \mathbf{A}) = 39 + 68 + 105 = 212 \]
Hence, both traces are equal, showing that:
\[ \text{trace}(\mathbf{A} \mathbf{B}) = \text{trace}(\mathbf{B} \mathbf{A}) = 212 \]
- Cyclic Property of the Trace:
For matrices \( \mathbf{A} \) of dimensions \( m \times p \), \( \mathbf{B} \) of dimensions \( p \times q \), and \( \mathbf{C} \) of dimensions \( q \times m \), the trace function exhibits a cyclic permutation property:
\[ \text{trace}(\mathbf{A} \mathbf{B} \mathbf{C}) = \text{trace}(\mathbf{C} \mathbf{A} \mathbf{B}) = \text{trace}(\mathbf{B} \mathbf{C} \mathbf{A}) \]To prove this property, we use the fact that \( \text{trace}(\mathbf{XY}) = \text{trace}(\mathbf{YX}) \).
Starting with \( \text{trace}(\mathbf{A} \mathbf{B} \mathbf{C}) \), we can group the first two matrices to write it as \( \text{trace}((\mathbf{A} \mathbf{B}) \mathbf{C}) \). Applying the property \( \text{trace}(\mathbf{XY}) = \text{trace}(\mathbf{YX}) \), this becomes \( \text{trace}(\mathbf{C} (\mathbf{A} \mathbf{B})) \). By the associative property of matrix multiplication, this simplifies to \( \text{trace}(\mathbf{C} \mathbf{A} \mathbf{B}) \).
Similarly, we can cyclically permute the matrices further:
\[ \text{trace}(\mathbf{C} \mathbf{A} \mathbf{B}) = \text{trace}((\mathbf{C} \mathbf{A}) \mathbf{B}) = \text{trace}(\mathbf{B} (\mathbf{C} \mathbf{A})) = \text{trace}(\mathbf{B} \mathbf{C} \mathbf{A}) \]Thus, we have shown that:
\[ \text{trace}(\mathbf{A} \mathbf{B} \mathbf{C}) = \text{trace}(\mathbf{C} \mathbf{A} \mathbf{B}) = \text{trace}(\mathbf{B} \mathbf{C} \mathbf{A}) \]This proves that the trace of the product of matrices \( \mathbf{A} \), \( \mathbf{B} \), and \( \mathbf{C} \) remains invariant under cyclic permutations.
- Cyclic Property of the Trace for Multiple Matrices:
The cyclic property of the trace extends to any number of matrices. For matrices \( \mathbf{A}_1, \mathbf{A}_2, \ldots, \mathbf{A}_n \) where \( \mathbf{A}_1 \) is \( m \times p_1 \), \( \mathbf{A}_2 \) is \( p_1 \times p_2 \), \( \mathbf{A}_3 \) is \( p_2 \times p_3 \), and so on, up to \( \mathbf{A}_n \) being \( p_{n-1} \times m \), the trace of their product has the following property:
\[ \text{trace}(\mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_n) = \text{trace}(\mathbf{A}_n \mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_{n-1}) = \text{trace}(\mathbf{A}_{n-1} \mathbf{A}_n \mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_{n-2}) = \cdots \]
- Trace of \( \mathbf{A}\mathbf{A}^\top \) and \( \mathbf{A}^\top\mathbf{A} \):
For any matrix \( \mathbf{A} \) of dimensions \( m \times n \), the trace of the products \( \mathbf{A} \mathbf{A}^\top \) and \( \mathbf{A}^\top \mathbf{A} \) is equal to the sum of the squares of all elements of \( \mathbf{A} \):
\[ \text{trace}(\mathbf{A} \mathbf{A}^\top) = \text{trace}(\mathbf{A}^\top \mathbf{A}) = \sum_{i=1}^{m} \sum_{j=1}^{n} a_{ij}^2 \]Proof:
Given \( \mathbf{A} \) is \( m \times n \), the transpose \( \mathbf{A}^\top \) is \( n \times m \). Therefore, the matrices \( \mathbf{A}\mathbf{A}^\top \) and \( \mathbf{A}^\top\mathbf{A} \) are both square, with dimensions \( m \times m \) and \( n \times n \) respectively. Since both are square matrices, their traces are defined.
Using the property of trace that \( \text{trace}(\mathbf{XY}) = \text{trace}(\mathbf{YX}) \), we can establish that:
\[ \text{trace}(\mathbf{A}\mathbf{A}^\top) = \text{trace}(\mathbf{A}^\top \mathbf{A}). \]To understand why these traces are also the sum of the squares of all elements in \( \mathbf{A} \), consider the definition of the trace of \( \mathbf{A} \mathbf{A}^\top \):
\[ \text{trace}(\mathbf{A} \mathbf{A}^\top) = \sum_{i=1}^{m} (\mathbf{A} \mathbf{A}^\top)_{ii} \]\[ = \sum_{i=1}^{m} \sum_{j=1}^{n} (\mathbf{A})_{ij} (\mathbf{A}^\top)_{ji} \]\[ = \sum_{i=1}^{m} \sum_{j=1}^{n} (\mathbf{A})_{ij} (\mathbf{A})_{ij} \]\[ = \sum_{i=1}^{m} \sum_{j=1}^{n} (\mathbf{A}_{ij})^2 \]
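To close, here is a compact NumPy check of the trace properties above — \( \text{trace}(\mathbf{AB}) = \text{trace}(\mathbf{BA}) \), the cyclic property, and \( \text{trace}(\mathbf{A}\mathbf{A}^\top) = \sum_{i,j} a_{ij}^2 \) — using the matrices from the worked example (the matrix \( \mathbf{C} \) is an arbitrary extra factor chosen so that all three cyclic products are defined):

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])        # 2x3
B = np.array([[7, 8], [9, 10], [11, 12]])   # 3x2

print(np.trace(A @ B), np.trace(B @ A))     # 212 212

# Cyclic property: shapes 2x3, 3x2, 2x2 make ABC, CAB, and BCA all defined.
C = np.array([[1, 2], [3, 4]])
print(np.trace(A @ B @ C) == np.trace(C @ A @ B) == np.trace(B @ C @ A))  # True

# trace(A A^T) = trace(A^T A) = sum of the squares of all entries of A.
print(np.trace(A @ A.T), np.trace(A.T @ A), (A ** 2).sum())   # 91 91 91
```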