Sigma Notation
Introduction
Sigma notation is used to write sums of many terms in a compact way.
It uses Greek letter sigma, which is why it is called sigma-notation.
Here, \(i\) is called the running index. The running index in this case varies through integer values from lower limit 1 to the upper limit \(n\). \(a_i\) is called the summand.
In general, given a sequence \( \{ a_k \}_{k=l}^{u} \), then the summation from over \(k\) from \(l\) to \(u\) is
The above form of the summation is called as delimited form.
General Sigma Notation
General sigma notation is a way to write sigma notation in the form \(\sum_{P(k)} a_k\), where instead of giving the limits we mention the property \(P(k)\) which is satisfied by integers and we take the summation over those integers only for which \(P(k)\) is true.
e.g. \(\sum_{1 \leq k \leq n} a_k\), means the same thing as \(\sum_{i=1}^{n} a_i\),
which in generalized form can be written as \(\sum_{\substack{1 \leq k \leq 19 \\ k \text{ is odd}}} (a_k)\),
Sometimes the generalized form is much better than the delimited form.
Expanding a summation
Expanding the summation notation means expressing the compact form of a sum represented by the sigma symbol \( \Sigma \) into its individual terms. To do this, you follow the index of the summation from its lower limit to its upper limit, writing out each term in the sequence. Here’s how it works with a couple of examples:
To expand the summation \( \sum_{k=1}^{n} \frac{k}{k+1} \), you would write out the terms of the sequence for each value of \( k \) from 1 to \( n \). Here is how the expansion looks:
Simplifying each fraction, you get:
This gives you a clear picture of the series of terms that are being summed.
Well, that is easy.
Condensing a Series in Sigma Notation
This is the opposite on expansion. Sometimes you are given a series in expanded form and you need to write in summation form. If the general term, like nth term, rth term, etc is known we can easily write in sigma form. But when the general term is not given, we may have to identify the general term by observing some patterns. It can be understood via some examples.
Example
Consider the series \(S_n\) given by the expression:
To express this series using sigma notation, let's examine its structure. Each term is the product of two consecutive integers, with the first factor being the position of the term in the series and the second factor being one more than the first. This observation allows us to define the general term of the series as \(t_k = k(k + 1)\), for the \(k\)-th term.
Therefore, the sum of the first \(n\) terms of the series can be represented compactly as:
Example
Now consider the series \(S_n = \frac{1}{1 \cdot 2} + \frac{3}{2 \cdot 3} + \frac{5}{3 \cdot 4} + \ldots\) up to \(n\) terms. To condense it in sigma notation, we need to identify the pattern of the terms.
In each term of this series, the numerator is an odd number, and the denominator is the product of consecutive integers. Specifically, the numerator starts at 1 and increases by 2 with each term (1, 3, 5, ...), and the denominator is the product of \(k\) and \(k+1\) where \(k\) starts at 1.
The odd numbers in the numerator can be represented by the formula \(2k-1\), which generates the sequence of odd numbers as \(k\) increases (when \(k=1\), \(2k-1=1\); when \(k=2\), \(2k-1=3\); and so on).
Thus, the \(k\)-th term in the series can be written as:
Now we can express \(S_n\) using sigma notation, which sums the general term from \(k = 1\) to \(k = n\):
This notation succinctly encapsulates the entire series, showing that for each integer \(k\) from 1 to \(n\), we take the odd number \(2k - 1\) and divide it by the product of \(k\) and \(k + 1\).
Example
To express the series \(S_n = 1\cdot3\cdot5 - 3\cdot5\cdot7 + 5\cdot7\cdot9 - 7\cdot9\cdot11 + \ldots\) up to \(n\) terms in sigma notation, observe the pattern in the series:
- Each term consists of the product of three consecutive odd numbers.
- The signs alternate between positive and negative.
Notice the first number in each term: \(1, 3, 5, 7, \ldots\) can be represented as \(2k-1\) where \(k = 1, 2, 3, \ldots, n\). This gives us the starting odd number for each term.
Given this, the \(k\)-th term of the series can be written using the first odd number as \( (2k-1) \), the second as \( (2k+1) \), and the third as \( (2k+3) \).
However, to account for the alternating signs, we use \( (-1)^{k+1} \) as a multiplier, which alternates sign for each term in the series.
Thus, the series can be condensed into the sigma notation as:
This notation captures the pattern of alternating signs and the product of three consecutive odd numbers starting from \(1, 3, 5, \ldots\) for each term of the series up to \(n\) terms.
Properties of sigma notation:
-
Dummy variable:
\[ \sum_{i=1}^{n} a_i = \sum_{j=1}^{n} a_j \]The index of summation is a “dummy” variable. The value of summation does not change if you change the index variable. It is bound to the sigma symbol and has no existence out of it.
Proof:
The notation \(\sum_{i=1}^{n} a_i\) simply means to sum the terms \(a_1, a_2, \ldots, a_n\). The index \(i\) is an arbitrary placeholder that indicates the position of each term in the sequence; it has no value outside the context of the sigma notation. Therefore, whether we write \(i\), \(j\), or any other letter as the index, the sum remains the same:
\[ \sum_{i=1}^{n} a_i = a_1 + a_2 + \ldots + a_n = \sum_{j=1}^{n} a_j \]Since the summands \(a_i\) and \(a_j\) represent the same sequence of terms, the sum is unaffected by the choice of index variable.
-
Splitting the Summation Range:
\[ \sum_{k=1}^{n} a_k = \sum_{k=1}^{m} a_k + \sum_{k=m+1}^{n} a_k \]where \( 1 \leq m \leq n \): We can “break” the summation into two summations.
Proof:
Consider the sum of the first \(n\) terms of a sequence. We can partition this sum at an arbitrary point \(m\) where \(1 \leq m \leq n\). The sum from 1 to \(n\) is equivalent to the sum from 1 to \(m\) plus the sum from \(m+1\) to \(n\):
\[ \sum_{k=1}^{n} a_k = (a_1 + \ldots + a_m) + (a_{m+1} + \ldots + a_n) \]This can be rewritten using sigma notation for both parts:
\[ \sum_{k=1}^{n} a_k = \sum_{k=1}^{m} a_k + \sum_{k=m+1}^{n} a_k \]This demonstrates that we can "break" the summation into two summations over adjacent intervals.
-
Factoring Out Constants:
\[ \sum_{k=1}^{n} c a_k = c \sum_{k=1}^{n} a_k \]A constant which is independent of index variable can be taken outside the sigma operator.
Proof:
Let \(c\) be a constant. The sum of the sequence terms each multiplied by \(c\) from 1 to \(n\) is:
\[ \sum_{k=1}^{n} c a_k = c a_1 + c a_2 + \ldots + c a_n \]Since \(c\) is a common factor in each term, we can factor it out:
\[ \sum_{k=1}^{n} c a_k = c (a_1 + a_2 + \ldots + a_n) \]Now, applying the sigma notation to the sum of \(a_k\):
\[ \sum_{k=1}^{n} c a_k = c \sum_{k=1}^{n} a_k \]This shows that a constant factor can be moved outside the sigma notation, applying it to the sum as a whole.
-
Distributive Property
\[ \sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k \]Proof:
\[ \sum_{k=1}^{n} (a_k + b_k) = (a_1 + b_1) + (a_2 + b_2) + (a_3 + b_3) + \ldots + (a_n + b_n) \]\[ \Rightarrow \sum_{k=1}^{n} (a_k + b_k) = (a_1 + a_2 + a_3 + \ldots + a_n) + (b_1 + b_2 + b_3 + \ldots + b_n) \]\[ \Rightarrow \sum_{k=1}^{n} (a_k + b_k) = \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k \] -
Substitution of index variable:
\[ \sum_{k=1}^{n} a_k = \sum_{k=m}^{n+m-1} a_{k-m+1} \]Many a times substitution of index variable may simplify the summation and it becomes easier to do the problem.
Let there be a summation \( S = \sum_{k=1}^{n} a_k \). Suppose that for some reason we wish to change the lower limit from 1 to \( m \). Then we substitute \( k - 1 = j - m \) \(\Rightarrow k = j - m + 1 \)
We get \(\sum_{k=1}^{n} a_k = \sum_{j=m+1}^{n+m} a_{j-m+1} = \sum_{j=m}^{n+m-1} a_{j-m+1} \)
So now the lower limit is \( m \)
But since \( j \) is a dummy variable we may change it back to \( k \)
i.e. \( \sum_{k=1}^{n} a_k = \sum_{k=m}^{n+m-1} a_{k-m+1} \)
More frequently though, suppose the lower limit of summation is \( m \) and we wish to change it to 1 in the summation \( \sum_{k=m}^{n} a_k \), then we substitute \( k - m = j - 1 \) \( \Rightarrow k = m + j - 1 \)
Thus we get, \( \sum_{k=m}^{n} a_k = \sum_{m+j-1=j}^{n} a_{m+j-1} = \sum_{j=1}^{n-m+1} a_{j+m-1} \) \( = \sum_{k=1}^{n-m+1} a_{k+m-1} \) (j is a dummy variable)
-
Reversing the series:
\[ \sum_{k=0}^{n} a_k = \sum_{k=0}^{n} a_{n-k} \]Observe that LHS \( = \sum_{k=0}^{n} a_k = a_0 + a_1 + a_2 + \ldots + a_n \), and RHS \( = \sum_{k=0}^{n} a_{n-k} = a_n + a_{n-1} + \ldots + a_0 \)
The second series is opposite order. This is very helpful in calculating the sum of many series.
Observe that index \(k\) begins from \(0\). When \(k\) begins from \(1\), we have,
\[ \sum_{k=1}^{n} a_k = \sum_{k=1}^{n} a_{n-k+1} \]
Double Summation
First consider this illustrative example. Basically double summation is summation of a summation.
The double summation for the expression \( i^2 + j^2 \), where \( i \) ranges from 1 to 4 and \( j \) ranges from 1 to 3, can be written as:
In the above expression, there are two summations. Inside the bracket is the inner summation and outside is outer summation.
\(i\) is the index of the outer summation and \(j\) is the index of the outer summation.
For the inner summation \(i\) is basically a constant, only \(j\) varies.
Expanding the inner summation first by substituting the values of \( j \):
When \( j = 1 \): \( i^2 + 1^2 = i^2 + 1 \)
When \( j = 2 \): \( i^2 + 2^2 = i^2 + 4 \)
When \( j = 3 \): \( i^2 + 3^2 = i^2 + 9 \)
So, for each \( i \), the inner summation is:
Thus, we get
Now expand this summation.
The full expansion will be:
This can be represented as a matrix where the element in the \(i\)-th row and \(j\)-th column is \( i^2 + j^2 \):
This matrix represents all terms and double summation is the sum of all elements in this matrix.
Definition
Double summation is a mathematical operation that involves taking the summation of a summation, effectively summing over two indices. It is denoted by two sigma symbols, one nested inside the other, and is used to sum terms that are dependent on two different indices. For terms denoted by \( a_{ij} \), which are functions of two indices \( i \) and \( j \), the double summation is defined as:
Here, the inner summation \( \sum_{j=1}^{n} a_{ij} \) is performed first for a fixed value of \( i \), summing over the index \( j \). Once this is done for every \( i \), the outer summation \( \sum_{i=1}^{m} \) is then executed, summing the results of the inner summations across the index \( i \).
This operation can be visualized as summing the entries of an \( m \times n \) matrix, where \( m \) represents the number of rows (controlled by the outer summation index \( i \)), and \( n \) represents the number of columns (controlled by the inner summation index \( j \)). Each term \( a_{ij} \) in the matrix is summed to produce the total sum of all entries.
Notation
It is not necessary to enclose the inner summation in brackets, as we did above in the definition. The summation can be written succinctly without brackets, and it is understood that the inner summation is to be performed first. Here's the brief notation for a general double summation:
This implies that for each fixed \( i \), the sum over \( j \) is to be calculated first, and then the resulting sums are to be added together as \( i \) varies from 1 to \( m \).
The double summation is commutative
Explanation:
This mathematical statement expresses the commutative nature of double summation. It means that if you have a matrix or a two-dimensional array of numbers \( a_{ij} \), where \( i \) and \( j \) are indices running from 1 to \( m \) and 1 to \( n \) respectively, you can sum the elements in two ways. You can either add up the elements row by row (summing across \( j \) for each \( i \)) and then sum the results, or add up the elements column by column (summing across \( i \) for each \( j \)) and then sum those results. In either case, the final sum will be the same, demonstrating the commutative property of addition in this context.
Visualizing all terms as an \( m \times n \) matrix where double summation is the sum of all elements in the matrix.
If we sum column by column, i.e.,
it is equal to
which can be expressed as the double summation
Factorization of Double Summations with Separable Terms
If the general term \( a_{ij} \) can be expressed as the product of two terms \( b_i \) and \( c_j \), where \( b_i \) is solely dependent on the index \( i \) and \( c_j \) is solely dependent on the index \( j \), then the double summation of \( a_{ij} \) across the indices \( i \) and \( j \) can be simplified. This is because the summation can be separated into the product of two single summations due to the independence of \( b_i \) and \( c_j \):
Since \( b_i \) does not depend on \( j \) and \( c_j \) does not depend on \( i \), the summation can be decomposed into:
Proof:
To prove that a double summation of a product of terms that each depend on only one index can be factorized, let's start with the general term \( a_{ij} = b_i c_j \), and perform the double summation:
This can be written as:
Expanding the inner summation, we have:
Since \( b_i \) is independent of \( j \), it can be taken outside the inner summation:
Now, \( b_i \) is still inside the outer summation, so we continue to the next step. The inner sum \( \sum_{j=1}^{n} c_j \) is a constant with respect to \( i \) because it does not depend on \( i \). Therefore, it can be factored out completely, leading to:
This factorization separates the double summation into the product of two separate summations, each over a single index. The final expression represents the total sum of the matrix \( a_{ij} \) when each term is the product of a term dependent only on \( i \) and another term dependent only on \( j \).
For example:
-
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} ij = \left( \sum_{i=1}^{m} i \right) \left( \sum_{j=1}^{n} j \right) \]
-
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} i^2j = \left( \sum_{i=1}^{m} i^2 \right) \left( \sum_{j=1}^{n} j \right) \]
-
\[ \sum_{i=1}^{m} \sum_{j=1}^{n} i^2j^2 = \left( \sum_{i=1}^{m} i^2 \right) \left( \sum_{j=1}^{n} j^2 \right) \]
The curious case of Dependent Indices
Consider \(\sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij}\). Here \( m = n \), therefore we get a square matrix. It is the sum of all elements in the square matrix.
Now consider \(\sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij}\). Observe that the lower limit of the inner summation dependent on the index of the outer summation. If we expand this we get,
Expanding the summation \(\sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij}\):
For \(i = 1\), the terms are:
For \(i = 2\), the terms are:
For \(i = 3\), the terms are:
Continuing this pattern until \(i = n\), where the term is:
Thus, the expanded summation is:
If you observe the above expansion carefully, the summation \( \sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij} \) corresponds to summing the elements in the upper triangular portion of the matrix, including the diagonal. This means that for each row \( i \), we start summing from the element on the diagonal (\( a_{ii} \)) to the end of the row (\( a_{in} \)).
Here's a visual explanation using a matrix:
In this matrix:
- Red elements are those that are included in the summation.
- Non-colored elements are not included in the summation.
The summation starts at the diagonal element \( a_{ii} \) of each row \( i \) and includes all elements to the right of the diagonal in that row, up to and including \( a_{in} \). Therefore, the sum is the total of all red-highlighted elements, which form the upper triangular part of the matrix.
Alternative Notation
-
\[ \sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij} \text{ can be written as } \sum_{1 \leq i \leq j \leq n} a_{ij} \]
In the matrix above the elements in red, the index \(i\) in always less than or euqal to index \(j\). That's why we can write the double summation alternatively as \(\sum_{1 \leq i \leq j \leq n} a_{ij} \).
\[ \begin{bmatrix} \color{red}a_{11} & \color{red}a_{12} & \color{red}a_{13} & \color{red}\cdots & \color{red}a_{1n} \\ a_{21} & \color{red}a_{22} & \color{red}a_{23} & \color{red}\cdots & \color{red}a_{2n} \\ a_{31} & a_{32} & \color{red}a_{33} & \color{red}\cdots & \color{red}a_{3n} \\ \vdots & \vdots & \vdots & \color{red}\ddots & \color{red}\vdots \\ a_{n1} & a_{n2} & a_{n3} & \color{red}\cdots & \color{red}a_{nn} \end{bmatrix} \]Similarly,
-
\[ \sum_{i=1}^{n} \sum_{j=i+1}^{n} a_{ij} = \sum_{1 \leq i < j \leq n} a_{ij} \]
Visual representation:
\[ \begin{bmatrix} a_{11} & \color{red}a_{12} & \color{red}a_{13} & \ldots & \color{red}a_{1n} \\ a_{21} & a_{22} & \color{red}a_{23} & \ldots & \color{red}a_{2n} \\ a_{31} & a_{32} & a_{33} & \ldots & \color{red}a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \ldots & a_{nn} \end{bmatrix} \]In this matrix, the elements \( a_{ij} \) where \( i < j \) (above the main diagonal) are colored red. This visual representation corresponds to the sum:
\[ \sum_{i=1}^{n} \sum_{j=i+1}^{n} a_{ij} \]This sum includes only the red elements from the matrix.
-
\[ \sum_{j=1}^{n} \sum_{i=j}^{n} a_{ij} = \sum_{1 \leq j \leq i \leq n} a_{ij} \]
Visually, it is the sum of all elements in the lower triangle including the diagonal.
\[ \begin{bmatrix} \color{red}a_{11} & a_{12} & a_{13} & \ldots & a_{1n} \\ \color{red}a_{21} & \color{red}a_{22} & a_{23} & \ldots & a_{2n} \\ \color{red}a_{31} & \color{red}a_{32} & \color{red}a_{33} & \ldots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \color{red}a_{n1} & \color{red}a_{n2} & \color{red}a_{n3} & \ldots & \color{red}a_{nn} \end{bmatrix} \] -
\[ \sum_{j=1}^{n} \sum_{i=j+1}^{n} a_{ij} = \sum_{1 \leq j < i \leq n} a_{ij} \]\[ \begin{bmatrix} a_{11} & a_{12} & a_{13} & \ldots & a_{1n} \\ \color{red}a_{21} & a_{22} & a_{23} & \ldots & a_{2n} \\ \color{red}a_{31} & \color{red}a_{32} & a_{33} & \ldots & a_{3n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \color{red}a_{n1} & \color{red}a_{n2} & \color{red}a_{n3} & \ldots & a_{nn} \end{bmatrix} \]
Excluding the main diagonal, the sum of the lower triangular matrix.
Commuting summations in case of dependent indices
We know that when indices are not dependent, the double summation is commutative. On the other hand, when indices are dependent, we can interchange summations order with the following rules. If you observe the matrix representation, these rules quite easy to understand.
-
Upper Triangular including diagonal
\[ \sum_{j=1}^{n} \sum_{i=1}^{j} a_{ij} = \sum_{i=1}^{n} \sum_{j=i}^{n} a_{ij} \] -
Upper Triangular excluding diagonal
\[ \sum_{i=1}^{n} \sum_{j=i+1}^{n} a_{ij} = \sum_{j=2}^{n} \sum_{i=1}^{j-1} a_{ij} \] -
Lower Triangular including Diagonal
\[ \sum_{j=1}^{n} \sum_{i=j}^{n} a_{ij} = \sum_{i=1}^{n} \sum_{j=1}^{i} a_{ij} \] -
Lower Triangular excluding Diagonal
\[\sum_{j=1}^{n} \sum_{i=j+1}^{n} a_{ij} =\sum_{i=2}^{n} \sum_{j=1}^{i-1} a_{ij} \]
Symmetry
When \( a_{ij} = a_{ji} \), we get a very important result that
And