
This notebook contains an excerpt from Python Programming and Numerical Methods - A Guide for Engineers and Scientists; the content is also available at Berkeley Python Numerical Methods.

The copyright of the book belongs to Elsevier. We also have this interactive book online for a better learning experience. The code is released under the MIT license. If you find this content useful, please consider supporting the work on Elsevier or Amazon!


Complexity and Big-O Notation

The complexity of a function is the relationship between the size of the input and the difficulty of running the function to completion. The size of the input is usually denoted by \(n\); in practice, \(n\) describes something tangible, such as the length of an array. The difficulty of a problem can be measured in several ways. One suitable way is to count basic operations: additions, subtractions, multiplications, divisions, assignments, and function calls. Although each basic operation takes a different amount of time, the number of basic operations needed to complete a function is sufficiently related to the running time to be useful, and it is much easier to count.

TRY IT! Count the number of basic operations, in terms of \(n\), required for the following function to terminate.

def f(n):
    out = 0
    for i in range(n):
        for j in range(n):
            out += i*j
            
    return out

Let’s calculate the number of operations:

additions: \(n^2\), subtractions: 0, multiplications: \(n^2\), divisions: 0, assignments: \(2n^2 + n + 1\), function calls: 0, total: \(4n^2 + n + 1\).

The number of assignments is \(2n^2 + n + 1\) because the line \(out\) += \(i*j\) is evaluated \(n^2\) times, \(j\) is assigned \(n^2\) times, \(i\) is assigned \(n\) times, and the line \(out = 0\) is assigned once. So, the complexity of the function \(f\) can be described as \(4n^2 + n + 1\).
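
As a sanity check, we can instrument a copy of \(f\) with a tally of the basic operations as they happen. This counted version is our own sketch, not part of the original text; its result should match \(4n^2 + n + 1\).

def f_counted(n):
    ops = 1              # out = 0 is one assignment
    out = 0
    for i in range(n):
        ops += 1         # i is assigned (n times in total)
        for j in range(n):
            ops += 1     # j is assigned (n*n times in total)
            out += i*j
            ops += 3     # one multiplication, one addition, one assignment
    return ops

print(f_counted(10))     # prints 411, which equals 4*10**2 + 10 + 1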

A common notation for complexity is Big-O notation. Big-O notation describes how the number of basic operations grows with respect to the size of the input as the input size becomes very large. Since hardware differs from machine to machine, we cannot accurately calculate how long a function will take to complete without also evaluating the hardware, and that analysis would only be valid for that specific machine. We do not really care how long a specific set of inputs takes on a specific machine. Instead, we analyze how quickly the “time to completion,” measured in basic operations, grows as the input size grows, because this analysis is hardware independent. As \(n\) gets large, the highest power dominates; therefore, only the highest power term is included in Big-O notation. Additionally, coefficients are not required to characterize growth, so coefficients are also dropped. In the previous example, we counted \(4n^2 + n + 1\) basic operations to complete the function; in Big-O notation, we say the function is \(O(n^2)\) (pronounced “O of n-squared”). Any algorithm with complexity \(O(n^c)\), where \(c\) is some constant with respect to \(n\), is said to be polynomial time.
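
A quick numerical check (ours, not the book's) shows why the lower-order terms are dropped: the exact count \(4n^2 + n + 1\) divided by \(n^2\) approaches the constant 4 as \(n\) grows, so only the \(n^2\) term matters for large inputs.

for n in [10, 100, 1000, 10000]:
    exact = 4*n**2 + n + 1
    print(n, exact / n**2)   # the ratio tends to 4, so the n and 1 terms are negligible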

TRY IT! Determine the complexity of the iterative Fibonacci function in Big-O notation.

def my_fib_iter(n):
    # Return a list of the first n Fibonacci numbers (as written, assumes n >= 2).
    out = [1, 1]
    
    for i in range(2, n):
        out.append(out[i - 1] + out[i - 2])
        
    return out

Since the only lines of code that take more time as \(n\) grows are those in the for-loop, we can restrict our attention to the for-loop and the code block within it. The code within the for-loop does not grow with respect to \(n\) (i.e., it is constant). Therefore, the number of basic operations is \(Cn\) where \(C\) is some constant representing the number of basic operations that occur in the for-loop, and these \(C\) operations run \(n\) times. This gives a complexity of \(O(n)\) for \(my\_fib\_iter\).
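
To see the linear growth directly, here is a small sketch of our own (the iteration counter is not in the original function) that counts passes through the loop body; doubling \(n\) roughly doubles the count.

def my_fib_iter_counted(n):
    iterations = 0
    out = [1, 1]
    for i in range(2, n):
        out.append(out[i - 1] + out[i - 2])
        iterations += 1   # one count per pass through the loop body
    return iterations

for n in [10, 20, 40, 80]:
    print(n, my_fib_iter_counted(n))   # n - 2 iterations: the work grows linearly in n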

Assessing the exact complexity of a function can be difficult. In these cases, it might be sufficient to give an upper bound or even an approximation of the complexity.

TRY IT! Give an upper bound on the complexity of the recursive implementation of Fibonacci. Do you think it is a good approximation of the upper bound? Do you think that recursive Fibonacci could possibly be polynomial time?

def my_fib_rec(n):
    # Recursively compute the nth Fibonacci number, with fib(0) = fib(1) = 1.
    if n < 2:
        out = 1
    else:
        out = my_fib_rec(n-1) + my_fib_rec(n-2)
        
    return out

As \(n\) gets large, the vast majority of function calls make two other function calls, plus one addition and one assignment to the output. The work per call (one addition and one assignment) does not grow with \(n\), so we can ignore it in Big-O notation. However, the number of function calls grows approximately as \(2^n\) (each call can spawn two more), so the complexity of \(my\_fib\_rec\) is bounded above by \(O(2^n)\).

As an upper bound, \(O(2^n)\) is valid but not tight: the number of recursive calls actually grows like \(\varphi^n\), where \(\varphi \approx 1.618\) is the golden ratio, so the true growth rate is somewhat slower, though still exponential.

Since the number of recursive calls grows exponentially with \(n\), there is no way the recursive Fibonacci function could be polynomial time. That is, for any constant \(c\), there is an \(n\) beyond which \(my\_fib\_rec\) takes more than \(n^c\) basic operations to complete. Any function that is \(O(c^n)\) for some constant \(c > 1\) is said to be exponential time.
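
We can count the calls directly. The version below is our own instrumented rewrite (not from the text) that returns the Fibonacci value along with the total number of function calls made; the call count grows exponentially with \(n\) while staying below \(2^n\).

def my_fib_rec_counted(n):
    # returns (Fibonacci value, total number of calls made)
    if n < 2:
        return 1, 1
    left, left_calls = my_fib_rec_counted(n - 1)
    right, right_calls = my_fib_rec_counted(n - 2)
    return left + right, left_calls + right_calls + 1

for n in [5, 10, 15, 20]:
    value, calls = my_fib_rec_counted(n)
    print(n, calls, 2**n)   # calls: 15, 177, 1973, 21891 -- exponential, but well below 2**n

A short induction shows the call count here equals \(2\,\mathrm{fib}(n) - 1\), which is why it grows like the Fibonacci numbers themselves, i.e., like \(\varphi^n\), rather than like \(2^n\).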

TRY IT! What is the complexity of the following function in Big-O notation?

def my_divide_by_two(n):
    # Count how many times n can be halved before it is no longer greater than 1.
    out = 0
    while n > 1:
        n /= 2
        out += 1
        
    return out

Again, only the while-loop runs longer for larger \(n\), so we can restrict our attention there. Each iteration of the while-loop performs two assignments, one division, and one addition, all of which are constant time with respect to \(n\). So the complexity depends only on how many times the while-loop runs.

The while-loop cuts \(n\) in half in every iteration until \(n\) is no longer greater than 1. So the number of iterations, \(I\), is the solution to the equation \(\frac{n}{2^I} = 1\). With some manipulation, this solves to \(I = \log_2 n\), so the complexity of \(my\_divide\_by\_two\) is \(O(\log n)\). It does not matter what the base of the log is because, by the change-of-base rule, logarithms in different bases are scalar multiples of each other. Any function with complexity \(O(\log n)\) is said to be log time.
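
As an informal check (using the function defined above), the loop count tracks \(\log_2 n\), rounding up when \(n\) is not a power of 2.

import math

for n in [10, 100, 1000, 10**6]:
    print(n, my_divide_by_two(n), math.log2(n))   # the count is ceil(log2(n)) for these values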
