The question we will try to answer in this blog post is as follows: How can we measure the effectiveness/performance of an algorithm?
First, let’s consider this quote from Bill Gates (founder of Microsoft):
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.”
So, according to Bill Gates, the length of a program (in lines of code) is not a criterion to consider when evaluating how effectively it solves a problem or how well it performs.
- A long program has not necessarily been coded in the most effective way.
- And vice versa, a shorter program does not necessarily perform better than a longer piece of code.
Big O Notation
The Big O notation is used in Computer Science to describe the performance (e.g. execution time or space used) of an algorithm.
The Big O notation can be used to compare the performance of different search algorithms (e.g. linear search vs. binary search), sorting algorithms (insertion sort, bubble sort, merge sort etc.), backtracking and heuristic algorithms, etc. It is especially useful to compare algorithms which will require a large number of steps and/or manipulate a large volume of data (e.g. Big Data algorithms).
Best Case, Average Case or Worst Case Scenario?
The Big O notation can be used to describe the best case, average case or worst case scenario of an algorithm. For instance, let’s consider a linear search (e.g. finding a user by their username in a list of 100 users). In the best case scenario, the username being searched for would be the first username in the list: the algorithm would complete the search very effectively, in just one iteration. In the worst case scenario, however, the username being searched for would be the last one in the list: the algorithm would need 100 iterations to find it.
Considering the average and worst case scenarios, we can deduce that a linear search amongst N records could take up to N iterations: there is a linear relationship between the number of records in the data set being searched and the number of iterations required in the average and worst case scenarios.
In this case, a linear search has the following time complexities in Big O notation (see the short sketch after this list):
- Best Case Scenario: Constant Notation: O(1)
- Average Case Scenario: Linear Notation: O(N)
- Worst Case Scenario: Linear Notation: O(N)
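To make this concrete, here is a minimal Python sketch (the list of 100 usernames is a hypothetical example) that counts how many iterations a linear search needs in the best and worst case scenarios:

def linear_search_iterations(usernames, target):
    # Count how many usernames are checked before the target is found.
    iterations = 0
    for username in usernames:
        iterations += 1
        if username == target:
            return iterations
    return iterations   # target not found: every record was checked

users = ["user" + str(i) for i in range(1, 101)]    # hypothetical list of 100 usernames
print(linear_search_iterations(users, "user1"))     # best case: 1 iteration
print(linear_search_iterations(users, "user100"))   # worst case: 100 iterations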
So let’s review the different types of algorithm that can be classified using the Big O Notation:
Constant Notation: O(1)
The constant notation describes an algorithm that will always complete in the same amount of time, regardless of the size of the data set.
For instance, an algorithm to retrieve the first value of a data set will always complete in one step, regardless of the number of values in the data set.
FUNCTION getFirstElement(list)
    RETURN list[0]
END FUNCTION
A hashing algorithm is an O(1) algorithm that can be used to locate a value/key very effectively when the data is stored in a hash table. It is more efficient than using a linear search O(N) or binary search O(log(N)) algorithm. (See example blog post on hashing algorithms for memory addressing.)
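As an illustration, here is a minimal Python sketch (the usernames and e-mail addresses are hypothetical sample data) using Python’s built-in dictionary, which is implemented as a hash table: a lookup by key takes roughly the same time whether the dictionary holds three entries or a million.

# Python dictionaries are hash tables: looking up a value by its key
# is an O(1) operation on average, regardless of the dictionary's size.
users = {
    "alice": "alice@example.com",    # hypothetical sample data
    "bob": "bob@example.com",
    "carol": "carol@example.com",
}

# The lookup below does not scan the whole dictionary: the key is hashed
# and the matching entry is located directly.
print(users["bob"])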
Linear Notation: O(N)
The linear notation describes an algorithm whose execution time grows in direct proportion to the size of the data set it is processing.
Algorithms such as the linear search, which are based on a single loop iterating through each value of the data set, are likely to have a linear notation O(N), though this is not always the case (e.g. binary search).
FUNCTION linearSearch(list, value)
    FOR EACH element IN list
        IF (element == value)
            RETURN true
        END IF
    NEXT
    RETURN false
END FUNCTION
Polynomial Notation: O(N^2), O(N^3), etc.
Polynomial algorithms include quadratic algorithms O(N^2), cubic algorithms O(N^3) and so on:
- O(N^2) represents an algorithm whose performance is directly proportional to the square of the size of the data set.
- O(N^3) represents an algorithm whose performance is directly proportional to the cube of the size of the data set.
- etc.
Algorithms which are based on nested loops are likely to have a polynomial notation such as O(N^2) or O(N^3), depending on the level of nesting.
PROCEDURE displayTimesTable()
    FOR i FROM 1 TO 10
        FOR j FROM 1 TO 10
            product = i * j
            OUTPUT i + " times " + j + " equals " + product
        NEXT j
    NEXT i
END PROCEDURE
Typically, O(N^2) algorithms can be found when manipulating 2-dimensional arrays, O(N^3) algorithms when manipulating 3-dimensional arrays, and so on.
PROCEDURE emptyChessboardGrid()
    FOR row FROM 0 TO 7
        FOR col FROM 0 TO 7
            grid[row][col] = 0
        NEXT col
    NEXT row
END PROCEDURE
Sorting algorithms based on nested loops, such as Bubble Sort and Insertion Sort, are O(N^2) algorithms; Quick Sort is also O(N^2) in its worst case (though O(N log(N)) on average).
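As an illustration, here is a minimal Python sketch of Bubble Sort (the list of numbers is a hypothetical example): the two nested loops each run up to N times, giving roughly N^2 comparisons.

def bubble_sort(values):
    # The nested loops give this algorithm its O(N^2) time complexity.
    n = len(values)
    for i in range(n - 1):             # outer loop: up to N-1 passes
        for j in range(n - 1 - i):     # inner loop: compare neighbouring pairs
            if values[j] > values[j + 1]:
                # Swap the pair if it is in the wrong order.
                values[j], values[j + 1] = values[j + 1], values[j]
    return values

print(bubble_sort([7, 3, 9, 1, 5]))    # hypothetical sample data -> [1, 3, 5, 7, 9]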
Exponential Notation: O(2^N)
The exponential notation O(2^N) describes an algorithm whose execution time doubles with each additional element in the data set.
Backtracking algorithms that test every possible “pathway” to solve a problem can fall into this category. Such algorithms become very slow as the data set increases.
Example of an exponential algorithm: an algorithm to list all the possible combinations of binary digits for a given number of digits (bits); with N bits there are 2^N combinations to list.
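Here is a minimal Python sketch of such an algorithm (using the standard itertools module): it lists every combination of N bits, and the number of lines printed doubles each time N increases by one.

from itertools import product

def list_binary_combinations(n_bits):
    # An O(2^N) algorithm: there are 2**n_bits combinations to generate.
    for bits in product("01", repeat=n_bits):
        print("".join(bits))

list_binary_combinations(3)    # prints 8 combinations: 000, 001, ..., 111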
Logarithmic Notation: O(log(N))
A logarithmic algorithm O(log(N)) is an algorithm whose execution time grows more and more slowly as the data set increases, following a logarithmic curve. Logarithmic algorithms are hence quite efficient, especially when processing large sets of data.
A binary search is a typical example of a logarithmic algorithm. In a binary search, half of the remaining data set is discarded after each iteration. This means that an algorithm searching through 2,000,000 values will need just one more iteration than if the data set contained only 1,000,000 values.
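Here is a minimal Python sketch of a binary search over a sorted list (the list of values is a hypothetical example): each iteration halves the portion of the list still to be searched.

def binary_search(sorted_list, value):
    # An O(log(N)) algorithm: each iteration halves the search range.
    low, high = 0, len(sorted_list) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_list[middle] == value:
            return True                  # value found
        elif sorted_list[middle] < value:
            low = middle + 1             # discard the lower half
        else:
            high = middle - 1            # discard the upper half
    return False                         # value not in the list

print(binary_search([2, 5, 8, 12, 16, 23, 38, 56, 72, 91], 23))    # True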
Use a logarithmic algorithm (based on a binary search) to play the game Guess the Number.
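One possible starting point is the minimal Python sketch below. It assumes the secret number is between 1 and 100 and that the player answers each guess with "higher", "lower" or "correct"; the program always guesses the midpoint of the remaining range, halving it on each turn.

# A sketch of a "Guess the Number" solver using a binary search strategy.
# Assumption: the secret number is between 1 and 100.
low, high = 1, 100
while True:
    guess = (low + high) // 2
    answer = input("Is it " + str(guess) + "? (higher/lower/correct) ")
    if answer == "correct":
        print("Found it in a logarithmic number of guesses!")
        break
    elif answer == "higher":
        low = guess + 1      # discard the lower half of the range
    else:
        high = guess - 1     # discard the upper half of the range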