‘Big O’ Notation + Analysis
Written by: Jeffrey Ng — Prospective Data Scientist
Background
Big O was a term first coined in mathematics by Paul Bachmann in 1894, in the second volume of his text Analytische Zahlentheorie (Analytic Number Theory), and was adopted in 1909 by the number theorist Edmund Landau for his own writings. For that reason, Big O and its relatives are also called Landau symbols. Through the 1950s the notation was used mainly in number-theory research. By the 1970s, Big O had been popularized in computer science by Donald Knuth (born January 10, 1938), the Stanford professor who made vast contributions to the field through his multi-volume text "The Art of Computer Programming," discoveries in theoretical computer science, various computer programming systems, and pioneering work on the mathematical analysis of algorithms.
Big O/Big Theta/Big Omega/algorithmic efficiency/time complexity are names that refer to how the runtime of a piece of code grows with the size of its input. The runtime is plotted against the input size. Obviously, as the size of the input increases, so does the amount of time (runtime) it takes to run the program; Big O gives us a function that captures the relationship between runtime and size of data. Big O does not mean, "oh, I have a faster computer, therefore my runtime would be less!"
That mode of logic misses the point: the notation describes growth relative to input size, not absolute speed on a particular machine.
Big O and Big Theta notation describe, as a function, the runtime with respect to the size of the input. This (x, y) relationship can be, among other shapes, constant, linear, or quadratic.
Linear relationships describe a rate of change that is consistent: as the size of the input increases by a certain amount, the runtime increases by a proportional amount. An example is O(n).
Constant relationships describe a rate of change that is ‘constant’ or unchanging, meaning the size of the input has NO effect on the runtime. The runtime stays the same whether the input has one element or 100 elements. An example of this is O(1).
The quadratic relationship is best described as non-linear, with curvature: as x increases, y increases by a multiplied amount. An example of this is O(n²).
There are several factors that influence the runtime of your code, which I shall explain below, followed by more about the function notation used to descriptively and mathematically quantify the efficiency of your code.
Programmers often talk about the Best Case (most efficient) code, the Worst Case (least efficient), and the Expected Case (somewhere in between!).
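To make these cases concrete, consider searching a list; a minimal sketch (the function name linear_search is mine, not a standard library call):

```python
def linear_search(items, target):
    """Return the index of target in items, or -1 if it is absent."""
    for i, value in enumerate(items):
        if value == target:
            return i  # Best Case: target sits at the front, one comparison
    return -1         # Worst Case: target is absent, n comparisons

data = [4, 8, 15, 16, 23, 42]
print(linear_search(data, 4))   # Best Case: found on the first look
print(linear_search(data, 99))  # Worst Case: scanned the entire list
```

In the Expected Case, the target turns up roughly halfway through the list.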
The several factors that affect Big O are:
- The amount of memory your code takes up, i.e., how much space it needs for storage.
- The multiplication or addition of your functions in your code.
- The resulting downsize of functions as the code progresses.
1. Memory (Space Complexity)
The space complexity is of utmost importance because more memory in use means more operations for the computer to handle. This may be the first dimension of Big O.
2. Multiplication / Addition
Any operations that square, multiply, raise to a higher power, take a factorial, etc. will make the code slower than addition or a step-by-step program. Nested loops and recursive calls fall into this multiplying category, whereas straight-line code that makes a single pass is more linear, as the sketch below shows.
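A rough illustration of the contrast (the function names are my own):

```python
def total(items):
    # One pass over the input: linear, O(n)
    result = 0
    for x in items:
        result += x
    return result

def count_pairs(items):
    # A loop inside a loop multiplies the work: quadratic, O(n^2)
    pairs = 0
    for x in items:
        for y in items:
            pairs += 1  # this line runs n * n times
    return pairs
```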
3. Downsizing of Your Code
Do your programming methods, functions, and arrays tend to get reduced and smaller as the program runs? This often refers to the tightness, or boundedness, of your programming, and it also leads to faster code.
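A classic case of code that downsizes its own problem is binary search, which discards half of the remaining data on every step and so runs in O(log n); a minimal sketch:

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items (ascending), or -1."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1  # discard the lower half
        else:
            hi = mid - 1  # discard the upper half
    return -1
```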
BIG O Notation
The speed of your written program is quantified through Big O/Big Theta/Big Omega notation, a function that describes the number of items in your code versus the runtime. These functions use ‘n’ as the main variable (or any other letter you like), wrapped in O with a mathematical function applied to ‘n.’ Examples, from fastest to slowest, are: O(1), O(log(n)), O(n), O(n²), O(2^n), O(n!).
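To see why that ordering holds, we can plot the growth curves side by side; a quick sketch using matplotlib (my choice of library, with O(n!) left out so the chart stays readable):

```python
import math
import matplotlib.pyplot as plt

ns = list(range(1, 21))
curves = {
    "O(1)":     [1 for n in ns],
    "O(log n)": [math.log2(n) for n in ns],
    "O(n)":     [n for n in ns],
    "O(n^2)":   [n ** 2 for n in ns],
    "O(2^n)":   [2 ** n for n in ns],
}

for label, ys in curves.items():
    plt.plot(ns, ys, label=label)

plt.xlabel("input size n")
plt.ylabel("operations")
plt.ylim(0, 400)  # clip the explosive curves so the slow ones stay visible
plt.legend()
plt.show()
```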

There are some rules for determining your notation, such as O(n), that let you eliminate certain coefficients and negligible constants.
An example of O(1), which means that the size of your data has no effect on runtime, would be indexing into a list.
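A minimal sketch (the helper name get_first is mine, not a standard function):

```python
def get_first(items):
    # Touches exactly one element, no matter how long the list is: O(1)
    return items[0]

print(get_first([7, 8, 9]))           # 7
print(get_first(list(range(10**6))))  # 0 -- still a single operation
```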

Another example I came across in my research: let’s say we used carrier pigeons to deliver our data to New Jersey, with the data stored on a 1 TB USB drive. In this case, no matter how much or how little data we store on the USB drive, it would take the carrier pigeon the same amount of time to fly that data to New Jersey!

In order to see how data can influence our runtime, we can .append elements to the input and watch how the runtime grows. If we plotted data size against runtime, the relationship would be (1) linear, (2) constant, or (3) quadratic, and it would look similar to one of the curves plotted above.
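A rough way to measure this with Python’s time module (the choice of indexing and sum as the code under test is mine):

```python
import time

def time_call(fn, arg):
    start = time.perf_counter()
    fn(arg)
    return time.perf_counter() - start

data = []
for size in (10_000, 100_000, 1_000_000):
    while len(data) < size:
        data.append(len(data))  # grow the input with .append
    constant = time_call(lambda xs: xs[0], data)  # O(1): index one element
    linear = time_call(sum, data)                 # O(n): touch every element
    print(f"n = {size:>9}: O(1) ~ {constant:.7f} s, O(n) ~ {linear:.7f} s")
```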
Big O notation can take infinitely many forms. As data scientists, we look to simplify these functions down to a ‘backbone’ function that usually matches one of the function identities above. There are many ways to simplify these functions, but there are general strategies to adhere to.
Rule of thumb: look at the part of the function that is the most ‘beefy.’ Then drop the coefficients attached to the ‘beef’ and get rid of the constants added or subtracted at the edges of the function. For example, a runtime of 3n² + 5n + 20 simplifies to O(n²): n² is the beefy part, the coefficient 3 is dropped, and 5n + 20 are negligible by comparison. The resulting ‘backbone’ of the function describes its runtime in Big O notation, and the relationship can then be called constant, linear, or quadratic, which tells us the nature of your coding efficiency.
Another way to determine Big Theta efficiency is to plot your function’s runtime against the size of your data, as in the timing sketch above: we want to see how feeding more input to the function affects its runtime.
In closing, Big O and its notation are an important and popular point of discussion in the computer science field. It is a basic notion and mantra of computer coding that beginners as well as experts need to keep in mind. Having a basic understanding from the very beginning helps programmers write better, more efficient code. In a world of powerful chip processors, supercomputers, super-complex algorithms, and vast data, keeping Big O in mind will save time, money, and ourselves.