AP CS
-
Big-O notation essentially
allows you to compare the time of execution
for various computer processes as they tie up
the use of the microprocessor. Since various
microprocessors have various speeds and
background running processes also tie up
computer time, there is no way of
guaranteeing that a specific algorithm will
take exactly "x" seconds to
execute. Big-O allow you to discuss the
comparative time usage relationships among
various algorithms.
The execution of a single statement would
take time "1". The execution of
"n" statements would take time
"n". Bubble sort is slower than
Quick sort. But why? A linear search of an
ordered list is slower than a binary search.
Why?
By looking at the general structure of the
algorithm, certain processes can be
identified that take computer time. For
example, in a linear search of an ordered
list, the "worst" that can happen
is that the "key" you seek is the
last element. Therefore, we say that a linear
search is of order O(n) = n. That is, for a
list of "n" elements, the execution
speed is time "n". Obviously, for a
slow computer, "n" time is longer
than on a fast computer. But it is still
"n" time. Time is always looked at
as the maximum amount of time that it might
take for the algorithm to process.
For a binary search of the same list, since
we divide the list in half, in quarters, in
eighths, etc. until we find the
"key", a list of 100 (for example)
elements would take no more than 7
executions. Why 7? The answer is that 2^6
< 100 < 2^7 . Or written another way, 6
< log(100) < 7 where log is in base
2. And why base 2? Because
"binary" (halves, quarters,
eighths, etc. are powers of 2. Now referring
to Big-O, we would say that the Order of
Magnitude of speed for a binary search is O(n)
= log(n) (again in base 2). That is, for an
ordered list of "n" elements,
execution time for a binary search is log(n)
units of time.
For a loop that traverses from " 1 to n
" or " 1 to n-1 " the
execution time would be considered O(n) = n .
Note that "n" is viewed to be large
and that the time difference between an n of
10000 or n-1 of 9999 is insignificant. Also,
if loops are nested, the time is the product
of each of the loop times. For example, if
the outer loop is " 1 to n-1 " and
the inner loop is " 1 to n-1 " ,
the product is (n-1)(n-1) or n2 -
2n + 1. Now for a sufficiently large n, the
values 2n and 1 are considered insignificant
when compared to n2. Therefore,
for this example, the Order of Magnitude is
O(n) = n^2 . That is, for a list of
"n" items, the time of execution is
considered to be n2 units of time.
Now even though an Insertion Sort is (in real
time on YOUR machine) quicker than a
Selection Sort, which is quicker than a
Bubble Sort, the Big-O time for each is O(n)
= n2 because of the actual
structure of the algorithm. Yes I know that
two nested FOR loops must run to complete
execution whereas a REPEAT or a WHILE used in
loops "may" permit early
termination of the loops, but Big-O considers
worst case execution time, and Big-O is not
really an absolute, but only a general guide
for studying those time relationships.
When considering a Quick sort, you can see
that the algorithm is a linear process
("it" must eventually happen to
each array element) nested with a binary
process (subdividing the list similar to a
binary search). The product of these yields a
Big-O time O(n) = n*(log n) (again base 2).
To advance this, "if it takes 10 seconds
for an n^2 algorithm to process 1000 numbers,
how long will it take for the same algorithm
to process 5000 numbers?" We would say
that "the time it takes for execution of
an algorithm is directly proportional to the
order relationship." As an equation, we
could write this as follows:
T(n) = K * O ( f(n) )
where T(n) is the Time for n numbers
K
is the constant of proportionality
O
( f(n) ) is the Big-O for a specific function
f(n)
If we plug-in the values
above it will look like the following:
10 = K * 10002
K = 10 / 10002
Now that we know the constant, we use the
formula again the determine the time for the
second set of numbers.
T(n) = (10 / 10002 ) * (50002)
or T(n) = (10 / 10002 ) * ( 52
* 10002)
or T(n) = 10 * 52
or T(n) = 250 seconds
See if the new set if 5 times as big (1000 *
5 = 5000 ) and the algorithm is an n2
algorithm, the new time is the old time
multiplied by this value of 52 ---
10*52 = 250 seconds. Think: Can
you work situations like "100 seconds
for 10000 numbers under an n2
algorithm - how much time for 1000
numbers".
Discussion: Think Big-O graphically to
compare in a visual way that for small "n's",
the kind of sort (for example) might be
anything, but as "n" grows,
inefficient algorithms like Bubble Sort take
a significantly larger amount of time than
Quick Sort (or some other efficient sort).
Graphs of these relationships will open your
understanding.
Big-O discussions are now possible about
Merge Sorting, or even Hash Coded Searching.
For a more complete discussion you might try
the following sources:
"Fundamentals of Data Structures in
Pascal", Horowitz & Sahni, Computer
Science Press, pp 20-23.
"Introduction to Data Structures and
Algorithm Analysis with Pascal", Naps
& Pothering, West Publishing, pp15-21.
"Data Structures via C++ - Objects by
Evolution", A. Michael Berman, Oxford
University Press
|