Re: Mergesort: why's it efficient?

Subject: Re: Mergesort: why's it efficient?
From: Erik Naggum <nobody@naggum.no>
Date: 1996/12/06
Newsgroups: comp.lang.c++,comp.sys.mac.programmer.help,comp.lang.lisp
Message-ID: <3058824781104480@naggum.no>


* Dann Corbit
| If the problem is tiny, it may not matter what sort of algorithm, since
| all will finish in a blink.  If the problem is huge, then it matters
| enormously, because for some problem size, O(N*log(N)) will always be
| better than O(n^2).

it appears that many people don't undertand the O notation.  O indicates
complexity, not execution speed.  the execution speed depends on a constant
factor that is insignificant compared to the variable factor as it grows
without bound.  a function of complexity O(n log n) may have a constant
factor (k) that makes execution time (or other resources) _higher_ than a
function of complexity O(n^2) with a much smaller constant (c) as long as
the relation

    k       n
   --- >> -----
    c     log n

holds.  however, your statement that O(n log n) will _always_ be better
than O(n^2) does not hold, precisely because of the possibility of the
above relation.

since Andrew Koenig surprised many by limiting n and then arguing how
insignificant log n is compared to n, as if this is anything but obvious,
others have followed in his muddled footsteps and also conveniently forget
the constant factor or the actual meaning of the O notation.  I'm frankly
amazed that this is possible for even half-educated computer scientists.

the O notation does _not_ indicate execution time.  the O notation is the
_simplest_ function that yields an upper bound on the _relation_ between
the length of the input and the execution time (or space).

I recommend that Andrew Koenig and others who have been confused by him
read Alfred V. Aho and Jeffrey D. Ullman's excellent book "Foundations of
Computer Science" (Computer Science Press, 1992; ISBN 0-7167-8233-2), which
has grown out of the course CS109 -- Introduction to Computer Science at
Stanford University.  the topic of chapter 3 is the running time of
programs.  section 3.10 specifically analyses merge sort.  it should be
instructive to more than the target audience of students, and any
programmer worth his salt _should_ be able to understand this chapter with
little effort.  in particular, their treatment of constants and of the
simplification involved in the O notation is fundamental knowledge and
_should_ not come as a surprise to anyone who has been following these
discussions.  _please_ take the time to look up this book.  (Dr. Aho and
Dr. Ullman have both worked at Bell Labs, which is why I chose this
textbook over many other candidates.  this will hopefully make it harder
for Andrew Koenig to ignore them, as he is wont to do for most everything
else shown to correct or contradict his inaccurate statements.)

#\Erik
-- 
Please address private replies, only, to "erik".  Junk mail, spam,
stupid flames, courtesy copies, etc, should be sent to "nobody".