SICP Exercise 2.72
Question
Consider the encoding procedure that you designed in Exercise 2.68. What is the order of growth in the number of steps needed to encode a symbol? Be sure to include the number of steps needed to search the symbol list at each node encountered. To answer this question in general is difficult. Consider the special case where the relative frequencies of the \(n\) symbols are as described in Exercise 2.71, and give the order of growth (as a function of \(n\)) of the number of steps needed to encode the most frequent and least frequent symbols in the alphabet.
Answer
Let us start by reviewing what procedures are being called by encode-symbol
.
(define (encode-symbol symbol tree)
(let ((left (left-branch tree))
(right (right-branch tree)))
(cond ((leaf? tree) '())
((element-of-set? symbol (symbols left))
(cons '0 (encode-symbol symbol left)))
((element-of-set? symbol (symbols right))
(cons '1 (encode-symbol symbol right))))))
First, we have left-branch
and right-branch
. These are simply car
and
cadr
and have an order of growth of \(\Theta(1)\).
Next, we call leaf?
, which is also just a car
with \(\Theta(1)\). After this,
we call element-of-set?
regardless of how the conditional statement
evaluates. This is a recursive function with \(\Theta(n)\).
This function is called recursively on either the right or left half of the tree, giving it a complexity of \(\Theta(\log n)\) for a balanced tree.
But, of course, our trees in this case are not balanced. So for the highest frequency character, the function will be called only once, giving us a total complexity of \(\Theta(n)\), whereas for the least common, it will be called \(n\) times, giving us \(\Theta(n^2)\).