SICP Exercise 2.72

Question

Consider the encoding procedure that you designed in Exercise 2.68. What is the order of growth in the number of steps needed to encode a symbol? Be sure to include the number of steps needed to search the symbol list at each node encountered. To answer this question in general is difficult. Consider the special case where the relative frequencies of the \(n\) symbols are as described in Exercise 2.71, and give the order of growth (as a function of \(n\)) of the number of steps needed to encode the most frequent and least frequent symbols in the alphabet.

Answer

Let us start by reviewing what procedures are being called by encode-symbol.

(define (encode-symbol symbol tree)
  (let ((left (left-branch tree))
        (right (right-branch tree)))
    (cond ((leaf? tree) '())
          ((element-of-set? symbol (symbols left))
           (cons '0 (encode-symbol symbol left)))
          ((element-of-set? symbol (symbols right))
           (cons '1 (encode-symbol symbol right))))))

First, we have left-branch and right-branch. These are simply car and cadr and have an order of growth of \(\Theta(1)\).

Next, we call leaf?, which is also just a car with \(\Theta(1)\). After this, we call element-of-set? regardless of how the conditional statement evaluates. This is a recursive function with \(\Theta(n)\).

This function is called recursively on either the right or left half of the tree, giving it a complexity of \(\Theta(\log n)\) for a balanced tree.

But, of course, our trees in this case are not balanced. So for the highest frequency character, the function will be called only once, giving us a total complexity of \(\Theta(n)\), whereas for the least common, it will be called \(n\) times, giving us \(\Theta(n^2)\).