Discussion for Assignment 2, Part II.

From the plots, and from the numbers, we can see that when the uniform
distribution is used for symbols and p=0, the two methods perform
equally well.  This is because neither method will achieve anything in
this context, in which there is neither dependence nor non-uniformity
in probabilities.  The position of every symbol looked up will be
uniformly distributed between 1 and 100, with average of 50.5.

When symbols are chosen uniformly, but recent symbols are likely to be
looked up again (p=0.5), the move-to-front scheme works much better,
since half the symbols are found near the front of the list.  The
average time to find a symbol is almost cut in half.  In contrast, the
move-one-forward scheme does only slightly better than it did with
p=0, since it only slowly moves symbols to the front.  (One can see
symbols slowly moving in the plot, but they don't get too far before
they drop out of the last 10, and aren't accessed much any more.)

When the Zipf's law distribution is used with p=0, both the
move-to-front and the move-one-forward scheme work well, since both
will eventually move the common symbols to the front of the list.  The
performance of the move-one-forward scheme eventually becomes better
than that of the move-to-front scheme (look at the mean over the last
4000 lookups to see this).  This is because the move-one-forward
scheme can reach a nearly stable state in which symbols are arranged
according to probability, whereas the move-to-front scheme will
occasionally move uncommon symbol to the front, where they don't
really belong.  However, the move-to-front scheme can be better at
first, since it can move the most common symbols to the front more
quickly.  In the plots, you can see that the move-one-forward scheme
is only slowly moving symbols forward during the initial lookups.

When p=0.5 with the Zipf's law distribution, the move-to-front scheme
is better, since it can quickly move the recently accessed symbols to
the front.  

The three repetitions of the runs show some substantial variability,
particularly for the move-one-forward scheme.  For example, with
Zipf's law and p=0, I got means for the 8000 lookups of 23.2, 27.1,
and 26.3.  This is greater than you would expect from an average of
8000 numbers with standard deviation of about 26, if the numbers were
independent.  (You'd expect a standard deviation for the averages of
26/sqrt(8000) = 0.29.)  The explanation is that the positions at which
the symbol was found for the 8000 lookups are not independent.  For
the move-one-forward scheme, the initial ordering of symbols has a big
effect, since it can move them to the front only slowly, so this
initial order can affect all 8000 numbers.

A scheme that moved symbols forward more than one position, but not
all the way to the front, might combine most of the advantages of both
schemes, since it could reach a nearly optimal stable state, like the
move-one-forward scheme, but do so more quickly.