Solution to STA 410/2102 Assignment #2, Spring 2003 - Discussion.


PART I.

As the default initial value for mu, I chose the median of the data.
The mean might also be a reasonable initial guess, but the median is
less influenced by extreme points, which occur more often with the
logistic distribution than with a normal distribution (since the tails
of the density function decay like exp(-|x|) rather than exp(-x^2/2)).

As seen in the output, this default initial value produced rapid
convergence, in only two iterations, on the test data in the
assignment handout.  The final estimates were in fact very close to
this initial value.  Other tests on data sets randomly generated from
a logistic distribution also converged quickly from this default
initial value, as did tests with data sets containing outliers.

As seen in the output, however, Newton-Raphson sometimes diverged when
other initial values were used.  When it did converge, it did so quite
rapidly.
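
The iteration for mu can be sketched as follows.  This is a minimal
illustration in Python, not the program that produced the output
discussed above.  It assumes the logistic density
exp(-z)/(omega*(1+exp(-z))^2) with z = (x-mu)/omega, which is
consistent with the generating formula quoted in Part II, and it
assumes omega is held fixed at a known value.  The names rlogistic and
newton_mu, the simulated data, and the stopping rule are all
hypothetical.

```python
import math
import random
import statistics

def rlogistic(n, mu, omega, rng):
    # Inverse-CDF sampling: mu + omega*log(u/(1-u)) with u uniform on (0,1).
    out = []
    while len(out) < n:
        u = rng.random()
        if u > 0.0:  # random() can return exactly 0, which log cannot take
            out.append(mu + omega * math.log(u / (1.0 - u)))
    return out

def newton_mu(x, omega=1.0, mu0=None, max_iter=50, tol=1e-10):
    """Newton-Raphson for mu with omega fixed.

    Returns (mu, iterations used, converged flag).
    """
    mu = statistics.median(x) if mu0 is None else mu0
    for it in range(1, max_iter + 1):
        # With z = (x-mu)/omega, the log likelihood has first derivative
        # sum(tanh(z/2))/omega and second derivative
        # -sum(1 - tanh(z/2)^2)/(2*omega^2) with respect to mu.
        t = [math.tanh((xi - mu) / (2.0 * omega)) for xi in x]
        score = sum(t) / omega
        curv = sum(1.0 - ti * ti for ti in t) / (2.0 * omega * omega)
        if curv <= 0.0:
            # Every point is so far out in the tail that the curvature
            # has vanished numerically; treat this as divergence.
            return mu, it, False
        step = score / curv
        mu += step
        if not math.isfinite(mu):
            return mu, it, False
        if abs(step) < tol:
            return mu, it, True
    return mu, max_iter, False

rng = random.Random(1)
x = rlogistic(400, 3.0, 1.0, rng)
print(newton_mu(x))            # default start: the sample median
print(newton_mu(x, mu0=25.0))  # a start far out in the tail
```

The second call shows one way divergence can arise: far from the data
the curvature is nearly zero, so the Newton step is enormous and the
next iterate lands where the curvature is numerically zero.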


PART II.

When estimating both mu and omega, I again used the median as the
initial value for mu.  As the initial value for omega, I used the
interquartile range divided by log(9).  This would produce the correct
value for omega if the sample were large, as can be seen from the
formula for generating random values from a logistic distribution.
Since that formula is a monotonically increasing function of u, its
value at u is the quantile of the distribution at probability u.  If
we plug u=3/4 and u=1/4 into that formula from the assignment handout,
and take the difference, we get
  
    (mu + omega*log((3/4)/(1/4))) - (mu + omega*log((1/4)/(3/4)))

which simplifies to omega*log(9).  So dividing the interquartile range
of the distribution by log(9) gives omega, and in a large sample the
sample interquartile range will be close to it.  Another possibility
would be to use an initial estimate for omega based on the sample
standard deviation, but as before, I preferred a method that isn't
sensitive to outliers.
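
These default initial values can be sketched as follows, in Python
for illustration only.  The function names, the simulated sample, and
the single artificial outlier are hypothetical.  The comparison
estimate uses the fact that a logistic distribution with scale omega
has standard deviation omega*pi/sqrt(3).

```python
import math
import random
import statistics

def initial_values(x):
    # Median for mu; interquartile range divided by log(9) for omega.
    q1, _, q3 = statistics.quantiles(x, n=4)
    return statistics.median(x), (q3 - q1) / math.log(9.0)

def omega_from_sd(x):
    # Alternative start for omega based on the sample standard deviation.
    return statistics.stdev(x) * math.sqrt(3.0) / math.pi

# Simulated sample with (assumed) true values mu = 5 and omega = 2.
rng = random.Random(2)
sample = []
while len(sample) < 20000:
    u = rng.random()
    if u > 0.0:
        sample.append(5.0 + 2.0 * math.log(u / (1.0 - u)))

mu0, omega0 = initial_values(sample)
print(mu0, omega0, omega_from_sd(sample))

# One wild point barely moves the quartile-based value, but it
# inflates the value based on the standard deviation.
contaminated = sample + [1.0e4]
print(initial_values(contaminated)[1], omega_from_sd(contaminated))
```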

As seen in the output, these default initial values produced rapid
convergence (in only three or four iterations) for the test data from
the handout.  These initial values also worked well for other randomly
generated data sets, including ones containing outliers.

For many other initial values, Newton-Raphson diverged.  In fact, it
converged for only two of the twenty tests I did.
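
A two-parameter Newton-Raphson iteration of this kind can be sketched
as follows.  Again this is a Python illustration rather than the
actual program, under the same assumed logistic density with
z = (x-mu)/omega.  The function name, the simulated data, the stopping
rule, and the particular bad starting point are hypothetical.

```python
import math
import random
import statistics

def newton_mu_omega(x, mu0, omega0, max_iter=50, tol=1e-8):
    """Newton-Raphson for (mu, omega).

    Returns (mu, omega, iterations used, converged flag).
    """
    mu, omega = mu0, omega0
    for it in range(1, max_iter + 1):
        g_mu = g_om = h_mm = h_mo = h_oo = 0.0
        for xi in x:
            z = (xi - mu) / omega
            t = math.tanh(z / 2.0)   # shared by every derivative
            h = (1.0 - t * t) / 2.0  # derivative of t with respect to z
            g_mu += t                              # gradient times omega
            g_om += z * t - 1.0
            h_mm -= h                              # Hessian times omega^2
            h_mo -= t + z * h
            h_oo += 1.0 - 2.0 * z * t - z * z * h
        det = h_mm * h_oo - h_mo * h_mo
        if not math.isfinite(det) or det == 0.0:
            return mu, omega, it, False
        # The gradient carries a factor 1/omega and the Hessian a factor
        # 1/omega^2, so the step (inverse Hessian times gradient)
        # carries a net factor omega.
        d_mu = omega * (h_oo * g_mu - h_mo * g_om) / det
        d_om = omega * (h_mm * g_om - h_mo * g_mu) / det
        mu, omega = mu - d_mu, omega - d_om
        if not (math.isfinite(mu) and math.isfinite(omega)) or omega <= 0.0:
            return mu, omega, it, False
        if max(abs(d_mu), abs(d_om)) < tol:
            return mu, omega, it, True
    return mu, omega, max_iter, False

# Simulated data with (assumed) true values mu = 3 and omega = 2.
rng = random.Random(3)
x = []
while len(x) < 1000:
    u = rng.random()
    if u > 0.0:
        x.append(3.0 + 2.0 * math.log(u / (1.0 - u)))

# Default start: median, and interquartile range divided by log(9).
q1, _, q3 = statistics.quantiles(x, n=4)
good = newton_mu_omega(x, statistics.median(x), (q3 - q1) / math.log(9.0))
bad = newton_mu_omega(x, 60.0, 2.0)  # mu started far from the data
print(good)
print(bad)
```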


PART III.

I tried finding the mle using the built-in nlm function, giving it
only the function for evaluating (minus) the log likelihood and
initial values (no derivatives were supplied).

I found that nlm converged rapidly when estimating only mu, even for
the initial values for which Newton-Raphson failed.  When estimating
both mu and omega, nlm also converged when given the default initial
values described above.  For other initial values, it converged in ten
of the same twenty tests used for Newton-Raphson, which indicates that
it is more robust than Newton-Raphson, but still far from being
totally immune to bad starting values.

I compared the time for Newton-Raphson to converge with that for nlm
on a large randomly generated data set with 10000 observations.  The
Newton-Raphson procedure was run for two iterations, since that was
what was needed to obtain an accurate answer; nlm required four
iterations to achieve nearly the same accuracy.  The time required was
0.21 seconds for Newton-Raphson and 0.45 seconds for nlm.  The time
for Newton-Raphson might be reduced by rewriting the program so as to
compute some of the sub-expressions common to the various derivatives
only once.  The time for nlm might be reduced by supplying it with
functions for computing the derivatives, which it can use when they
are provided.
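
The two ingredients mentioned here, a function for minus the log
likelihood and a function for its derivatives, can be sketched as
follows.  This is a Python illustration under the same assumed
logistic density, and nlm itself is not called.  The function names,
the simulated data, and the point at which the derivatives are checked
are hypothetical.  The analytic gradient is compared with central
finite differences, which is a rough stand-in for what a minimizer
must do when no derivatives are supplied.

```python
import math
import random

def minus_loglik(theta, x):
    mu, omega = theta
    total = 0.0
    for xi in x:
        a = abs(xi - mu) / omega
        # |z| + 2*log(1+exp(-|z|)) equals z + 2*log(1+exp(-z)),
        # but written this way exp() cannot overflow.
        total += math.log(omega) + a + 2.0 * math.log1p(math.exp(-a))
    return total

def minus_loglik_grad(theta, x):
    mu, omega = theta
    g_mu = g_om = 0.0
    for xi in x:
        z = (xi - mu) / omega
        t = math.tanh(z / 2.0)  # computed once, used in both components
        g_mu -= t / omega
        g_om += (1.0 - z * t) / omega
    return g_mu, g_om

def numeric_grad(f, theta, x, eps=1e-6):
    # Central differences: two function evaluations per parameter.
    grad = []
    for i in range(len(theta)):
        up = list(theta)
        dn = list(theta)
        up[i] += eps
        dn[i] -= eps
        grad.append((f(up, x) - f(dn, x)) / (2.0 * eps))
    return tuple(grad)

rng = random.Random(4)
x = []
while len(x) < 100:
    u = rng.random()
    if u > 0.0:
        x.append(math.log(u / (1.0 - u)))

theta = (0.7, 1.8)
analytic = minus_loglik_grad(theta, x)
numeric = numeric_grad(minus_loglik, theta, x)
print(analytic)
print(numeric)
```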