Solution to STA 410/2102 Assignment #2, Spring 2003 - Discussion.

PART I. As the default initial value for mu, I chose the median of the data. The mean might also be a reasonable initial guess, but the median is less influenced by extreme points, which occur more often with the logistic distribution than with a normal distribution (since the tails of the density function go down as exp(-|x|) rather than exp(-x^2)). As seen in the output, this default initial value produced rapid convergence, in only two iterations, on the test data in the assignment handout. The final estimates were in fact very close to this initial value. Tests on other data sets randomly generated from a logistic distribution also converged quickly from this default initial value, as did tests on data sets containing outliers. As seen in the output, however, Newton-Raphson sometimes diverged when other initial values were used. When it did converge, it did so quite rapidly.

PART II. When estimating both mu and omega, I again used the median as the initial value for mu. As the initial value for omega, I used the interquartile range divided by log(9). This would produce the correct value for omega if the sample were large, as can be seen from the formula for generating random values from a logistic distribution. That formula is a monotonically increasing function of u, so it maps the quartiles of u to the quartiles of the distribution. If we plug u=3/4 and u=1/4 into that formula from the assignment handout and take the difference, we get (mu + omega*log((3/4)/(1/4))) - (mu + omega*log((1/4)/(3/4))), which simplifies to omega*log(3) - omega*log(1/3) = omega*log(9). So dividing the interquartile range by log(9) will produce omega. Another possibility would be to use an initial estimate for omega based on the sample standard deviation, but as before, I preferred a method that isn't sensitive to outliers. As seen in the output, these default initial values produced rapid convergence (in only three or four iterations) for the test data from the handout.
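The default starting values described above can be sketched in a few lines. This is only an illustration in Python, not the code used for the assignment; the function names (logistic_quantile, initial_values), the seed, and the parameter values 3 and 2 are made up for the example. The generating formula mu + omega*log(u/(1-u)) is the one implied by the expression in the text.

```python
import math
import random
import statistics

def logistic_quantile(u, mu, omega):
    # Formula for generating a logistic value from a uniform u in (0, 1).
    return mu + omega * math.log(u / (1.0 - u))

def initial_values(data):
    # Median for mu; interquartile range divided by log(9) for omega.
    q1, _, q3 = statistics.quantiles(data, n=4)
    return statistics.median(data), (q3 - q1) / math.log(9.0)

# Exact check of the identity: the theoretical interquartile range is omega*log(9).
mu_true, omega_true = 3.0, 2.0  # illustrative values only
iqr_theory = (logistic_quantile(0.75, mu_true, omega_true)
              - logistic_quantile(0.25, mu_true, omega_true))
omega_from_iqr = iqr_theory / math.log(9.0)

# Large-sample check on simulated data (u kept strictly inside (0, 1)).
rng = random.Random(1)
sample = [logistic_quantile(rng.uniform(1e-12, 1.0 - 1e-12), mu_true, omega_true)
          for _ in range(20000)]
mu0, omega0 = initial_values(sample)
print(omega_from_iqr, mu0, omega0)
```

With a sample this large, mu0 and omega0 should land close to the values used to generate the data, which is all that a starting value needs to do.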
These initial values also worked well for other randomly generated data sets, including data sets with outliers. For many other initial values, Newton-Raphson diverged. In fact, it converged in only two of the twenty tests I did with other initial values.

PART III. I tried finding the mle using the built-in nlm function, giving it only the function for evaluating (minus) the log likelihood and initial values (no derivatives were supplied). I found that nlm converged rapidly when estimating only mu, even for the initial values for which Newton-Raphson failed. When estimating both mu and omega, nlm also converged when given the default initial values described above. For other initial values, it converged in ten of the same twenty tests used for Newton-Raphson, which indicates that it is more robust than Newton-Raphson, but still far from immune to bad starting values.

I compared the time for Newton-Raphson to converge with that for nlm on a large randomly generated data set with 10000 observations. The Newton-Raphson procedure was run for two iterations, since that was what was needed to obtain an accurate answer; nlm required four iterations to achieve nearly the same accuracy. The time required was 0.21 seconds for Newton-Raphson and 0.45 seconds for nlm. The time for Newton-Raphson might be reduced by rewriting the program so that sub-expressions common to the various derivatives are computed only once. The time for nlm might be reduced by supplying it with functions for computing the derivatives, which it can use when they are provided.
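To illustrate the Newton-Raphson behaviour discussed in Part I, and the idea of computing a shared sub-expression only once, here is a minimal Python sketch for estimating mu alone. It is not the assignment's program. It assumes that omega is known and that the density is the one implied by the generating formula, with cdf p = 1/(1+exp(-(x-mu)/omega)); under that assumption the first derivative of the log likelihood with respect to mu is sum(2p-1)/omega and the second derivative is -2*sum(p(1-p))/omega^2. All function names and the simulated data are hypothetical.

```python
import math
import random
import statistics

def sigmoid(z):
    # Logistic cdf, written to avoid overflow for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def derivatives(data, mu, omega):
    # p is computed once per observation and reused for both derivatives.
    first, second = 0.0, 0.0
    for x in data:
        p = sigmoid((x - mu) / omega)
        first += 2.0 * p - 1.0
        second += p * (1.0 - p)
    return first / omega, -2.0 * second / omega ** 2

def newton_mu(data, omega, mu0=None, tol=1e-8, max_iter=50):
    # Default start is the median; returns (estimate, iterations used).
    mu = statistics.median(data) if mu0 is None else mu0
    for iteration in range(1, max_iter + 1):
        g, h = derivatives(data, mu, omega)
        if h == 0.0 or not math.isfinite(mu):
            raise ArithmeticError("Newton-Raphson diverged")
        step = g / h
        mu -= step
        if abs(step) < tol:
            return mu, iteration
    raise ArithmeticError("no convergence within max_iter iterations")

# Simulated data with mu = 0 and omega = 1 (illustrative only).
rng = random.Random(0)
data = [math.log(u / (1.0 - u))
        for u in (rng.uniform(1e-9, 1.0 - 1e-9) for _ in range(500))]
mu_hat, iterations = newton_mu(data, omega=1.0)
print(mu_hat, iterations)
```

Starting far from the data (for example mu0=30 here) makes the second derivative nearly zero, so the first step is enormous and the function reports divergence, which is the kind of failure described for poor initial values.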