PURE STATS is a handy calculator for doing pure statistics: statistics without numbers. It is a tiny (36K) applet that you can access from the web. With it, you can compute averages, moments, cumulants, unbiased estimates, k-statistics, asymptotic expansions of MLEs, deviances, their expected values and much more, all without numbers. The examples illustrate how these computations may be made with a little input and a few button pushes.
Access it with PureStats. If the display does not seem appropriate, follow these instructions.
PURE STATS is written in Java 2. To use it you need a Java 2 applet viewer or a Java 2 enabled browser. It has been tested on Safari, Netscape and Internet Explorer.
The calculator is designed to compute many of the expressions associated with random variables. In particular, it computes moments, cumulants and unbiased estimates of these.
We consider a sample of n observations of these random variables. The properties of averages assume that the n observations on random variables are independent. The variables themselves are not assumed to be independent. Thus, for example, the cumulants of the sample correlation coefficient may be computed.
The calculator handles random variables. You can give them meaningful names like height and weight or short generic identifiers like X and Y. We will use the latter to shorten the discussion below. To keep things simple and accessible to all, we use the Canadian alphabet. There will be no confusing subscripts or Greek here. There are also no numbers.
The expectation of a random variable is denoted by E. Thus E[X] denotes the expectation of the random variable X, E[X Y] denotes the expectation of the product X Y.
The average of a sample of size n of the random variable X is denoted by A[X]. The average of a sample of n products X Y is denoted by A[X Y]. Note that A[XY] denotes the average of a single random variable called XY.
Cumulants are convenient polynomials of expectations. They are denoted by K. For example, K[X, X] denotes the second cumulant of the random variable X, and K[X, Y] denotes the covariance of X and Y. Generalized cumulants are written with products inside the arguments, for example K[X Y, Z]. For a further discussion of generalized cumulants see McCullagh (1987).
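To make "polynomials of expectations" concrete, the standard identities for the cumulants mentioned above can be written out in the same notation. This is a sketch of well-known results (see McCullagh, 1987), not output copied from the applet:

```latex
\begin{align*}
K[X, X]    &= E[X X] - E[X]^2 \\
K[X, Y]    &= E[X Y] - E[X]\,E[Y] \\
K[X, X, X] &= E[X X X] - 3\,E[X X]\,E[X] + 2\,E[X]^3 \\
K[X Y, Z]  &= E[X Y Z] - E[X Y]\,E[Z] \\
           &= K[X, Y, Z] + E[X]\,K[Y, Z] + E[Y]\,K[X, Z]
\end{align*}
```

The last two lines show how a generalized cumulant, here the covariance of the product X Y with Z, reduces to simple cumulants.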
To simplify output we use the letter u (an inverted n) to represent 1/n. This greatly reduces the space required for output. In the same spirit, u1 denotes 1/(n-1), u2 denotes 1/(n-2), etc.
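As an illustration of the u notation, the expectation of a product of two averages satisfies E[ A[X] A[Y] ] = u E[X Y] + (1 - u) E[X] E[Y] when the n observations are independent. The following Python sketch checks this by brute force with exact fractions. The joint distribution and all function names are hypothetical, chosen only for the illustration; none of this is part of the applet.

```python
from fractions import Fraction
from itertools import product

# Hypothetical joint distribution of one observation (X, Y).
# X and Y are dependent, as the calculator allows.
JOINT = {(0, 1): Fraction(1, 2), (1, 3): Fraction(1, 3), (2, 0): Fraction(1, 6)}


def expect(f):
    """E[f(X, Y)] for a single observation."""
    return sum(p * f(x, y) for (x, y), p in JOINT.items())


def expect_product_of_averages(n):
    """E[ A[X] A[Y] ] by enumerating every sample of n independent observations."""
    total = Fraction(0)
    for sample in product(JOINT.items(), repeat=n):
        prob = Fraction(1)
        sum_x = sum_y = 0
        for (x, y), p in sample:
            prob *= p
            sum_x += x
            sum_y += y
        total += prob * Fraction(sum_x, n) * Fraction(sum_y, n)
    return total


def identity_rhs(n):
    """u E[X Y] + (1 - u) E[X] E[Y], with u = 1/n."""
    u = Fraction(1, n)
    e_xy = expect(lambda x, y: x * y)
    e_x = expect(lambda x, y: x)
    e_y = expect(lambda x, y: y)
    return u * e_xy + (1 - u) * e_x * e_y
```

For any small n, expect_product_of_averages(n) and identity_rhs(n) should agree exactly, since both are computed with rational arithmetic.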
Simple identities relate cumulants and expectations. Similar identities relate averages and expectations. These identities are used to compute transformations between types of expressions. These have the form illustrated by EfromK which produces an expression in terms of expectations from an expression in terms of cumulants. Now the order of E and K in EfromK may seem odd. It is handy when composing transformations as in KfromE[ EfromK[ K[X Y, Z] ] ] which expresses a generalized cumulant K[X Y, Z] in terms of simple cumulants K[X,Y] etc.
EfromK[ K[X, X]]
KfromE[ E[X] E[Y] ]
e.g. EfromA[ A[X] A[Y] ]
AfromE[ EfromK[ K[X, X, X] ] ]
which computes an unbiased estimate of the third cumulant of X
e.g. AE[4, (A[X])^6]
computes the asymptotic expansion of the sixth power of the average A[X]. The expansion is computed to n^(-4/2).
mle = Root[ l, 1, AE[4, theta] ]
computes the expansion of the mle, the root of the average first derivative of the likelihood l, to order n^(-4/2)
Taylor[ l, 0, mle]
computes A[ l[mle] ], the average components of the deviance
The following example shows input and output from the applet window. The example first computes m3, the third cumulant of X, expressed in terms of expectations. In the second step, the unbiased estimate of the cumulant is computed and expressed, as noted above, in terms of products of averages multiplied by factors including u, u1, ... (This estimate is a k-statistic.) The output below was cut and pasted from the applet's input and output windows. The output contains lots of line breaks. These are handy if not pretty.
m3 = EfromK[ K[ X, X, X] ]
Input m3 = EfromK[ K[ X, X, X] ] ->
E[ X X X ]
+ (-3.0) E[ X X ] E[ X ]
+ (2.0) E[ X ] E[ X ] E[ X ]
AfromE[ m3 ]
Input AfromE[ m3 ] ->
( 1.0
-8.0u u2
+ 8.0u1 u2
+ 7.0u1
-4.0u) A[ X X X ]
+ ( -3.0
-12.0u1 u2
-9.0u1) A[ X X ] A[ X ]
+ ( 2.0
+ 4.0u1 u2
+ 2.0u1
+ 4.0u2) A[ X ] A[ X ] A[ X ]
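The two outputs above can be checked independently. The Python sketch below assumes a small hypothetical distribution for X and uses exact fractions. It confirms that the three-term expression for m3 equals the third central moment, that the printed coefficients in u, u1 and u2 collapse to the classical k-statistic factor n^2/((n-1)(n-2)), and that the estimate is unbiased when averaged over every possible sample. The distribution and function names are illustrative only and do not come from the applet.

```python
from fractions import Fraction
from itertools import product

# Hypothetical skewed distribution for X: value -> probability.
DIST = {0: Fraction(1, 2), 1: Fraction(1, 3), 4: Fraction(1, 6)}


def moment(r):
    """E[X^r], i.e. E[X], E[X X], E[X X X] in the applet's notation."""
    return sum(p * x**r for x, p in DIST.items())


def third_cumulant():
    """The expression printed for m3 = EfromK[ K[X, X, X] ]."""
    return moment(3) - 3 * moment(2) * moment(1) + 2 * moment(1) ** 3


def third_central_moment():
    """E[(X - E[X])^3], which should equal the third cumulant."""
    mean = moment(1)
    return sum(p * (x - mean) ** 3 for x, p in DIST.items())


def applet_coefficients(n):
    """Coefficients printed for AfromE[ m3 ], with u = 1/n, u1 = 1/(n-1), u2 = 1/(n-2)."""
    u, u1, u2 = Fraction(1, n), Fraction(1, n - 1), Fraction(1, n - 2)
    c_xxx = 1 - 8 * u * u2 + 8 * u1 * u2 + 7 * u1 - 4 * u
    c_xx_x = -3 - 12 * u1 * u2 - 9 * u1
    c_x_x_x = 2 + 4 * u1 * u2 + 2 * u1 + 4 * u2
    return c_xxx, c_xx_x, c_x_x_x


def estimate(sample):
    """The unbiased estimate evaluated on one sample (requires n >= 3)."""
    n = len(sample)
    a1 = Fraction(sum(sample), n)                 # A[X]
    a2 = Fraction(sum(x * x for x in sample), n)  # A[X X]
    a3 = Fraction(sum(x**3 for x in sample), n)   # A[X X X]
    c_xxx, c_xx_x, c_x_x_x = applet_coefficients(n)
    return c_xxx * a3 + c_xx_x * a2 * a1 + c_x_x_x * a1**3


def expected_estimate(n):
    """Expectation of the estimate over all samples of n independent draws."""
    total = Fraction(0)
    for sample in product(DIST, repeat=n):
        prob = Fraction(1)
        for x in sample:
            prob *= DIST[x]
        total += prob * estimate(sample)
    return total
```

For n = 3, 4 or 5, expected_estimate(n) should equal third_cumulant() exactly, which is what "unbiased" means here.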
This example involves 3 steps. In the first, expectations of likelihood derivatives are specified to simplify the resulting expressions. Step 2 illustrates the computation of the mle, the root of the average first derivative of the likelihood function, to order n^(-3/2). In the third step this result is used to expand the average likelihood ratio statistic, multiply it by 2 and then compute the expected value of the average deviance. Remember that l denotes the likelihood of one observation, l1 the first derivative of l, l2 the second derivative of l ...
Set[ E[ l2 ] = -1];Set[ E[l1 l1] = 1]; Set[ E[l] = 0]
Input Set[ E[ l2 ] = -1];Set[ E[l1 l1] = 1]; Set[ E[l] = 0] ->
Root[l, 1, AE[3, theta] ]
Input Root[l, 1, AE[3, theta] ] ->
Series
(theta)
+ Z[ l1 ]
+ (0.5E[ l3]) Z[ l1 ] Z[ l1 ]
+ Z[ l1 ] Z[ l2 ]
+ ( 0.5E[ l3]^2
+ 0.167E[ l4]) Z[ l1 ] Z[ l1 ] Z[ l1 ]
+ (1.5E[ l3]) Z[ l1 ] Z[ l1 ] Z[ l2 ]
+ (0.5) Z[ l1 ] Z[ l1 ] Z[ l3 ]
+ Z[ l1 ] Z[ l2 ] Z[ l2 ]
Theta = AE[4, theta]; hdev = Taylor[l,0, Root[l, 1, Theta] ] - Taylor[l, 0, Theta];dev = hdev + hdev; EfromA[dev]
Input Theta = AE[4, theta]; hdev = Taylor[l,0, Root[l, 1, Theta] ] - Taylor[l, 0, Theta];dev = hdev + hdev; EfromA[dev] ->
Series
+
+ (u)
+
+ ( 2.0E[ l1 l2]^2 u^2
+ 3.0E[ l1 l2] E[ l3] u^2
+ E[ l1 l3] u^2
+ E[ l1^2 l2] u^2
+ 0.333E[ l1^3] E[ l3] u^2
+ E[ l2^2] u^2
+ 0.75E[ l3]^2 u^2
+ 0.25E[ l4] u^2 )
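As a sanity check on this output, consider a Gaussian location model with unit variance. The model is an illustrative assumption and not part of the applet session. The additive constant in l is chosen so that E[l] = 0, matching the Set commands above:

```latex
\begin{align*}
l &= \tfrac{1}{2}\bigl(1 - (X - \theta)^2\bigr), \quad
  l1 = X - \theta, \quad l2 = -1, \quad l3 = l4 = 0 \\
E[l] &= 0, \quad E[l1\, l1] = 1, \quad E[l2] = -1 \\
\mathrm{mle} &= A[X] \\
\mathrm{dev} &= 2\bigl(A[l](\mathrm{mle}) - A[l](\theta)\bigr) = (A[X] - \theta)^2 \\
E[\mathrm{dev}] &= u \\
\text{coefficient of } u^2 &= E[l1^2\, l2] + E[l2^2] = -1 + 1 = 0
\end{align*}
```

In this model the expected deviance is exactly u, and the only non-zero terms in the u^2 coefficient of the expansion cancel, so the series agrees with the exact result.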
This example illustrates the expansion of cumulants of the standard t-statistic. In step 1, expectations of the random variable are specified to simplify the resulting expressions. In step 2, asymptotic expansions of the average and of the Gaussian mle estimate of the variance are specified to order n^(-4/2). In step 3, the t-statistic is defined as the average times the inverse of the square root of the variance estimate. Note that the usual t-statistic differs by the root of n/(n-1). In the remaining steps, the expansions of the cumulants of this t-statistic are computed.
Set[E[X] = 0]; Set[E[X X] = 1]
Input Set[E[X] = 0]; Set[E[X X] = 1] ->
xbar = AE[4, A[X]]; vhat = AE[4, A[X X]] - xbar xbar
Input xbar = AE[4, A[X]]; vhat = AE[4, A[X X]] - xbar xbar ->
Series
1
+ Z[ X X ]
+ (-1.0) Z[ X ] Z[ X ]
+
+
sdinv = (vhat)^-0.5; tstat = xbar sdinv
Input sdinv = (vhat)^-0.5; tstat = xbar sdinv ->
Series
+ Z[ X ]
+ (-0.5) Z[ X X ] Z[ X ]
+ (0.375) Z[ X X ] Z[ X X ] Z[ X ]
+ (0.5) Z[ X ] Z[ X ] Z[ X ]
+ (-0.312) Z[ X X ] Z[ X X ] Z[ X
X ] Z[ X ]
+ (-0.75) Z[ X X ] Z[ X ] Z[ X ] Z[
X ]
EfromA[tstat]
Input EfromA[tstat] ->
Series
+
+ (-0.5E[ X^3] u)
+
+ ( -0.937E[ X^3] E[ X^4] u^2
-1.562E[ X^3] u^2
+ 0.375E[ X^5] u^2 )
EfromKA[tstat, tstat]
Input EfromKA[tstat, tstat] ->
Series
+
+ (u)
+
+ ( 1.75E[ X^3]^2 u^2
+ 3.0u^2 )
EfromKA[tstat, tstat, tstat]
Input EfromKA[tstat, tstat, tstat] ->
Series
+
+
+
+ (-2.0E[ X^3] u^2 )
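For Gaussian data these expansions can be compared with known exact results. With E[X^3] = 0 the first and third cumulants vanish to the order shown, as symmetry requires, and the second cumulant reduces to u + 3u^2. Because tstat here is the usual Student t-statistic divided by the root of n-1, and Student's t with n-1 degrees of freedom has variance (n-1)/(n-3), the exact variance of tstat is 1/(n-3) for n > 3. The Python sketch below compares the two; the function names are hypothetical.

```python
from fractions import Fraction


def exact_variance(n):
    """Exact variance of xbar / sqrt(vhat) for Gaussian data, n > 3.

    Student's t with n-1 degrees of freedom has variance (n-1)/(n-3);
    dividing the statistic by sqrt(n-1) divides the variance by n-1.
    """
    return Fraction(1, n - 3)


def expansion_variance(n):
    """u + 3 u^2: the EfromKA[tstat, tstat] output with E[X^3] = 0."""
    u = Fraction(1, n)
    return u + 3 * u**2


def truncation_error(n):
    """Difference between the exact variance and the two-term expansion."""
    return exact_variance(n) - expansion_variance(n)
```

The difference works out to 9/(n^2 (n-3)), which is of order u^3, so the expansion is correct through the u^2 term.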
As seen in the figure, PureStats has a number of windows and buttons. The main window is divided into several areas for input and output. The topmost area is for logging all input and output. Next is a one-line message area which displays short instructions for both input and output. The third area is the input window. At the bottom is a series of buttons. In addition to the main window, PureStats opens an auxiliary help window that displays further information about the buttons used and calculations performed.
If you push a button, sample input is placed in the input window, where it may then be edited. A more complete description appears in the auxiliary help window. To calculate the expression in the input window, push CALC (or ENTER). Also note the Scroll and Noscroll buttons, which apply to the logging area at the top of the main window.
If the display does not seem appropriate, save the page source and modify the height and width to suit your display. To do this open Source Code. Unzip the contents. The file small.html may then be modified to change the window size and opened either with a browser or an appletviewer.
The content of the applet owes much to Peter McCullagh's wonderful book Tensor Methods in Statistics (1987). Many people assisted in the development of this applet. Hanna Jankowski provided many frank and helpful comments which guided the development of the applet interface. Radford Neal, Jeffrey Rosenthal, and Keith Knight offered many helpful suggestions on the interface. Jamie Stafford provided the inspiration by pointing out the list handling of Java classes. Duncan Andrews, Wayne Oldford and John Chambers helped in the off-site testing of the applet. Stuart Andrews made significant improvements to this page and suggestions for other changes. The editors of the Canadian Journal of Statistics confirmed my views on the elegant simplicity and usefulness of the non-Greek, subscript-free notation. I am grateful to all.
For more information see Asymptotic Expansions of Moments and Cumulants, a PostScript version of the paper presented at the Workshop on Symbolic Computation in Statistics, Montreal, September 1997. This presents basic procedures for symbolic computation for statistical inference, with examples including those of symbolic bootstrap calculations.
For a more extensive presentation see Andrews and Stafford, Symbolic Computation for Statistical Inference, Oxford University Press.