SSC2006


Last year I was most gratified to receive the Gold Medal of the Statistical Society of Canada. I thank the great number of friends who must have worked very hard for this. That was very kind.

This page contains links to material contained in the acceptance address "Statistical Inference for Public Policy" presented at the SSC annual meeting May 30, 2006.

Slides

A pdf file of the slides presented in the talk are at SSC 2006 slides (~700K).

Symbolic Computation in R

Symbolic Computation for Statistical Inference has been implemented in R.  Expressions are computed.  These can be transformed into R functions for numerical evaluation.  The R procedures are found at Symbolic Computation in R (4K)

Examples of Symbolic R

Example 1.  The expected value of the bootstrap variance of the square of a mean.

source(file="http://www.fisher.utstat.utoronto.ca/david/Sym2006/SCSI.1.0.2.txt")

set.seed(42); n<- 20; y <- rnorm(n); # y Generate Gauss; 42 is the answer to everything.
AA <- S(A(X) * A(X));  V <- S( EA(AA * AA) - EA(AA) * EA(AA) )

# Compute the bootstrap estimate of the variance, make it a function and evaluate it.
BV <- S( BE(V) ); BVF <- Sfn( BV, X)(y)*n*n # -> 9.26
 E <- GaussMoment;  Sfn( V, X )(y)*n*n # Gauss -> 2

# Compute the expected value of the bootstrap, make it a function and evaluate it.
EBV <- S( EA( BV ) ); Sfn( EBV, X )(y) * n * n # -> 5.78

# Compute the function for the unbiased estimate, make it a function and evaluate it.
UV <- S( AE( V ) ); Sfn( UV, X )(y) *n * n #-> 0.91

# Compute the expected value of unbiased estimate, make it a function and evaluate it.
EUV <- S( EA( UV ) );  Sfn( EUV, X )(y) * n * n # -> 2

Example 2.  The expected value of the deviance.

In this example, the mle is expanded and used to compute the expansion of the deviance.  The expectation of the deviance are computed.  

 theta0 <- S( APPROX(t0, 4) );  thetahat <- S( InverseA( l(theta0, 1) ) )
 hadev <- S( A(l(thetahat)) - A(l(theta0)) );  S( EZ(hadev + hadev) )


Symbolic Computation in browser

Symbolic computation for statistical inference may be easily done using a standard browser.  Details are found at PureStats (16K).

Examples

The code for the examples may be pasted in the input window above the buttons of the calculator.

Example 1.  The expected value of the bootstrap variance of the square of a mean.

In ths example the variance V of AA, the square of an average is computed.  The bootstrap variance is the same expression with expectations considered as averages.  The expected value of the expression involving averages is computed.

AA = A[X] A[X]; V = EfromKA[ AA, AA ]; EfromA[ V ] 

Example 2.  The expected value of the deviance.

The expression for the mle is used to compute the expression for the drop in log-liklihood or deviance.  The expected value of the deviance is computed.

Theta = AE[4, theta]; hdev = Taylor[l,0, Root[l, 1, Theta] ] - Taylor[l, 0, Theta];
dev = hdev + hdev;  EfromA[dev]

Example 3.  The third moment of the signed root.

The expression for the signed root of the drop in log-liklihood or deviance is used to compute it's third moment.   From the Bartlett identities this is found to be 0 to order n^-5/2 .

Set[ E[ l2 ] = -1];Set[ E[l1 l1] = 1]; Set[ E[l] = 0]
Theta = AE[4, theta]; t = Root[l, 1, Theta];dt = t - Theta;
rf2 = -1 Taylor[l,2,Theta]; rf3 = -0.66667 Taylor[l,3,Theta] dt;
rf4 = - 0.25 Taylor[l,4,Theta] dt dt; rf = rf2 + rf3 + rf4;
sr1 = (rf)^0.5; sr = sr1 dt;
EfromKA[sr, sr]
EfromKA[sr,sr,sr]

Cox-Tukey Confidence Intervals

The R code and an example of the Cox-Tukey intervals are given.  At this point there are two files: one for glm and one for coxph.  The code is designed to be general.  We hope to have only one file in the future.

Example

Example 1. Cox proportional hazards for Stanford Heart Transplant data.

In this example to Cox-Tukey interval is computed for the age effect on the Stanford Heart Transplant data.  The complete data are selected the significance of the age effect is computed and the ACox-Tukey interval is computed.  Note the calculation will take a few minutes -- which is small compared to the 52 lifetimes which were lost in the study.

source(file="http://www.fisher.utstat.utoronto.ca/david/SCS2006folder/coxtukeyph.txt")

library(survival)
good <- !is.na(stanford2$t5);  time <- stanford2$time[good]
age <- stanford2$age[good]; t5 <- stanford2$t5[good]
status <- stanford2$status[good]; n <- length(t5); w <- rep(1,n)
f = as.formula("Surv(time,status)~t5")
2* sig(0,coxph,f,age,w) #two sided p-value
cti(coxph,f,age,w,alpha=0.025)

Example 2.  Binomial glm for Stanford Heart Transplant data.

In ths example the same data is used for the confidence interval of the age effect in a logistic model for survival.   Note the calculation will take a few minutes -- which is small compared to the 52 lifetimes which were lost in the study.

source(file="http://www.fisher.utstat.utoronto.ca/david/SCS2006folder/coxtukeyglm.txt")
fg = as.formula("status~t5")
cti(glm,fg,age,w,alpha=0.025,family=binomial)

References

Bailar, J. C. (1992). Medical Uses of Statistics. 2nd Edition.NEJM Books. Boston See contributions by J. Bailar and L. Moses.

Cochrane, W. G. (1940). "The Analysis of Variance when Experimental Errors Follow the Poisson or Binomial Laws." Ann. Mth. Statist. 11, 335-347.

Cochrane, W. G. (1980). Fisher and the Analysis of Variance. in R. A. Fisher: An Appreciation. Springer.

Cox, D. R. (1990). Role of Models in Statistical Analysis. Statistical Science. 5. 169-174. Note, in particular, section 3.1

Feynmann R. P. (1999). The pleasure of finding things out : the best short works of Richard P. Feynman. Cambridge, Mass. : Perseus Books

Laplace, P. S. (1823). De l'action de la lune sur l'atmosphere. Annales de Chimie et de Physique. 24: 280-294

Stigler, S. M. (1986). The History of Satistics. Harvard University Press. Cambridge. For the use of total sum of squares see pages 153, 155.