Current Research
I work in the area of statistical inference and some of the computational
problems that arise when we want to apply statistical methodology.
Most of my recent work is in the area of Bayesian inference where
the computational challenges typically involve the need to evaluate
integrals.
Bayesian Inference
Consider a situation where we have specified a sampling model and a proper
prior for the model parameter, and these are the only ingredients, i.e.,
we do not specify a loss or utility function.
The principle of conditional probability, as expressed via Bayes' theorem,
tells us that any probability statements we make after seeing the
data must be based on the posterior. This does not tell
us, however, what form inferences should take. For this we
need additional requirements that we
want inferences to satisfy.
My research is concerned with what these should be when we have
a proper prior and has led to
consideration of the measurement of the concept of surprise,
invariance of inferences under reparameterizations, and optimality
of Bayesian inferences with respect to repeated sampling behavior.
- Bayesian inference procedures derived
via the concept of relative surprise. M. Evans.
Communications in Statistics, Vol. 26, No. 5, p. 1125-1143, 1997.
The basic idea is that inferences are based on how beliefs change from a priori
to a posteriori, i.e., we base inferences on how the data has changed beliefs.
Often these inferences are very similar to more traditional Bayesian inferences
but in some cases they are quite different.
- Robustness of relative surprise
inferences to choice of prior. M. Evans and T. Zou. Recent Advances in Statistical Methods,
Proceedings of Statistics 2001 Canada: The 4th Conference in Applied
Statistics Montreal, Canada 6 - 8 July 2001, Yogendra P. Chaubey (ed.),
p. 90-115, Imperial College Press. This paper shows that relative surprise
inferences have superior
robustness properties, with respect to choice of prior, when compared
to traditional Bayesian inferences.
This paper also shows how relative surprise inferences can be thought of
as a generalization of, and calibration of, Bayes factors.
- Optimality and computations
for relative surprise inferences. M. Evans, I. Guttman
and T. Swartz. Canadian Journal of Statistics, Vol. 34, No. 1, 2006, p. 113-129.
This paper establishes an optimality property for relative
surprise inferences. In particular, relative surprise inferences are
hpd-like in the sense that a relative surprise region, containing
gamma of the posterior probability, has the smallest prior
content among all sets with this property.
- Consistency of Bayesian estimates for the sum of squared normal means
with a normal prior. M. Evans and M. Shakhatreh.
Technical Report No. 0607, Feb. 21, 2007.
This paper deals with a well-known example that is commonly cited as
showing that Bayesian procedures can lead to poor behavior such
as inconsistency. Rigorous consistency results are established
and it is shown that the estimate based on relative surprise
(the LRSE or least relative surprise estimator) is consistent.
This establishes that neither Bayesian procedures nor the choice of
prior is at fault; rather, it is the use of procedures such as
the posterior mean and mode that leads to the inconsistency.
- Optimal properties of some Bayesian
inferences. M. Evans
and M. Shakhatreh. Technical Report No. 0710, October 1, 2007
and in the Electronic Journal of Statistics, Vol. 2, 2008,
p. 1268-1280.
This paper establishes that relative surprise
regions have optimal properties in repeated sampling in the sense
that they minimize the probability of covering false values. Further,
it is shown that relative surprise regions are unbiased and maximize
Bayes factors.
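The relative surprise regions described in the entries above can be illustrated with a small sketch. It assumes a discrete grid of parameter values, so that the ratio of posterior to prior probability is easy to compute, and it builds the region greedily from the values whose belief increased the most. The function name, the binomial example, and the prior weights are all hypothetical choices made for illustration and are not taken from the papers.

```python
from math import comb

def relative_surprise_region(thetas, prior, likelihood, gamma):
    """Greedy sketch of a gamma relative surprise region on a discrete grid.

    Values are ranked by posterior/prior (the change in belief) and added
    until the region holds at least gamma of the posterior probability.
    """
    joint = [p * l for p, l in zip(prior, likelihood)]
    m = sum(joint)  # prior predictive probability of the observed data
    posterior = [j / m for j in joint]
    # Ratio of posterior to prior belief for each parameter value.
    rb = [post / p for post, p in zip(posterior, prior)]
    order = sorted(range(len(thetas)), key=lambda i: rb[i], reverse=True)
    region, content = [], 0.0
    for i in order:
        if content >= gamma:
            break
        region.append(thetas[i])
        content += posterior[i]
    return sorted(region), posterior, rb

# Hypothetical data: 7 successes in 10 Bernoulli trials.
thetas = [i / 20 for i in range(1, 20)]
weights = [(1 - t) ** 2 for t in thetas]  # assumed prior favouring small theta
prior = [w / sum(weights) for w in weights]
n, x = 10, 7
likelihood = [comb(n, x) * t**x * (1 - t) ** (n - x) for t in thetas]

region, posterior, rb = relative_surprise_region(thetas, prior, likelihood, 0.9)
print(region)
```

In this full-parameter setting the ratio of posterior to prior equals the likelihood divided by the prior predictive, so the ranking does not depend on the prior, although the posterior content of the region does.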
Model Checking and Checking for Prior-data Conflict
A common objection to Bayesian inference is the fact that the
prior represents the subjective beliefs of the statistician or analyst.
In many contexts we don't want inferences to depend
on the particular analyst.
Bayesians often respond to this by pointing out that the sampling
model is also a subjective choice made by the statistician.
For many scientific analyses,
the objections concerning the subjective nature of the choices
made by the analyst, and this applies both to the sampling
model and the prior, seem quite sensible to me.
The way out of this
dilemma for me is through checking that the sampling model makes sense in light
of the data collected and, when the sampling model makes sense,
that there is no prior-data conflict. The sampling model makes sense
when there is at least one distribution in the model for which the
observed data is not surprising. Then, when this is the case, there
is a prior-data conflict when the prior places its mass primarily
on distributions in the model for which the observed data is surprising.
Model failure is very serious as it makes us doubt the validity
of the inferences drawn. The existence of a prior-data conflict
can also lead us to doubt the validity of our inferences but, provided
we have enough data, its effects can be ignored. For this reason we
should check separately for model error and for prior-data conflict.
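As a rough illustration of what a check for prior-data conflict might look like, the sketch below treats a normal location model with known standard deviation and a normal prior on the mean. It assumes, beyond what is stated above, that the check compares the observed sample mean with its prior predictive distribution and reports the prior predictive probability of a sample mean at least as far from the prior mean. The function name and all numerical values are hypothetical.

```python
from math import erfc, sqrt

def prior_data_conflict_pvalue(xbar, n, sigma, mu0, tau0):
    """Tail probability of the observed sample mean under the prior predictive.

    Assumed model: x_1, ..., x_n i.i.d. N(mu, sigma^2) with sigma known and
    prior mu ~ N(mu0, tau0^2), so that the sample mean has prior predictive
    distribution N(mu0, tau0^2 + sigma^2 / n).
    """
    sd = sqrt(tau0**2 + sigma**2 / n)
    z = abs(xbar - mu0) / sd
    # Two-sided normal tail area: 2 * (1 - Phi(z)) = erfc(z / sqrt(2)).
    return erfc(z / sqrt(2.0))

# Hypothetical numbers: the prior is centred at 0 with standard deviation 1.
print(prior_data_conflict_pvalue(xbar=0.3, n=20, sigma=1.0, mu0=0.0, tau0=1.0))
print(prior_data_conflict_pvalue(xbar=5.0, n=20, sigma=1.0, mu0=0.0, tau0=1.0))
```

A small value suggests that the prior places most of its mass on means for which the observed data would be surprising. A value near 1 gives no indication of conflict.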
- Checking for prior-data conflict. M. Evans
and H. Moshonov. Technical Report
0413, February, 2005 and published in Bayesian Analysis, Volume 1, Number 4,
2006, pp. 893-914. This paper develops a methodology for checking
for prior-data conflict distinct from checking the sampling model. It
is shown that this leads to a partial characterization of noninformativity.
- Checking for prior-data
conflict with hierarchically specified priors. M. Evans
and H. Moshonov. Technical Report
0503, June, 2005 and published in
Bayesian Statistics and its Applications,
eds. A.K. Upadhyay, U. Singh, D. Dey, Anamaya Publishers, New Delhi,
2007, p. 145-159.
This paper extends the methodology for checking for prior-data
conflict to checking for prior-data conflict with individual
components when the prior is specified hierarchically.
- Comment on "Bayesian checking of the second levels
of hierarchical models" by Bayarri and Castellanos for Discussion
in Statistical Science. M. Evans. Technical Report No. 0708, July, 2007
and appeared in Statistical Science, Vol. 22, No. 3, p. 344-348.
- Invariant P-values for Model
Checking and Checking for Prior-data Conflict.
Michael Evans and Gun Ho Jang. Technical Report No. 0803, June, 2008
(a modified version has been accepted for publication
in the Annals of Statistics).
This paper discusses in general a sensible definition of a P-value
when we observe data from a single specified distribution. While it is easy
to see what an appropriate definition is in the discrete case, the natural
analog in the continuous case has a serious defect in that it is not
invariant under a change of variable. A reasonable way around this problem
is to require that P-values for general discrepancy
statistics not depend on volume distortions. The results
are applied in model checking and checking for prior-data conflict contexts.
The original report was modified to add an example where it is shown that
the conditions of Theorem 2, beyond continuity of the density, are necessary.
- The information in one prior
relative to another.
Michael Evans and Gun Ho Jang. Technical Report No. 0809, November, 2008.
This uses the methods for checking for prior-data conflict in an
attempt to answer a question raised by Andrew Gelman, namely, providing
a definition of what it means for a prior to be weakly informative
with respect to a base prior.
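For the discrete case mentioned in the entry on invariant P-values above, one natural reading (an assumption here, since the summary does not spell out the definition) is to take the P-value as the probability of observing an outcome that is no more probable than the one actually observed. The sketch below uses a hypothetical helper and a binomial example to show this, and also shows that relabelling the outcomes leaves the value unchanged.

```python
from math import comb

def discrete_pvalue(pmf, x0):
    """Probability of an outcome no more probable than the observed x0.

    pmf maps each possible outcome to its probability under the single
    specified distribution.
    """
    p0 = pmf[x0]
    return sum(p for p in pmf.values() if p <= p0)

# Hypothetical example: number of heads in 10 tosses of a fair coin.
pmf = {k: comb(10, k) * 0.5**10 for k in range(11)}
print(discrete_pvalue(pmf, 2))

# Relabelling the outcomes does not change the P-value.
relabelled = {f"outcome_{k}": p for k, p in pmf.items()}
print(discrete_pvalue(relabelled, "outcome_2"))
```

In the continuous case the analogous quantity uses a density in place of the probability function, and a density changes under a change of variable, which is the defect the paper addresses.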
Integration
- Discussion of Nested Sampling for
Bayesian Computations by John Skilling. M. Evans.
Technical Report No. 0608, September 1, 2006, and appeared in Bayesian Statistics 8, Proceedings of the Eighth Valencia International Meeting, June 2-6, 2006, eds. Bernardo, J.M., Bayarri, M.J., Berger, J.O., Dawid, A.P., Heckerman, D., Smith, A.F.M. and West, M., p. 507-512.
This paper discusses
Skilling's algorithm, expressing the method in more common statistical
terminology, and proving convergence of the algorithm under certain conditions.
- Fast and accurate calculation of a computationally intensive
statistic for mapping disease genes.
Seok, S-C., Evans, M. and Vieland, V.J. (2009).
Journal of Computational Biology. May 2009, 16(5), 659-676.