STA450S/4000S: Topics in statistics Spring 2005
Statistical Aspects of Data Mining
Meets in Lash Miller 158, Wednesday 1-3 and Friday at 1.
Note: On Fridays the class is given jointly with STA 410S.
This course will consider topics in statistics that have played a role in
the development of techniques for data mining and machine learning. We will
cover linear methods for regression and classification, nonparametric
regression and classification methods, generalized additive models, aspects
of model inference and model selection, model averaging and tree based
Prerequisite: Either STA 302H (regression) or CSC 411H (machine learning).
This course is also offered to graduate students, either MSc or PhD, as a
Graduate students should consult the Statistics Graduate Office, Sidney Smith
Textbook: Hastie, Tibshirani and Friedman. The Elements of Statistical Learning.
April 6, 2005
March 30, 2005
March 23, 2005
March 16, 2005
March 9, 2005
March 2, 2005
February 25, 2005
February 23, 2005
February 9, 2005
- Class cancelled due to illness
- Homework 2 is available: due March 2
February 2, 2005
January 28, 2005
January 26, 2005
January 19, 2005
January 12, 2005
January 7, 2005: Trying R
Once you have a Cquest account, or have downloaded R onto your computer, try
running R and having a look at the prostate cancer data set. (On Cquest, just
type R in a terminal window, or find the "CQUEST" icon, I think under the
"Start" menu, and double-click on R.)
This is available on the
web page, but I have also put it into /u/reid on Cquest, so the following
Then you can try things like dim(pr) and names(pr) and so on.
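A minimal sketch of that kind of first look. The command for reading the prostate data is not reproduced here, so this uses R's built-in mtcars data frame as a stand-in (an assumption made purely for illustration). The name pr follows the text.

```r
# Stand-in data: mtcars ships with R. It is NOT the prostate cancer data;
# it is used here only so the commands run without any external file.
pr <- mtcars

dim(pr)       # number of rows and number of columns
names(pr)     # variable (column) names
head(pr)      # first six rows
summary(pr)   # min, quartiles, mean and max for each numeric column
str(pr)       # compact overview of column types
```

With the real data set loaded into pr, the same five commands apply unchanged.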
January 5, 2005
If you missed the class and are interested in taking it, please
email me as soon as possible.
Undergraduates have the choice of downloading R to their own computer, or using
R or Splus on Cquest. You are also welcome to use Matlab. To get an account on
Cquest, go to the Cquest home page and request one.
There is a wealth of information about the NMMAPS study on
Francesca Dominici's web site.