STA450S/4000S: Topics in statistics Spring 2005
Statistical Aspects of Data Mining

Meets in Lash Miller 158, Wednesday 1-3 and Friday at 1.

**Note: On Fridays the course is given jointly with STA 410S.

Course Information
This course will consider topics in statistics that have played a role in the development of techniques for data mining and machine learning. We will cover linear methods for regression and classification, nonparametric regression and classification methods, generalized additive models, aspects of model inference and model selection, model averaging, and tree-based methods.

Prerequisite: Either STA 302H (regression) or CSC 411H (machine learning).
This course is also offered to graduate students, either MSc or PhD, as a reading course.
Graduate students should consult the Statistics Graduate Office, Sidney Smith Room 6022.

Textbook: Hastie, Tibshirani and Friedman. The Elements of Statistical Learning. Springer-Verlag.

Book web page

April 6, 2005

March 30, 2005

March 23, 2005

March 16, 2005

March 9, 2005

March 2, 2005

February 25, 2005

February 23, 2005

February 9, 2005

  • Class cancelled due to illness
  • Homework 2 is available: due March 2

February 2, 2005

January 28, 2005

January 26, 2005

January 19, 2005

January 12, 2005

January 7, 2005: Trying R

Once you have a Cquest account, or have downloaded R onto your computer, try running R and having a look at the prostate cancer data set. (On Cquest, just type R in a terminal window, or find the "CQUEST" icon, I think under the "Start" menu, and double-click on R.) The data set is available on the book web page, but I have also put it into /u/reid on Cquest, so the following *should* work:

pr <- read.table("/u/reid/450/", header=TRUE)

Then you can try things like dim(pr) and names(pr) and so on.
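
If you cannot get at the file yet, here is a rough sketch of the same kind of exploration on a small stand-in data frame. The name `toy` and all of the numbers in it are made up for illustration and are not taken from the prostate cancer data; only the functions (`dim`, `names`, `head`, `summary`, `str`) are the ones you would use on `pr`.

```r
# Toy stand-in for `pr`: the values below are invented for illustration only
# and are NOT from the prostate cancer data set.
toy <- data.frame(
  lcavol = c(0.5, 1.2, 2.3, 1.8),
  age    = c(50, 61, 58, 70),
  lpsa   = c(0.8, 1.9, 2.7, 2.2)
)

dim(toy)       # number of rows, then number of columns
names(toy)     # column (variable) names
head(toy, 2)   # first two rows
summary(toy)   # min, quartiles, mean and max of each column
str(toy)       # type and first few values of each column
```

Once `pr` has been read in, replacing `toy` by `pr` in the last five lines should give the corresponding output for the real data.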

January 5, 2005

If you missed the class and are interested in taking it, please email me as soon as possible.
Undergraduates have the choice of downloading R to their own computer, or using R or Splus on Cquest. You are also welcome to use Matlab. To get an account on Cquest, go to the Cquest home page and request an account.
There is a wealth of information about the NMMAPS study on Francesca Dominici's web site.