STA 437 is the undergraduate version of this course. STA 1005 is the graduate version, which may be taken for credit only by graduate students who are not in Statistics.
Phone: (416) 978-4970
Mondays 6:10pm to 9:00pm, from September 13 to December 6, PLUS 6:10pm to 9:00pm Wednesday December 8, EXCEPT for October 11 (Thanksgiving) and November 8 (term break). The December 8 lecture will not cover new material, and so could be skipped by students unable to come Wednesday evening.
Lectures are in Sidney Smith Hall, 100 St. George Street, room 1070.
I will hold office hours Tuesdays, from 4:30pm to 5:30pm, starting immediately. My office is in Sidney Smith Hall, 6th floor, room 6026A. Come see me if you have any administrative issues, or if you just want to discuss the course material.
The TA, L. Xu, will be providing help in the stat aid centre (SS 2133) Wednesdays from 3:10 to 4:00.
B. S. Everitt and G. Dunn, Applied Multivariate Data Analysis, 2nd edition.
Assignments will require use of the R statistics package. You can use this package on the CQUEST computer system, or install it for free on your own computer (running Microsoft Windows, Macintosh OS, or Linux).
Once classes start, you'll be able to get a CQUEST account at www.cquest.utoronto.ca.
The R package and documentation are at www.r-project.org. Here are some direct links to things available there:
- An introduction to R
- Current version of R for Windows. Click on the link for R 2.11.1 to download the setup program.
- R for Mac OS X.
- R for Linux. Some Linux distributions may come with R installed. Try typing "R".
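Once R starts, a quick check along the following lines can confirm that the installation is usable. This is only a minimal sketch: it assumes nothing beyond base R, and the variable names are just for illustration. The lattice package is checked because it normally comes bundled with R.

```r
# Report which version of R is running; this works in any R session.
r_version <- R.version.string
cat("Running:", r_version, "\n")

# lattice is a recommended package that normally ships with R;
# requireNamespace() returns TRUE if it can be loaded, FALSE otherwise.
has_lattice <- requireNamespace("lattice", quietly = TRUE)
cat("lattice available:", has_lattice, "\n")
```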
45% Final exam, scheduled by the Faculty during the exam period.
20% Two assignments, each worth 10%, tentatively due Nov. 1 and Dec. 6.
10% Two in-class quizzes, each worth 5%, on Sep. 27 and Nov. 15.
25% Mid-term test, tentatively scheduled for Oct. 18, 6:10-8:00, followed by discussion of answers (we will discuss in the first class whether this time is suitable).
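The four weights above add up to 100%, so the course mark is a weighted average of the four components. As a rough illustration in R, with component scores that are entirely made up (each out of 100):

```r
# Weights from the marking scheme, in percent.
weights <- c(final = 45, assignments = 20, quizzes = 10, midterm = 25)
stopifnot(sum(weights) == 100)

# Hypothetical component scores out of 100 (invented for illustration only).
scores <- c(final = 70, assignments = 85, quizzes = 80, midterm = 75)

# Weighted average; indexing by name keeps weights and scores aligned.
course_mark <- sum(weights[names(scores)] * scores) / 100
print(course_mark)
```

This sketch does not cover a missed quiz or test, since how that weight is moved to other work is decided case by case.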
Late assignments will be accepted only for legitimate reasons, such as illness. Answers to the quizzes and the mid-term test will be discussed immediately afterward. If anyone misses a quiz or the mid-term test for a legitimate reason, that part of their course mark will be taken from other work.
Assignment #1: Handout. Due November 15 (changed from original date). Here are the data files:
- Part I: wind speeds on Sundays, for Unix/Linux/Mac, or Microsoft Windows
- Part I: wind speeds on Mondays, for Unix/Linux/Mac, or Microsoft Windows
- Part II: gene expression data, for Unix/Linux/Mac, or Microsoft Windows
- Part II: patient information, for Unix/Linux/Mac, or Microsoft Windows
Here are the R hints for assignment 1. Here are the solutions to Part I and Part II. Note that these are just example solutions. You don't necessarily have to have done everything that I did, and at some points, it would be reasonable to make different analysis decisions than I did.
Assignment #2: Handout. Due December 6. Here are the R hints for assignment 2. A model solution is here.
Quizzes & mid-term test:
Quiz #1: Held Sept. 27. Here is the quiz paper with answers.
Mid-term test: Held Oct. 18. Here is the test paper.
The final exam schedule is here.
Here are two old exams: 2008, 2009.
Other useful information:
Lecture notes. Updated 2010-11-26. Note that not all the material we've covered is mentioned in these notes, which mostly just summarize definitions and some theory.
Lattice graphics in R.
Paper on visual significance tests by Buja, et al. The R function I used for the class demo is here.
Air pollution data, from Table 2.5 in the textbook: Unix/Linux/Mac, or Microsoft Windows. Figure 2.13 can be approximately reproduced in R with
    library(lattice)
    air <- read.table("air.dat", header=TRUE)
    xyplot(SO2 ~ wind | equal.count(temp,number=6), data=air)
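If air.dat is not at hand, the same kind of conditioned scatterplot can be tried on the airquality data set that comes with R. Note that this is a substitute chosen for illustration, not the textbook's Table 2.5, so the variables differ (Ozone, Wind and Temp rather than SO2, wind and temp) and the result will not look like Figure 2.13.

```r
library(lattice)

# Substitute data: R's built-in 'airquality' data set (New York, 1973),
# NOT the textbook's air pollution table.
aq <- airquality

# Split temperature into six overlapping ranges ("shingle" levels),
# each containing roughly the same number of observations.
temp_shingle <- equal.count(aq$Temp, number = 6)

# One panel per temperature range; rows with missing Ozone are not plotted.
p <- xyplot(Ozone ~ Wind | temp_shingle, data = aq,
            xlab = "Wind (mph)", ylab = "Ozone (ppb)")
print(p)
```

Conditioning on a shingle rather than on a factor is what lets a continuous variable such as temperature define the panels.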
Glucose data, from Table 8.8 in the textbook: Unix/Linux/Mac, or Microsoft Windows.
R functions for multivariate confidence intervals: Unix/Linux/Mac, or Windows.
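For orientation, here is a rough sketch of two standard kinds of simultaneous confidence intervals for the components of a mean vector: intervals based on Hotelling's T-squared, and Bonferroni intervals. This is not the code in the files linked above. The function name simultaneous_ci and the use of R's built-in iris data are assumptions made purely for illustration, and the intervals rely on the usual assumption of a multivariate normal sample.

```r
# Simultaneous confidence intervals for the p component means of an
# n-by-p data matrix (hypothetical helper, not the course's own function).
simultaneous_ci <- function(X, level = 0.95) {
  X <- as.matrix(X)
  n <- nrow(X)
  p <- ncol(X)
  stopifnot(n > p)

  xbar  <- colMeans(X)
  se    <- sqrt(diag(var(X)) / n)   # standard error of each component mean
  alpha <- 1 - level

  # T-squared multiplier: sqrt( p(n-1)/(n-p) * F quantile with (p, n-p) df )
  t2_mult <- sqrt(p * (n - 1) / (n - p) * qf(level, p, n - p))

  # Bonferroni multiplier: t quantile with alpha split over the p intervals
  bonf_mult <- qt(1 - alpha / (2 * p), df = n - 1)

  cbind(mean       = xbar,
        T2.lower   = xbar - t2_mult * se,
        T2.upper   = xbar + t2_mult * se,
        Bonf.lower = xbar - bonf_mult * se,
        Bonf.upper = xbar + bonf_mult * se)
}

# Example: the four measurements on the 50 setosa flowers in 'iris'.
ci <- simultaneous_ci(iris[iris$Species == "setosa", 1:4])
print(round(ci, 3))
```

The T-squared intervals hold simultaneously for all linear combinations of the means, which is why they come out wider than the Bonferroni intervals when only the p component means are of interest.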
Web pages for previous versions of this course: Fall 2009