Stat 445: Introduction to Exploratory Data Analysis

This is archived information for Stat 445 Sect 201 (Spring, 2005).

Course News

April 26, 2005: The final projects have been marked and the grades calculated and submitted. You should be able to view them through the Student Service Centre shortly, I assume.

The final projects were generally very good. The mean and median project marks were 72% and 73% respectively, and the interquartile range was 66–77%. The final grades were calculated on the following basis:

Component             Marks
Lab Assignments          20
Midterm Project #1       15
Midterm Project #2       20
Final Proposal            3
Final Project            42

and the mean and median final grades were both 75% with an interquartile range of 70–80%, quite high compared to the marks the last time this course was taught. I guess I'm just a big softie.
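As a rough illustration of the arithmetic, here is a minimal sketch of how a final grade follows from the weights in the table above. It is written in Python rather than R, it assumes a plain weighted average with no curving or scaling applied, and the function name and the example student's scores are made up for illustration.

```python
# Component weights from the table above (percent of the final grade).
WEIGHTS = {
    "Lab Assignments": 20,
    "Midterm Project #1": 15,
    "Midterm Project #2": 20,
    "Final Proposal": 3,
    "Final Project": 42,
}


def final_grade(scores):
    """Weighted average of component scores, each given as a percentage (0-100)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())  # 100 for the weights above
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / total_weight


# Hypothetical student; these scores are invented for illustration only.
example_scores = {
    "Lab Assignments": 85,
    "Midterm Project #1": 70,
    "Midterm Project #2": 75,
    "Final Proposal": 100,
    "Final Project": 72,
}
print(round(final_grade(example_scores), 2))  # prints 75.74
```

Because the final project carries 42 of the 100 marks, a ten-point change in the project mark moves the final grade by 4.2 points under this assumption.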

I should be in and out of the office this week, for those who'd like to pick up their Midterm Project #2. In particular, I plan to be in LSK 362 most of the day on Thursday (knock loudly on the door to room 360 to get my attention), or you can make an appointment by email. Also, you can make an appointment to "view" your (group's) Final Project and the comments I made. However, I'll be treating the final projects like final exams, so I'll need to keep the originals.

I hope you all enjoyed the class (and that you still feel the same way after you've viewed your final grades). Have a good summer!

April 25, 2005: I'm mostly done marking the final projects, but I have a few more to finish up today, and then it'll take me some time to total all the marks up and see if any curving/scaling is called for. I'll post another note up here when I'm finished.

April 22, 2005: I've finished grading Midterm Project #2. You can see the marking results on the projects page. I'm still grading the final reports, and they won't be finished until Monday.

April 18, 2005: Your final projects are in, and I have a full week of grading ahead of me. I'll post a note here when I've finished marking Midterm Project #2 (which should be Wednesday or Thursday), and I'll have the final projects marked and your final grades calculated by next Monday, April 25.

Thanks for your attention this term, and I hope you enjoyed the class. Congratulations to those who are graduating this year (and good luck in your future studies to those who aren't)!

April 17, 2005: More Notes on Final Project: The project is due tomorrow (Monday) at 5pm. As with the previous project, you can hand it in…

I will confirm electronic submissions.

Also, it does not look as if I'll have Project #2 graded by tomorrow, as I originally thought. My office hours last week were a little busier than I expected. It'll take me at least until Wednesday and maybe Thursday before they're ready.

April 12, 2005: Notes on Final Project: I've put up some notes on the final project (page limits, etc.) on the project page. Remember that it's due next Monday, April 18.

I'll be holding extended office hours this week to offer help with the final projects. I'll be in LSK room 362 or 371. Both these rooms are in the closed area behind "room" 360. If that door is locked, knock loudly, and I should be able to hear you. My office hours are:

I'll be marking midterm #2 projects all this week, and they should probably be done by next Monday, but I'll post details here about when and where to pick them up.

April 10, 2005: Well, classes are over, obviously! Hope you enjoyed STAT 445, and good luck with your final exams.

Remember: Midterm Project #2 is due tomorrow (Monday, April 11) at 5:00pm. You can hand it in…

April 7, 2005: I've put up the slides I covered last class (sorry I forgot) and some more slides for our final class tomorrow. Also, I'll be holding some extra office hours in either LSK 371 or LSK 362 after class from 2:30-3:30.

April 5, 2005: I just put up the slides I covered yesterday. Sorry about the delay. Note that I have extended office hours this week (and I'll be in either LSK 371 or next door in LSK 362):

I'll also hold some office hours next week to help people with the final projects, but the times are yet to be announced.

Finally, note that I've extended the due date for Midterm Project #2 to Monday, April 11 at 5:00pm. I will work out the details of how it should be handed in and announce them in class and post them here.

April 1, 2005: I've put up a final version of the slides I covered on Wednesday and today.

March 29, 2005: I've put up a preliminary version of a few slides on generalized linear models (and logistic regression for binomial data in particular) from Chapter 7 for tomorrow's class.

March 23, 2005: The sample clinical trial report I mentioned in class is available on the projects page. I've also posted some additional slides about Project #1: I didn't get to them today, but I hope I'll be able to say a few words about them next Wednesday.

I do have office hours tomorrow (Thursday) at 11am. Of course, there's no class (or office hours) on Friday or Monday because the university is closed. Have a good long weekend, and I'll see you next Wednesday!

Note: Don't forget that Assignments #8 and #9 are both due next week.

March 22, 2005: I've put up some slides for tomorrow when I'll finish up the permutation test examples and talk about Projects #1 and #2 a bit.

March 20, 2005: Midterm Project #2, which I will hand out tomorrow in class, is currently available on the project page. Also available on that page are the Marking Results for Project #1 including links to six projects that received some of the highest marks.

March 16, 2005: Oops. A student pointed out that I made a mistake in Monday's lecture. The correct approximate distribution for Fisher's variance stabilization of the sample correlation coefficient is:

g(r) = (1/2) log((1+r)/(1-r)) is approximately N((1/2) log((1+rho)/(1-rho)), 1/(n-3))

(I might have written the variance incorrectly as 1/(n+3).) Note that the version printed on Assignment #8, Question 1(d) is correct.
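For anyone who wants to check the corrected variance numerically, here is a minimal sketch of the usual approximate confidence interval for rho built from this result. It is written in Python rather than R, the function names are my own, and the values r = 0.6 and n = 28 are hypothetical; it also assumes the normal approximation above is adequate for the sample at hand.

```python
import math
from statistics import NormalDist


def fisher_z(r):
    """Fisher's transform g(r) = (1/2) log((1+r)/(1-r)), i.e. atanh(r)."""
    return 0.5 * math.log((1 + r) / (1 - r))


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for rho from a sample correlation r."""
    if n <= 3:
        raise ValueError("need n > 3 so that the variance 1/(n-3) is defined")
    z = fisher_z(r)
    se = math.sqrt(1.0 / (n - 3))  # standard deviation on the g scale
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # two-sided normal quantile
    lower, upper = z - crit * se, z + crit * se
    # tanh is the inverse of g, so it maps the interval back to the rho scale.
    return math.tanh(lower), math.tanh(upper)


# Hypothetical sample correlation and sample size, for illustration only.
low, high = correlation_ci(r=0.6, n=28)
print(f"approximate 95% CI for rho: ({low:.3f}, {high:.3f})")
```

Using 1/(n+3) instead of 1/(n-3) would make the interval slightly too narrow, which matters most when n is small.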

Speaking of assignments, note that the due date for Assignment #8 has been extended one week. There will still be a (small) Assignment #9 assigned next week, and both assignments will be due March 29 or 31.

March 15, 2005: I've put up some preliminary slides for tomorrow's lecture.

Also, grading of Project #1 is more than half done now (28 out of 46 marked). As the following grading progress graph indicates, I'm projected to complete grading in time to hand them all back on Friday.

[Figure: Grading Progress Graph]

March 11, 2005: Proposal for Final Project: I'll be assigning the Proposal for the Final Project today. This (one page) proposal will be due Wednesday, March 23 in class. The final project itself will be due Monday, April 18 at 5:00pm.

March 9, 2005: Sorry about the delay. I've added slides 65-70 for today's class.

March 7, 2005: Office hours are cancelled for Tuesday and Thursday this week. Sorry about the short notice.

February 28, 2005: Final Project Groups: For the final project, you have the option of working individually or in groups, and I'd like to get the group assignments settled by next week. You should spend the next week organizing your group, and each group should elect a "group leader" to act as a central contact person. The group leader should print off a copy of the group sign-up sheet and hand it in with his or her assignment #6 next week. I only need one sheet per group. If you want to work on your own, you should also hand in a copy of the sheet listing yourself as group leader.

February 27, 2005: A preliminary version of slides 46-58 covering up to the end of Section 5.5: Robust Summaries is now available.

February 25, 2005: Just a reminder that Jafar has posted a note about one-sided tests on the lab page. Other than that, have a good weekend!

February 23, 2005: I've posted an updated version of the slides used today, since I added a couple of slides at the last minute.

Note that office hours are cancelled tomorrow. My normal office hours will resume next week.

February 21, 2005: Reminder: Midterm Project #1 is due Wednesday in class. I'll be holding some extended office hours tomorrow (Tuesday) from 2:00-4:00 in the usual room LSK 371 to answer any last-minute project questions.

In other news, the bookstore has a few (three) copies of our textbook in stock.

February 14, 2005: A page of Rough Marking Guidelines is now available on the projects page.

February 11, 2005: Next week is Midterm Break, so there'll be no classes and no regularly scheduled office hours. However, I'll be reading my e-mail regularly over the break if you have any project-related questions.

I'll be posting a very rough project marking scheme on the projects page later this evening or this weekend so you know what sort of things I'm looking for.

Have a great break!

February 10, 2005: We'll continue our coverage of t-tests and Wilcoxon tests in section 5.4 tomorrow. I've put up a preliminary version of the first few slides.

February 7, 2005: I handed out the first project today. It's worth 15% of the final grade, and it's due Wednesday, February 23 in class. Both the assignment and the dataset are available on the projects page.

We covered slides 18-22 today. On Wednesday, we'll finish up section 5.3 and start into section 5.4, covering some of the classical and not-so-classical statistical tests that R has available.

Because next week is Midterm Break and the project is due the following week, there will be no lab assignment handed out this week. Instead, you can use the lab time to work on the project and get R help from Jafar. Note that the due date for assignment #4 has not changed. It's still due in this week's lab.

February 3, 2005: I've added slides 18-28 covering section 5.3. We'll be trying to cover slides 11-28 tomorrow, but it doesn't seem likely given my track record. Anyway, we'll see how far we get.

February 2, 2005: Next class, we should be able to finish up to the end of section 5.3. Both sections 5.2 and 5.3 are short, and we've already covered most of the material there. For now, slides up to slide 17 (covering up to the end of section 5.2) are available.

February 1, 2005: Update: Slides 5-13 for tomorrow's class are now available.

Due to a bizarre miscommunication, the person who was going to order our textbook was waiting on me to give a final number of copies before placing the order. Anyway, five copies have now been ordered from the highly reputable Login Bros outfit in Toronto, and they are expected to arrive February 11.

We didn't even come close to finishing up sections 5.1 and 5.2 on Monday, obviously, but I'll pick up with that tomorrow. Since many of the slides are already done, I'll be able to get most of them up early this evening.

January 28, 2005: On Monday, we'll finish up our coverage of sections 5.1 and 5.2.

I have been given the disturbing (though perhaps not surprising) news that the bookstore has been telling people they have no record of our textbook order. Since I talked to a Real Person about placing an order and have that Real Person's name and number, I'll follow up on Monday and find out what's happening. In the meantime, a copy is available in the Main Library reserve collection.

Have a great weekend!

January 26, 2005: We got through slides 67-73 today. I'll quickly zip through slides 74-81 (and probably skip a few) in the first half of Friday's class so we can finish up our eleven-lecture "Introduction" to R and get started on Chapter 5: Univariate Statistics.

January 24, 2005: I've put up the final version of slides 54-66 that we covered in today's class.

January 22, 2005: As you can plainly see, I've reorganized the webpage in an effort to make it easier to quickly get the most useful information. In particular, the (first few) slides for Monday's lecture are available at the link on the right.

January 21, 2005: I know we rushed through slides 39-53 today, but I'm anxious to finish up our R intro by Monday or Wednesday so we can move on to the core material for this course. I'm hoping to finish up R graphics and some basic R programming by the middle of Wednesday's class and start on Chapter 5: Univariate Statistics.

January 19, 2005: The textbook is now available on reserve at the Main Library. On Friday, we'll begin our introduction to R graphics, starting with slides 39-47.

January 17, 2005: Textbook Update: The textbook should be available on reserve at the Main Library by tomorrow. Also, the bookstore has agreed to order 15 copies of the textbook which should arrive in one or two weeks.

I only managed to finish up the explanation of the diagnostic plots for lm(...) models today. Do you think perhaps I like hearing myself talk just a bit too much?

On Wednesday, I'll talk about the regression approach to one-factor ANOVA on slides 32-38 for the first half of class, and then I'll spend the rest of Wednesday's class and all of Friday's class talking about some R programming and graphics (slides to come).

As I mentioned in class, I've (almost) settled on the mark allocation and due dates for the projects and assignments:

Component             Marks   Due Date
Lab Assignments          20   weekly
Midterm Project #1       15   approx Feb 23
Midterm Project #2       20   approx Mar 23
Final Project            45   sometime Apr 12-26

The mark values and due dates are still subject to some fine-tuning, however.

January 14, 2005: Well, we only covered slides 27-31 today (relevant to lab assignment question 2). On Monday, I'll make a few more comments about the diagnostic plots on slide 31 and move on to the regression approach to one-factor ANOVA on slides 32-38 (relevant to question 1). Our class coverage of the material is lagging a little behind the lab assignment, but since this is really supposed to be review, I don't feel too bad making you go back and consult your STAT 306 notes if you have to.

Try to have a good weekend anyway!

January 12, 2005: See slides 21-26, which we covered today, and the preliminary version of slides 27-31, which we just barely got started on but will cover on Friday.

January 10, 2005: I've put a link to the slides I used in class today on the Slides and Handouts page. This time, I posted a preliminary version on the webpage this morning around 10am; I'm still working toward the goal of getting them up the night before so they're available first thing the morning before class.

Note that labs and office hours, as posted above, start this week. If you haven't already registered for one of the two lab sections, do so as soon as possible!

Important: For this course, there are two lab sections to choose from:

These lab sections are in the system, and you should register in one of them as soon as possible. Enrollment within each lab is limited, and registration is on a first-come, first-served basis.

January 7, 2005: Thanks for your attention today. I've put a link to the slides I used in class today on the Slides and Handouts page. In the future, I'll make an effort to get copies up on the webpage the morning before the class.

For those who haven't used R before, I would strongly suggest downloading and installing it on your home computer over the weekend and working through the "Introduction to R" document available on the Resources page. Even if you're familiar with R, it might be worth skimming through that document and trying out anything that doesn't look familiar.

Have a good weekend, and see you Monday!

January 5, 2005: Welcome back! I hope you had a good holiday break. See the Slides and Handouts page for the handouts I handed out and the slides I reviewed in class today.
