SDS 323: Statistical Learning and Inference

This course is an introduction to statistical inference, broadly construed as the process of drawing conclusions from data and quantifying the uncertainty in those conclusions. The goal is to introduce the basic ideas of statistical learning and predictive modeling from statistical, theoretical, and computational perspectives, together with applications to real data. Topics cover the major schools of thought that influence modern scientific practice, including classical frequentist methods, machine learning, and Bayesian inference. The course aims to provide an applied overview of classical linear approaches such as Linear Regression, Logistic Regression, and Linear Discriminant Analysis, as well as non-linear methods such as K-Means Clustering, K-Nearest Neighbors, Generalized Additive Models, Decision Trees, Boosting, Bagging, and Support Vector Machines.
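
For a first taste of the applied R sessions, here is a minimal sketch of two of the methods named above, assuming only base R and the built-in mtcars dataset (the variables and model choices are illustrative, not taken from the course materials):

```r
# Minimal sketch, assuming base R and the built-in mtcars dataset;
# the models below are illustrative examples, not course assignments.

# Linear regression: model fuel efficiency (mpg) as a function of car weight.
fit_lm <- lm(mpg ~ wt, data = mtcars)
summary(fit_lm)                            # coefficients, standard errors, R-squared

# Logistic regression: classify transmission type (am: 0 = automatic, 1 = manual).
fit_glm <- glm(am ~ wt + hp, data = mtcars, family = binomial)
head(predict(fit_glm, type = "response"))  # predicted probabilities of a manual transmission
```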

Tentative schedule and weekly learning goals

The following schedule is tentative and will be updated throughout the course.

| Topic | Assignment | Due | Readings (ISLA) |
| --- | --- | --- | --- |
| Introduction | HW0 | | 1 |
| Statistical Learning overview | | | 2 |
| R Session: Introduction to R | | | |
| Introduction to the Linear Model | | | 3.1 |
| Multiple Linear Regression and potential problems | HW1 | HW0 | 3.2, 3.3 |
| R Session: Linear Regression | | | |
| Classification | | | 4.1, 4.2, 4.3 |
| Classification | | | 4.4, 4.5 |
| R Session: Classification | HW2 | HW1 | |
| Resampling methods | | | 5.1, 5.2 |
| R Session: Resampling methods | | | 5.2 |
| Linear model selection | | | 6.1 |
| Linear model regularization | | HW2 | 6.2, 6.3 |
| Midterm Exam 1 | | | |
| R Session: Model selection | HW3 | | |
| Moving beyond linearity | | | 7.1, 7.2, 7.3, 7.4 |
| Moving beyond linearity | | | 7.5, 7.6, 7.7 |
| R Session: Moving beyond linearity | | | |
| Tree-based methods | HW4 | HW3 | 8.1 |
| Tree-based methods | | | 8.2 |
| R Session: Tree-based methods | | | |
| Support Vector Machines | | | 9.1, 9.2, 9.3 |
| R Session: Support Vector Machines | HW5 | HW4 | |
| Midterm Exam 2 | | | |
| Unsupervised Learning | | | 10 |
| Unsupervised Learning | | | 10 |
| Thanksgiving | | HW5 | |
| R Session: Unsupervised Learning | | | |
| Special topic: Intro to Neural Networks | | | |