This information is indicative and subject to change.
Topics in machine learning A
Teacher: Fabien NAVARRO
E-mail: [email protected]
ECTS: 2.5
Evaluation: final exam and/or project
Provisional place and time: MSE, six sessions of 3 hours each, from January to March.
Prerequisites: familiarity with linear algebra; a working knowledge of R or Python programming; familiarity with multiple linear regression.
Aim of the course: Upon completing this course, students should be able to: select appropriate methods for a given problem; implement these statistical methods; compare competing procedures on statistical grounds; assess the prediction performance of a learning algorithm; and apply these insights in class activities using statistical software.
Syllabus: Starting from classical notions of shrinkage and sparsity, this course will cover regularization methods that are crucial to high-dimensional statistical learning. The syllabus includes feature selection and model selection, as well as linear and nonlinear techniques for regression and classification. The course will focus on methodological and algorithmic aspects, while giving an idea of the underlying theoretical foundations. Practical sessions will give students the opportunity to apply the methods to real data sets using either R or Python. The course will alternate between lectures and practical lab sessions.
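To give a flavour of the practical sessions, here is a minimal sketch in Python with scikit-learn (one possible lab toolchain; the synthetic data set and the parameter grid are illustrative assumptions, not course material) of fitting ridge and lasso regressions and selecting the tuning parameter by cross-validation:

```python
# Minimal sketch (assumptions: scikit-learn available; synthetic data stands in
# for the real data sets used in the labs).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV, RidgeCV
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic high-dimensional data: 100 observations, 200 features, 10 informative.
X, y = make_regression(n_samples=100, n_features=200, n_informative=10,
                       noise=5.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Ridge and lasso with the regularization strength chosen by cross-validation.
alphas = np.logspace(-3, 3, 50)
ridge = RidgeCV(alphas=alphas).fit(X_train, y_train)
lasso = LassoCV(alphas=alphas, max_iter=10000).fit(X_train, y_train)

# Assess prediction performance on held-out data, and inspect sparsity.
print("ridge test MSE:", mean_squared_error(y_test, ridge.predict(X_test)))
print("lasso test MSE:", mean_squared_error(y_test, lasso.predict(X_test)))
print("nonzero lasso coefficients:", int(np.sum(lasso.coef_ != 0)))
```

The cross-validated choice of the regularization strength corresponds to item 2.3 of the outline below; in the labs the same workflow would be applied to real data sets.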
Main subjects covered:
1. Subset Selection
1.1. Best Subset Selection
1.2. Stepwise Selection
1.3. Choosing the Optimal Model
2. Shrinkage Methods
2.1. Ridge Regression
2.2. The Lasso
2.3. Selecting the Tuning Parameter
3. Basis Expansions and Regularization
3.1. Smoothing Splines
3.2. Choosing the Smoothing Parameter
4. Generalized Additive Models
4.1. GAMs for Regression Problems
4.2. GAMs for Classification Problems
5. SVM and Boosting
5.1. SVM
5.2. Boosting
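As a further illustration of how the practical sessions could compare procedures (items 5.1 and 5.2 of the outline above), here is a minimal sketch, again assuming scikit-learn and a synthetic classification data set, that cross-validates a support vector machine against gradient boosting:

```python
# Minimal sketch (assumption: scikit-learn; synthetic data stands in for the
# real data sets used in the labs).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, n_informative=5,
                           random_state=0)

# SVM with an RBF kernel (features standardized first) versus gradient boosting.
models = {
    "SVM (RBF kernel)": make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0)),
    "Gradient boosting": GradientBoostingClassifier(n_estimators=100, random_state=0),
}

# 5-fold cross-validated accuracy as a simple basis for comparison.
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f} (+/- {scores.std():.3f})")
```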
References:
· Hastie, T., Tibshirani, R., & Friedman, J. (2009). The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Springer. Free download.
· James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning (Vol. 6). New York: Springer. Free download.