This course extends the concepts introduced in the basic Probability and Statistics courses to a multivariate framework, with the aim of equipping students with the tools they need to analyse multidimensional data sets. By the end of the course, students should be able to apply the most widely used methods to real data. A key aim of the course is therefore to give students a clear understanding of the methods, of how to apply them, and of how to use the relevant statistical software.
Main themes
Part 1: Basic descriptive methods and basic notations.
In this part, students learn how matrix notation facilitates the treatment of multidimensional data and are introduced to the basic properties of random vectors. They also learn that the basic (uni- and bivariate) descriptive tools have both uses and limitations.
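As an illustration of how matrix notation simplifies the treatment of multidimensional data, the following is a minimal sketch in Python with NumPy. The data matrix `X` and its values are hypothetical and are not taken from the course materials; the sketch only shows that the sample mean vector and the sample covariance matrix can each be written as a single matrix expression.

```python
import numpy as np

# Hypothetical data matrix X: n = 5 individuals (rows), p = 3 variables (columns).
X = np.array([
    [1.0, 2.0, 0.5],
    [2.0, 1.0, 1.5],
    [3.0, 4.0, 2.5],
    [4.0, 3.0, 3.5],
    [5.0, 5.0, 4.5],
])
n, p = X.shape

# Sample mean vector in matrix form: x_bar = (1/n) X' 1_n
ones = np.ones(n)
x_bar = X.T @ ones / n

# Centred data matrix: subtract the mean vector from every row.
Xc = X - x_bar

# Sample covariance matrix in matrix form: S = Xc' Xc / (n - 1)
S = Xc.T @ Xc / (n - 1)

print("mean vector:", x_bar)
print("covariance matrix:\n", S)
```

The same two expressions apply unchanged for any number of variables, which is the practical benefit of the matrix formulation.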
Part 2: Techniques of multivariate data analysis.
In this part, students learn about basic dimension reduction techniques for continuous and qualitative variables (principal component analysis, correspondence analysis). Basic classification techniques are also presented. A wide range of examples is given to illustrate these methods and show when they should be used.
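By way of illustration, principal component analysis can be sketched as an eigendecomposition of the sample covariance matrix. This is a minimal sketch under stated assumptions: the data are simulated, the mixing matrix `A` is hypothetical, and the analysis is carried out on the covariance matrix (not the correlation matrix). It is not the course's own implementation or software.

```python
import numpy as np

# Hypothetical simulated data: 200 observations of 3 correlated variables.
rng = np.random.default_rng(0)
Z = rng.normal(size=(200, 3))
A = np.array([
    [2.0, 0.0, 0.0],
    [1.5, 0.5, 0.0],
    [0.5, 0.2, 0.1],
])
X = Z @ A.T

# Centre the data and compute the sample covariance matrix.
Xc = X - X.mean(axis=0)
S = np.cov(Xc, rowvar=False)

# Eigendecomposition of the symmetric matrix S (eigh returns ascending order).
eigvals, eigvecs = np.linalg.eigh(S)

# Reorder so that the first component has the largest variance.
order = np.argsort(eigvals)[::-1]
eigvals = eigvals[order]
eigvecs = eigvecs[:, order]

# Scores on the first two principal components (dimension reduction 3 -> 2).
scores = Xc @ eigvecs[:, :2]

# Proportion of total variance carried by each component.
explained = eigvals / eigvals.sum()
print("proportion of variance explained:", explained)
```

The proportions in `explained` are what one would typically inspect to decide how many components to retain.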
Part 3: Multivariate analysis models.
In this part, students see how to model relations between variables. Linear models (including analysis of variance and analysis of covariance) make it possible to use explanatory variables to explain the variation in a response variable. Models adapted to a categorical response variable are also introduced: log-linear models for contingency tables, the logit model and discriminant analysis models. Here too, a wide range of examples is given to illustrate these methods and show when they should be used.
Content and teaching methods
Course content:
Introduction to multivariate methods; matrix notation and basic properties of random vectors; basic descriptive tools; principal component analysis; simple and multiple correspondence analysis; classification; regression models, including ANOVA and ANCOVA; models for categorical variables; discriminant analysis.
Method:
The course comprises:
- lectures (the teacher introduces concepts through concrete applications and then generalises from them),
- computer-based practical sessions, in which software is used to analyse real data.
Other information (prerequisites, assessment methods, course materials, recommended readings, ...)
Course materials (for information only): Simar (2003), An Introduction to Multivariate Data Analysis, manuscript, 233 p., Institut de Statistique, Université catholique de Louvain, Louvain-la-Neuve.