lstat2110a  2019-2020  Louvain-la-Neuve

Note from June 29, 2020
Although we do not yet know how long the social distancing related to the Covid-19 pandemic will last, and regardless of the changes that had to be made to the evaluation of the June 2020 session relative to what is provided for in this learning unit description, new learning unit evaluation methods may still be adopted by the teachers; details of these methods have been, or will be, communicated to the students by the teachers as soon as possible.
3 credits
15.0 h + 7.5 h
Segers Johan
Main themes
Contents: reminders of algebra and geometry useful for multivariate data analysis; basic principles of factorial methods; principal component analysis (PCA); canonical correlation; factorial discriminant analysis (FDA); simple and multiple factorial correspondence analysis (FCA); cluster analysis; data analysis in practice.
  • Data matrices
  • Principal component analysis
  • Classification: k-means clustering and hierarchical clustering
  • Linear discriminant analysis
  • Simple and multiple correspondence analysis
  • Principal component regression
  • Partial least squares regression
Implementation of the methods is done in the R language using the RStudio integrated development environment, and the R Markdown framework is used to combine text, mathematical formulas, R code and R output (tables, graphs).
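As an indication of what such an implementation can look like, here is a minimal sketch of three of the listed methods in base R. The built-in iris data set, the choice of three clusters and the Ward linkage are assumptions made purely for illustration; they are not taken from the course material.

```r
# Illustrative sketch only: the built-in iris data stand in for a course data set.
X <- iris[, 1:4]   # numeric data matrix: 150 observations, 4 variables

# Principal component analysis on centred and standardised variables
pca <- prcomp(X, center = TRUE, scale. = TRUE)
prop_var <- pca$sdev^2 / sum(pca$sdev^2)   # proportion of variance per component
print(round(prop_var, 3))

# k-means clustering on the standardised data (3 clusters is an arbitrary choice)
set.seed(1)        # k-means starts from random centres
km <- kmeans(scale(X), centers = 3, nstart = 20)
print(table(cluster = km$cluster, species = iris$Species))

# Hierarchical clustering (Ward's method) on Euclidean distances
hc <- hclust(dist(scale(X)), method = "ward.D2")
hc_groups <- cutree(hc, k = 3)   # cut the tree into 3 groups
print(table(hc_groups))
```

In an R Markdown report, a chunk like this would sit between paragraphs of text, and the printed tables would appear in the rendered document.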
Teaching methods
During the lectures, the teacher presents the various statistical methods, covering the questions and datasets to which they apply, the underlying mathematical theory, and how to program them in R. Homework assignments are given, and their solutions are also discussed in the lectures.
The tutorials take place in computer rooms. Their primary objective is to let students practise applying the methods to real datasets in R.
Evaluation methods
Tests during the lectures:
  • Test 1: Data matrices and principal component analysis
  • Test 2: Clustering and linear discriminant analysis
Participation is optional. At the discretion of the student, each test can replace the part of the exam on the same topic.
Exam (12/20):
  • written, closed book, with the help of a formula list and a pocket calculator
  • exercises and questions involving (small) calculations, interpretation of computer output, and understanding of the main results and formulas
Project (8/20):
  • individually or in pairs
  • data application, with the data to be found by the students themselves
  • written report in R Markdown, to be submitted before the exam session
  • detailed instructions will be provided in the exercise sessions and on the MoodleUCL course page
Other information
  • vector and matrix calculus
  • Euclidean geometry: points, spaces, orthogonality, distances, angles
  • basic notions in statistics: sample mean, (co)variance, correlation, covariance matrix, conditional probabilities, normal distribution, chi-square distribution
Online resources
All teaching material is made available through the MoodleUCL course page: slides, exercises, software scripts. Links to relevant external material are also provided: online courses, videos, software documentation.
  • Escofier, B. and Pagès, J. (2016): Analyses factorielles simples et multiples, 5th edition, Dunod, Paris.
  • Lebart, L., Piron, M. and Morineau, A. (2006): Statistique exploratoire multidimensionnelle, 4th edition, Dunod, Paris.
  • Saporta, G. (2011): Probabilités, analyse des données et statistique, 3rd revised edition, Editions TECHNIP, Paris.
Faculty or entity

Programmes offering this learning unit (LU)

Title of the programme
Master [120] in Environmental Bioengineering

Master [120] in Agriculture and Bio-industries

Master [120] in Forests and Natural Areas Engineering