5.00 credits
30.0 h + 15.0 h
Q2
Teacher(s)
Nijssen Siegfried;
Language
English
> French-friendly
Main themes
An important task in data mining is the discovery of patterns in data. Patterns are recurring structures in data; they can provide interpretable explanations for observations, help to gain a better understanding of the structure of the data, be used to build better models, and be used to solve other computational tasks (such as the construction of database indexes or data compression). Patterns can be found in many different forms of data, including data from supermarkets, insurance companies, scientific experiments, social networks, software projects, and so on.
This course provides an in-depth introduction to pattern mining. After covering the basics of pattern mining, it discusses a number of advanced pattern mining techniques in detail.
Topics that will be discussed are:
- Categories of pattern mining tasks, including pattern and pattern set mining, supervised and unsupervised pattern mining, dataset types, and pattern scoring functions;
- Algorithms for solving different pattern mining tasks;
- Data structures for making pattern mining more efficient;
- The implementation of pattern mining algorithms;
- Mathematical foundations for the different categories of pattern mining tasks;
- Complexity classes relevant to pattern mining;
- Applications of pattern mining, with a special focus on the application of pattern mining techniques in software engineering.
Learning outcomes
At the end of this learning unit, the student is able to:
1. Given the learning outcomes of the "Master in Computer Science and Engineering" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:
Content
- Frequent itemset mining: algorithms, data structures;
- Constraint-based itemset mining: algorithms, data structures;
- Patterns in sequences, trees, graphs: algorithms, data structures, complexity classes;
- Pattern mining in supervised data: scoring functions, algorithms;
- Pattern set mining in supervised data: scoring functions, models (decision trees, boosting), algorithms;
- Pattern set mining in unsupervised data: scoring functions (minimum description length principle, maximum entropy), algorithms;
- Applications of pattern mining: software repositories, traces, log files, cheminformatics, bioinformatics, industrial applications.
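To give a flavour of the first topic in the list above, here is a minimal sketch of frequent itemset mining using a level-wise (Apriori-style) search. It is an illustration only and is not taken from the course material: the function name `frequent_itemsets`, the `min_support` parameter, and the toy supermarket baskets are all assumptions made for this example, and the algorithms and data structures studied in the course may differ.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset contained in at
    least `min_support` transactions (hypothetical helper, Apriori-style)."""
    transactions = [frozenset(t) for t in transactions]

    def support(itemset):
        # Number of transactions that contain every item of `itemset`.
        return sum(1 for t in transactions if itemset <= t)

    # Level 1: frequent single items.
    items = {item for t in transactions for item in t}
    level = {frozenset([i]): support(frozenset([i])) for i in items}
    level = {s: c for s, c in level.items() if c >= min_support}
    result = dict(level)

    k = 1
    while level:
        k += 1
        # Join step: combine frequent (k-1)-itemsets into candidate k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
        # Count support and keep the frequent candidates for the next level.
        level = {c: support(c) for c in candidates}
        level = {c: n for c, n in level.items() if n >= min_support}
        result.update(level)
    return result


# Toy data (assumed for illustration): four supermarket baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
]
patterns = frequent_itemsets(baskets, min_support=2)
for itemset, count in sorted(patterns.items(), key=lambda p: (len(p[0]), sorted(p[0]))):
    print(sorted(itemset), count)
```

The prune step relies on the fact that an itemset can only be frequent if all of its subsets are frequent, which is what keeps the number of candidates manageable. Rescanning all transactions for every candidate, as done here, is the simplest possible counting strategy and is chosen for readability rather than efficiency.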
Teaching methods
- Lectures
- Exercise sessions, during which students work on exercises that prepare for the exam and the projects
- 3 projects
Evaluation methods
The final grade is determined by 3 projects and an exam that is organized at the end of the semester.
The final written exam counts for 75% of the grade; participation and the grades obtained for the projects during the semester count for the remaining 25%. Every project counts equally.
Failure to comply with the methodological instructions communicated by the teacher, particularly with regard to the use of online resources or collaboration between students, will result in an overall mark of 0. The use of generative AI tools without prior permission is strictly prohibited.
Other information
During this course students will have to implement a number of projects in Python. This course is impossible to follow without prior knowledge of Python. Hence, students should have followed a prior course in Python, such as LEPL1401, LINFO1101 or LSINC1101.
Online resources
Bibliography
Charu C. Aggarwal, Jiawei Han (Eds.), Frequent Pattern Mining, Springer 2014 (ISBN: 978-3-319-07820-5)
Chapters from
Siegfried Nijssen, Albrecht Zimmermann and Luc De Raedt, Essentials of Pattern Mining.
Faculty or entity
INFO
Programmes offering this learning unit (LU)
Title of the programme
Acronym
Credits
Prerequisites
Learning outcomes
Master [120] in Data Science : Statistic
Master [120] in Computer Science and Engineering
Master [120] in Computer Science
Master [120] in Mathematical Engineering
Master [120] in Data Science Engineering
Master [120] in Data Science: Information Technology