This course treats a specific advanced topic or selection of topics of current research interest in the area of software engineering.
The actual topic(s) may vary from year to year and will be chosen from a variety of software engineering domains, such as data-intensive computing, software analytics, development and analysis of large evolving software systems, big data techniques, software repository mining, software recommendation systems, software visualization, novel programming technologies, software requirements and analysis, model-driven software engineering, software configuration management, software engineering processes, software engineering tools and methods, and software testing and quality aspects.
Given the learning outcomes of the "Master in Computer Science and Engineering" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:
- INFO1.1
- INFO3.1
- INFO6.3
Given the learning outcomes of the "Master [120] in Computer Science" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:
- SINF1.M3
- SINF3.1
- SINF6.3
The students shall acquire advanced theoretical knowledge and technical competences about the topics covered in the course.
The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.
The exercises count for 25% of the final grade; the remainder is determined by a written exam.
- Lectures
- Three exercises, one of which focuses on applications in software engineering
An important task in data mining is the discovery of patterns in data. Patterns are recurring structures in data: they can provide interpretable explanations for observations, help to gain a better understanding of the structure of the data, be used to build better models, and be used to solve other computational tasks (such as the construction of database indexes or data compression). Patterns can be found in many different forms of data, including data from supermarkets, insurance companies, scientific experiments, social networks, and software projects.
This course provides an in-depth introduction to pattern mining. After covering the basics of pattern mining, it discusses a number of advanced pattern mining techniques in detail.
Topics that will be discussed are:
- Categories of pattern mining tasks, including pattern and pattern set mining, supervised and unsupervised pattern mining, dataset types, and pattern scoring functions;
- Algorithms for solving different pattern mining tasks;
- Data structures for making pattern mining more efficient;
- The implementation of pattern mining algorithms;
- Mathematical foundations for the different categories of pattern mining tasks;
- Complexity classes relevant to pattern mining;
- Applications of pattern mining, with a special focus on the application of pattern mining techniques in software engineering.
- Frequent itemset mining: algorithms, data structures;
- Constraint-based itemset mining: algorithms, data structures;
- Patterns in sequences, trees, graphs: algorithms, data structures, complexity classes;
- Pattern mining in supervised data: scoring functions, algorithms;
- Pattern set mining in supervised data: scoring functions, models (decision trees, boosting), algorithms;
- Pattern set mining in unsupervised data: scoring functions (minimum description length principle, maximum entropy), algorithms;
- Applications of pattern mining: software repositories, traces, log files, cheminformatics, bioinformatics, industrial applications.
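To give a flavour of the first topic in the list above, frequent itemset mining, the following is a minimal level-wise (Apriori-style) sketch in Python. It is an illustration under stated assumptions, not course material: the function name `frequent_itemsets`, the parameter `min_support` (an absolute transaction count) and the toy supermarket data are all hypothetical and chosen only for this example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset contained in at least
    `min_support` transactions (hypothetical helper, Apriori-style sketch)."""
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count how many transactions contain each single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(current)

    k = 2
    while current:
        # Candidate generation: join frequent (k-1)-itemsets into k-itemsets
        # and keep only those whose (k-1)-subsets are all frequent
        # (support is anti-monotone, so other candidates cannot be frequent).
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Support counting: one pass over the data per candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        current = {s: c for s, c in counts.items() if c >= min_support}
        result.update(current)
        k += 1
    return result


if __name__ == "__main__":
    # Toy supermarket baskets (invented for this example).
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    for itemset, support in frequent_itemsets(baskets, min_support=2).items():
        print(sorted(itemset), support)
```

The sketch favours readability over speed: it rescans all transactions for every candidate, which is exactly the kind of cost that specialised data structures for pattern mining aim to reduce.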
- Charu C. Aggarwal, Jiawei Han (Eds.), Frequent Pattern Mining, Springer 2014 (ISBN: 978-3-319-07820-5)
- Chapters from Siegfried Nijssen, Albrecht Zimmermann and Luc De Raedt, Essentials of Pattern Mining.
All relevant course material and slides as well as practical information related to the course will be accessible on Moodle, which will also be the primary means of communication between the teacher(s) and the students.
Background:
- A good knowledge of programming and of basic software engineering concepts.
- Prior experience with the development of a medium- to large-scale software system.