Note from June 29, 2020
Although we do not yet know how long the social distancing related to the Covid-19 pandemic will last, and regardless of the changes that had to be made to the evaluation of the June 2020 session compared with what is provided for in this learning unit description, new learning unit evaluation methods may still be adopted by the teachers; details of these methods have been, or will be, communicated to the students by the teachers as soon as possible.
5 credits
30.0 h + 15.0 h
Q1
Teacher(s)
Dupont Pierre; Fairon Cédrick;
Language
English
Prerequisites
LINFO1121 Algorithmics and data structures https://uclouvain.be/en-cours-linfo1121.html
Main themes
- Basics in phonology, morphology, syntax and semantics
- Linguistic resources
- Part-of-speech tagging
- Statistical language modeling (N-grams and Hidden Markov Models)
- Robust parsing techniques, probabilistic context-free grammars
- Linguistic engineering applications such as spelling or syntax checking software, POS tagging, document indexing and retrieval, text categorization
Aims
At the end of this learning unit, the student is able to:
Given the learning outcomes of the "Master in Computer Science and Engineering" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes:
The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.
Content
- Linguistic essentials: morphology, part-of-speech, phrase structure, semantics and pragmatics
- Corpus analysis: formatting, tokenization, morphology, data tagging
- Probabilistic language models: N-grams, HMMs
- Part-of-Speech Tagging
- Probabilistic Context-Free Grammars: parameter estimation and parsing algorithms, tree banks
- Introduction to Machine Translation
- Lexical semantics
- Information extraction
- Typical Applications: POS taggers, information extraction tools, probabilistic parsers
Teaching methods
- Lectures
- Mini-projects (2 to 3 weeks) implemented, by default, in Python and by groups of 2 students
- Feedback sessions about the projects
Evaluation methods
The mini-projects count for 25 % of the final grade and the final exam (closed-book) for 75 %.
The mini-projects can NOT be redone in the second session.
The mini-project grade (25 %) is fixed at the end of the semester and carried over unchanged into the overall grade for the second session.
The final exam is, by default, a written exam (on paper or, when appropriate, on a UCLouvain computer).
Online resources
Bibliography
One recommended textbook:
Teaching materials
- Required teaching materials include all documents (lecture slides, project assignments, supplements, ...) available on the Moodle website for this course.
Faculty or entity
INFO
Programmes/courses offering this Teaching Unit
Title of the programme
Acronym
Credits
Prerequisites
Aims
Master [120] in Data Science Engineering
Master [120] in Computer Science and Engineering
Master [120] in Linguistics
Master [120] in Computer Science
Master [120] in Data Science : Statistic
Master [120] in Data Science: Information Technology