5.00 credits
30.0 h + 15.0 h
Q1
Teacher(s)
Dupont Pierre; Tack Anaïs (compensates Dupont Pierre)
Language
English
> French-friendly
Main themes
- Various levels of linguistic analysis
- Corpus processing
- Part-of-speech tagging
- Probabilistic language modeling (N-grams and Hidden Markov Models)
- Formal grammars and parsing algorithms
- Machine translation, deep learning
- Linguistic engineering applications such as automatic completion software, POS tagging, parsing, or machine translation
Learning outcomes
At the end of this learning unit, the student is able to:
Given the learning outcomes of the "Master in Computer Science and Engineering" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes: INFO1.1-3, INFO2.3-4, INFO5.3-5, INFO6.1, INFO6.4
Given the learning outcomes of the "Master [120] in Computer Science" program, this course contributes to the development, acquisition and evaluation of the following learning outcomes: SINF1.M4, SINF2.3-4, SINF5.3-5, SINF6.1, SINF6.4
Students who successfully complete this course should be able to
Content
- Various levels of linguistic analysis
- (Automated) corpus processing: formatting, tokenization, data tagging
- Probabilistic language models: N-grams, HMMs
- Part-of-Speech Tagging
- (Probabilistic) Context-Free Grammars: parameter estimation and parsing algorithms
- Introduction to Machine Translation
- Introduction to Deep Learning
- Typical linguistic applications such as automated completion, POS taggers, parsing or machine translation.
Teaching methods
- Lectures
- Practical projects implemented in Python on the INGInious platform.
Evaluation methods
The projects are worth 30 % of the final grade and the final exam (closed-book) is worth 70 %.
The projects cannot be redone in the second session.
The project grades are fixed at the end of the semester and included as such in the global score for the second session.
The final exam is, by default, a written exam (on paper or, when appropriate, on a computer).
Online resources
Bibliography
One recommended textbook:
- Speech and Language Processing, D. Jurafsky and J.H. Martin, Prentice Hall.
Teaching materials
- Required teaching materials include all documents (lecture slides, project assignments, complements, ...) available from the Moodle website for this course.
Faculty or entity
INFO
Programmes offering this learning unit (LU)
Title of the programme
Acronym
Credits
Prerequisites
Learning outcomes
Master [120] in Data Science: Statistic
Master [120] in Linguistics
Master [120] in Computer Science and Engineering
Master [120] in Computer Science
Master [120] in Data Science Engineering
Master [120] in Data Science: Information Technology