Computational Linguistics [ LINGI2263 ]
5.0 ECTS credits
30.0 h + 15.0 h
2q
Teacher(s) |
Dupont Pierre ;
Fairon Cédrick ;
|
Language |
English
|
Place of the course |
Louvain-la-Neuve
|
Prerequisites |
- Algorithmics and, preferably, basic knowledge of machine learning (as provided by SINF1121 and ING2262)
|
Main themes |
- Basics in phonology, morphology, syntax and semantics
- Linguistic resources
- Part-of-speech tagging
- Statistical language modeling (N-grams and Hidden Markov Models)
- Robust parsing techniques, probabilistic context-free grammars
- Linguistic engineering applications such as spelling or syntax checkers, POS tagging, document indexing and retrieval, text categorization
|
Aims |
Students who successfully complete this course should be able to:
- describe the fundamental concepts of natural language modeling
- master the methodology of using linguistic resources (corpora, dictionaries, semantic networks, etc.) and make a reasoned choice among them
- apply statistical language modeling techniques appropriately
- develop linguistic engineering applications
Students will also have developed skills and an operational methodology. In particular, they will have developed their ability to:
- integrate a multidisciplinary approach at the boundary between computer science and linguistics, using the terminology and tools of each discipline judiciously
- manage the time available to complete mini-projects
- manipulate and exploit large amounts of data
|
Evaluation methods |
25% for practical work + 75% final exam (closed book)
Practical work cannot be resubmitted in the second session.
|
Teaching methods |
- 12 lectures
- 3 mini-projects
- feedback sessions on the mini-projects
|
Content |
- Linguistic essentials: morphology, part-of-speech, phrase structure, semantics and pragmatics
- Mathematical foundations: formal languages and elements of information theory
- Corpus analysis: formatting, tokenization, morphology, data tagging
- N-grams: maximum likelihood estimation and smoothing
- Hidden Markov Models: definitions, Baum-Welch and Viterbi algorithms
- Part-of-Speech Tagging
- Probabilistic Context-Free Grammars: parameter estimation and parsing algorithms, treebanks
- Machine Translation: classical and statistical methods (IBM models, phrase-based models), evaluation
- Applications: SMS predictors, POS taggers, information extraction
|
Bibliography |
Required slides available at:
http://www.icampus.ucl.ac.be/claroline/course/index.php?cid=INGI2263
1 textbook recommended:
|
Cycle and year of study |
> Master [120] in Linguistics
> Master [120] in Computer Science and Engineering
> Master [120] in Computer Science
> Master [120] in Statistics: General
|
Faculty or entity in charge |
> INFO
|