


Computational Linguistics [ LINGI2263 ]


5.0 ECTS credits  30.0 h + 15.0 h   2q 

Teacher(s) Dupont Pierre; Fairon Cédrick
Language English
Place of the course Louvain-la-Neuve
Prerequisites
  • Algorithmics and, preferably, basic knowledge of machine learning (as provided by SINF1121 and ING2262)
Main themes
  • Basics in phonology, morphology, syntax and semantics
  • Linguistic resources
  • Part-of-speech tagging
  • Statistical language modeling (N-grams and Hidden Markov Models)
  • Robust parsing techniques, probabilistic context-free grammars
  • Linguistic engineering applications such as spelling or syntax checkers, POS tagging, document indexing and retrieval, text categorization
Aims

Students who successfully complete this course should be able to

  • describe the fundamental concepts of natural language modeling
  • master the methodology of using linguistic resources (corpora, dictionaries, semantic networks, etc.) and justify a choice among various linguistic resources
  • apply statistical language modeling techniques appropriately
  • develop linguistic engineering applications

Students will also have developed skills and an operational methodology. In particular, they will have developed their ability to

  • adopt a multidisciplinary approach at the boundary between computer science and linguistics, making judicious use of the terminology and tools of either discipline,
  • manage the time available to complete mini-projects,
  • manipulate and exploit large amounts of data.
Evaluation methods

25% practical work + 75% final exam (closed book)

Practical work cannot be resubmitted in the second session.

Teaching methods
  • 12 lectures
  • 3 mini-projects
  • feedback sessions about the mini-projects
Content
  • Linguistic essentials: morphology, part-of-speech, phrase structure, semantics and pragmatics
  • Mathematical foundations: formal languages and elements of information theory
  • Corpus analysis: formatting, tokenization, morphology, data tagging
  • N-grams: maximum likelihood estimation and smoothing
  • Hidden Markov Models: definitions, Baum-Welch and Viterbi algorithms
  • Part-of-Speech Tagging
  • Probabilistic Context-Free Grammars: parameter estimation and parsing algorithms, treebanks
  • Machine Translation: classical and statistical methods (IBM models, phrase-based models), evaluation
  • Applications: SMS predictors, POS taggers, information extraction
Bibliography

Required slides available at:
http://www.icampus.ucl.ac.be/claroline/course/index.php?cid=INGI2263

1 textbook recommended:

 

Cycle and year of study
> Master [120] in Linguistics
> Master [120] in Computer Science and Engineering
> Master [120] in Computer Science
> Master [120] in Statistics: General
Faculty or entity in charge
> INFO

