The course introduces the main tenets of corpus linguistics and the methods and techniques used to work with large collections of spoken or written electronic data.
It covers the following topics:
- corpus design: data collection, archiving and markup.
- corpus typology: spoken and written corpora; monolingual vs multilingual; native vs learner; diachronic vs synchronic.
- major electronic corpora: British National Corpus, International Corpus of English, International Corpus of Learner English, MICASE, Louvain International Database of Spoken English Interlanguage, etc.
- corpus annotation (POS-tagging, lemmatization, parsing, semantic tagging, prosodic annotation, error tagging).
- automated analysis of lexis, grammar and discourse.
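As a rough illustration of what automated analysis of lexis can involve, the sketch below builds a word frequency list and a keyword-in-context (KWIC) concordance in Python. The function names, the simple regular-expression tokenizer and the sample sentence are assumptions made for this example only; they are not taken from the course materials or from any particular corpus tool.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes only)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens, n=5):
    """Return the n most frequent tokens with their raw counts."""
    return Counter(tokens).most_common(n)


def kwic(tokens, node, window=3):
    """Return one concordance line per occurrence of `node`,
    with up to `window` tokens of context on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical mini-corpus, invented for illustration.
sample = ("The corpus was compiled in two stages. "
          "Each corpus text was then tagged, and the tagged corpus was searched.")
tokens = tokenize(sample)

print(frequency_list(tokens, 2))
for line in kwic(tokens, "corpus", window=2):
    print(line)
```

Dedicated concordancers and annotated corpora go well beyond this (lemmatization, POS-based queries, statistical measures), but the underlying idea of counting forms and inspecting them in context is the same.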
Special attention is paid to the links between corpus linguistics and foreign language learning, contrastive and translation studies and natural language processing.
By the end of the course, students are expected to have a solid theoretical background in corpus linguistics and to have mastered the main techniques and tools used to analyse spoken and written computerized data. They will be able to read the scientific literature and conduct their own research in the field.
The contribution of this Teaching Unit to the development and command of the skills and learning outcomes of the programme(s) can be accessed at the end of this sheet, in the section entitled “Programmes/courses offering this Teaching Unit”.
During the term: one or several written assignments involving the analysis of corpus data, counting for 20% of the final grade. Students who have not submitted their assignments on time will not be allowed to register for the exam.
Students who enrol for a second exam session and have not obtained at least 10/20 for the assignment(s) will have to redo them.
In January or September: written exam counting for 80% of the final grade.
A WORD OF CAUTION: enrolment for the written exam is conditional on the written assignments having been handed in on time.
The course relies partly on required readings, which students are expected to complete before class and which will lead to discussions (in class or online) in which students should be ready to participate fully. Several hands-on sessions will be organized to familiarize students with text-handling software tools.
The course provides a theoretical and practical introduction to corpus linguistics. It presents the main concepts related to corpus linguistics, as well as some of its possible applications in different fields.
- Kennedy, G. (1998) An Introduction to Corpus Linguistics. Harlow: Longman.
- McEnery, T., Xiao, R. & Tono, Y. (2006) Corpus-based Language Studies: An Advanced Resource Book. Routledge.
- A selection of scientific articles