Advance notice: ‘Corpus Linguistics with R’ and ‘Statistics for linguistics with R’ bootcamps by S.T. Gries

Louvain-la-Neuve, Belgium, August 2019

The Linguistics Research Unit of the Institute of Language and Communication (Université catholique de Louvain, Belgium) will be hosting two 30-hour bootcamps by Stefan Gries next summer.

The ‘Corpus Linguistics with R’ bootcamp (12-16 Aug 2019) is a hands-on introduction to using the programming language R for the analysis of textual data (mostly corpora, but theoretically also literary works, web data, etc.). It is based on the second edition (2016) of Gries’s textbook Quantitative corpus linguistics with R and introduces a variety of programming constructs required for text processing and corpus exploration, including:

  • building word frequency lists and computing type-token ratios;
  • computing dispersion and keyword statistics;
  • extracting concordance lines.

For that, we will discuss different relevant functions and data structures, control flow structures such as loops and conditionals, and a sizable number of regular expressions; in addition, and time permitting, we will also cover the basics of data visualization. The kinds of data dealt with in this course come from a variety of differently formatted/annotated corpora and will also include 1-2 examples of literary works and/or XML processing.
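
As a rough illustration of three of the tasks listed above (frequency lists, type-token ratios and concordance lines), here is a minimal sketch. It is not taken from the course materials and is written in Python rather than R; the function names, the crude tokenization rule and the sample sentence are all assumptions made for the example. Dispersion and keyword statistics are not covered.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract word tokens (a deliberately crude rule)."""
    return re.findall(r"[a-z']+", text.lower())


def frequency_list(tokens):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common()


def type_token_ratio(tokens):
    """Number of distinct word types divided by the number of tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def concordance(tokens, node, window=3):
    """Return keyword-in-context lines with `window` tokens on each side."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{tok}] {right}".strip())
    return lines


# Hypothetical sample text, invented for this example.
sample = "The cat sat on the mat. The dog sat on the log."
tokens = tokenize(sample)

print(frequency_list(tokens)[:3])
print(round(type_token_ratio(tokens), 3))
for line in concordance(tokens, "sat", window=2):
    print(line)
```

A real corpus would need a more careful tokenizer (hyphens, numbers, annotation markup), which is where the regular expressions mentioned above come in.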

The ‘Statistics for linguistics with R’ bootcamp (19-23 Aug 2019) is a hands-on introduction to statistical methods for both graduate students and seasoned researchers and is based on the second edition (2013) of Gries’s textbook Statistics for linguistics with R. The course is intended for linguists who already have a basic knowledge of statistics and some experience using R, and who wish to improve their proficiency in the statistical analysis of linguistic data. Using the open source software and programming language R, we will:

  • briefly recap basic aspects of statistical evaluation as well as several descriptive statistics;
  • briefly discuss a selection of monofactorial statistical tests for frequencies, means, correlations and how they constitute special (limiting) cases of regression methods;
  • explore different kinds of multifactorial and multivariate methods, in particular different kinds of regression approaches (fixed-effects only and mixed-effects modelling) as well as classification trees and random forests.

Details about the previous edition of the ‘Statistics for linguistics with R’ bootcamp in LLN are available at: For info about the prerequisites, visit

The website of the two events will be online in early 2019 and online registration will start on 1 March 2019. It will be possible to register for one event only but priority will be given to people who register for both. The number of participants is limited. If you would like to participate, mark the date in your diary!

Contact email:

Magali Paquot

6th International Conference on Statistical Language and Speech Processing

October 15-16, 2018, Mons, Belgium

Co-organized by:

NUMEDIART Institute, University of Mons
LANGUAGE Institute, University of Mons
Institute for Research Development, Training and Advice (IRDTA), Brussels/London



Monday, October 15

09:00 – 09:30 Registration

09:30 – 09:40 Opening

09:40 – 10:30 Thomas Hain. Crossing Domains in Automatic Speech Recognition – Invited lecture

10:30 – 11:00 Break

11:00 – 12:15

Amal Houidhek, Vincent Colotte, Zied Mnasri and Denis Jouvet. DNN-based Speech Synthesis for Arabic: Modelling and Evaluation

Antoine Perquin, Gwénolé Lecorvé, Damien Lolive and Laurent Amsaleg. Phone-level Embeddings for Unit Selection Speech Synthesis

Raheel Qader, Gwénolé Lecorvé, Damien Lolive and Pascale Sébillot. Disfluency Insertion for Spontaneous TTS: Formalization and Proof of Concept

12:15 – 13:45 Lunch

13:45 – 14:35 Simon King. Does ‘End-to-End’ Speech Synthesis Make any Sense? – Invited lecture

14:35 – 14:50 Break

14:50 – 16:05

George Christodoulides. Forced Alignment of the Phonologie du Français Contemporain Corpus

Ruei Hung Alex Lee and Jyh-Shing Roger Jang. A Syllable Structure Approach to Spoken Language Recognition

Gueorgui Pironkov, Sean Wood, Stéphane Dupont and Thierry Dutoit. Investigating a Hybrid Learning Approach for Robust Automatic Speech Recognition

16:05 – 16:20 Break

16:20 – 17:30 Poster session I

17:30 – 19:30 Touristic visit

Tuesday, October 16

09:00 – 09:50 Isabel Trancoso. Analysing Speech for Clinical Applications – Invited lecture

09:50 – 10:20 Break

10:20 – 11:35

Jan Vanek, Josef Michalek, Jan Zelinka and Josef Psutka. A Comparison of Adaptation Techniques and Recurrent Neural Network Architectures

Andris Varavs and Askars Salimbajevs. Restoring Punctuation and Capitalization Using Transformer Models

David Awad, Caroline Sabty, Mohamed Elmahdy and Slim Abdennadher. Arabic Name Entity Recognition Using Deep Learning

11:35 – 11:50 Break and Group photo

11:50 – 13:05

Pratik Doshi and Wlodek Zadrozny. Movie Genre Detection Using Topological Data Analysis and Simple Discourse Features

Daniel Grießhaber, Thang Vu and Johannes Maucher. Low-resource Text Classification Using Domain-adversarial Learning

Manny Rayner, Johanna Gerlach, Pierrette Bouillon, Nikolaos Tsourakis and Hervé Spechbach. Handling Ellipsis in a Spoken Medical Phraselator

13:05 – 14:35 Lunch

14:35 – 15:50

Laura García-Sardiña, Manex Serras and Arantza Del Pozo. Knowledge Transfer for Active Learning in Textual Anonymisation

Fernando Gomes and Juan Manuel Adán-Coello. Studying the Effects of Text Preprocessing and Ensemble Methods on Sentiment Analysis of Brazilian Portuguese Tweets

Daniel Lichtblau and Catalin Stoean. Text Documents Encoding through Images for Authorship Attribution

15:50 – 16:05 Break

16:05 – 17:05 Poster session II

17:05 – 17:15 Closing

Discourse Perspectives on Technical Communication

03 Jun 2019 – 05 Jun 2019
Leuven, Belgium

The overarching aim of this panel session of DICOEN 2019 is to advance interdisciplinary research in the field of technical communication. More specifically, this session aims to bring together researchers, practitioners and professionals with an interest in discourse aspects of technical communication, addressing the role of language-in-use and the way in which language is embedded in technical communication settings. Unlike other institutional contexts such as politics, the media, the workplace, healthcare etc. (for other “real-world contexts”, see e.g. Tannen et al. 2018), the study of technical communication discourse has so far received little attention. This is somewhat surprising in view of our highly technologized society and the increasing importance of communicating effectively about technology in order to bridge the gap between users and (the functionalities of) technical products.
Discourse analysis encompasses a broad range of theories, topics and approaches for explaining language-in-use. In line with Bloor and Bloor (2015), we understand discourse as “symbolic human interaction in its many forms”, whether through spoken or written language or via non-linguistic resources such as image, symbol, sound, and gesture. We welcome contributions that address various discourse aspects in technical communication settings. Contributions may focus on a range of linguistic (grammatical, semantic, pragmatic, stylistic, rhetorical, conversational, narrative, intercultural, critical, cognitive discourse) and non-linguistic phenomena that may be used to examine the relationship between form and function in any technical communication genre across the product life cycle (e.g. instructions for use, technical procedures, warning notices, FAQs, training documents, …). For example, contributions may focus on how language is used to communicate and interact in technical communication contexts or on how semiotic modes such as text, speech, image, symbol, graphics, and sound interact in technical communication outputs. Given that discourse does not only refer to actual ‘text’ but may also incorporate the whole communicative act involving production and comprehension, viz. “peoples’ actions, interactions, values, beliefs, and uses of objects, tools and environments within social or institutional settings” (Gee 2011: 181), contributions may also address matters such as context, background information, conventions, or other shared knowledge between the writer and his (increasingly multicultural or international) audience (Bloor and Bloor 2015), hence widening the scope from micro to macro levels of discourse.

Theme session organizers:
Parthena Charalampidou (Aristotle University of Thessaloniki) and Birgitta Meex (KU Leuven)

Call deadline:
If you are interested in participating, please send a provisional title or topic proposal by 8 October. Abstracts of 400 words maximum are due by 15 November 2018, if the theme session is accepted.

The Language of Recruiting: Caught between Persuading and Gatekeeping

03 Jun 2019 – 05 Jun 2019
Leuven, Belgium

Recruitment in professional contexts occurs through many different communicative genres, ranging from job ads (e.g. Verwaeren et al., 2017) through CVs and cover letters (e.g. Waung et al., 2017) to job interviews (e.g. Timming, 2017) and assessments. Focusing on the perspective of the recruiter, an interesting tension is seen in these genres between the need to persuade well-suited candidates to apply for the vacant position and the need for gatekeeping, viz. the need to deny unsuited candidates access to the position and the firm.

Although the job ad can be seen as typically tailored to persuading (see e.g. van Meurs et al. 2015) and job interviews are primarily known for their gatekeeping function (Kerekes 2007), both genres nevertheless portray a notable tension between the two communicative goals. As previous research reveals, this tension can be uncovered through quantitative and qualitative analysis of the language variants and varieties used in these genres. For instance, as van Meurs (2010) and Zenner et al. (2013) discuss, gatekeeping can occur in job ads through the use of English as language of communication, restricting the position to applicants who master the language. Additionally, as e.g. Van de Mieroop & Schnurr (2018) and Roberts & Sarangi (1999) discuss, job interviews are hybrid activity types, where more institutionally oriented discourse types (foregrounding the exchange of information) and more relational discourse types (foregrounding personal information) occur. Both discourse types reveal a tension between gatekeeping (only candidates who fit in both in terms of skills and in terms of personality will be considered) and persuading (candidates who actually fit in need to be convinced that the firm is a place they want to work).

Theme session organizers:

Eline Zenner (KU Leuven) and Frank van Meurs (Radboud University Nijmegen)


Call for Papers:

This theme session aims to contribute to this line of research that studies the mechanisms for persuading and gatekeeping in the language of recruitment. Contributions ideally pay specific attention to the tension between the two communicative goals described above (see “Session Description”).


– Submission of abstracts (400 words) to theme session organizers by 24 September
– Notification of acceptance from theme session organizers by 30 September
– Submission of theme session proposal by theme session organizers on 30 September
– If the theme session is accepted: submission of individual abstracts by presenters by 15 November

Failing Identities: Identification and Resistance

20-21 September 2018
University of Liège, Belgium

Dear colleagues,

Our research unit ‘Langues et Lettres’ proudly presents the international conference on “Failing Identities: Identification and Resistance”, which will take place in Liège on 20-21 September 2018. The conference programme can be viewed at

Registration for the conference is still open at, and will close on 12 September 2018.
More information is available at the conference website at

Looking forward to seeing you there,
On behalf of the organizing committee,
An Van linden

PhD-Fellowship in descriptive linguistics (Spanish) at the University of Leuven (KU Leuven)

Applications are invited for a PhD fellowship in descriptive linguistics (Spanish) starting November 2018 at the Department of Linguistics of the University of Leuven.

The aim of this doctoral project is to describe a relatively recent phenomenon involving the spread of the conditional form into new territories, semantically and pragmatically, in the variety of Spanish spoken in and around the Río de la Plata Basin. The project will focus on the analysis of structures such as Estarían necesitando una caja de Rivotril ‘They need a box of Rivotril’ (lit. they would be needing a box of Rivotril), which do not receive a plain conjecture, prospective or attenuation interpretation, but which parody the rumour or ‘journalistic’ conditional in order to express the speaker’s subjective stance or judgement. The starting hypothesis of this project is that this recent constructional change follows a cline of subjectification from less subjective via more subjective to intersubjective. The dialogal and dialogic nature of language and the social context in which the construction is used take center stage. The phenomena to be analysed include morphosyntax, semantics-pragmatics, prosody, and discourse functions. More information about the project can be requested by e-mail.

Applicants are expected to have the following qualifications and attitudes:

  • an MA in theoretical linguistics (or related disciplines), with academic distinction;
  • expertise and (near-)native competence in Spanish and English;
  • a good background in the domains of linguistics relevant to the topic;
  • strong analytical skills and the motivation to pursue descriptively innovative work in linguistics;
  • demonstrable skill in academic writing;
  • willingness to undertake the study of prosody;
  • willingness to acquire the necessary statistical know-how for the analysis of the data;
  • the ability to present research results at international conferences;
  • a cooperative attitude and the capacity to actively participate in activities and in events of the research group and department.

Motivated candidates meeting these criteria are invited to apply online according to the instructions given at

Please include a cover letter, a CV, the names of up to three referees, and a sample piece of academic writing (either the cover letter or the sample piece of writing should be in English). The deadline for applications is 15 September 2018.

Interviews will be held with shortlisted candidates on 25-26 September (in person or online for overseas applicants). A decision will be communicated by the end of September.

The fellow will receive a salary of approximately 2000 EUR/month (after taxes) for a period of two years, during which s/he will be expected to apply for external funding. S/he will also receive all regular provisions for PhD fellows at the University of Leuven. In addition, s/he will be allocated office space and a laptop, and receive funding for research activities like attending conferences abroad and organising workshops and conferences in Leuven or elsewhere.

The successful candidate may be asked to provide (limited) assistance with teaching, student supervision and data management. S/he will join the dynamic research group FunC (Functional and Cognitive linguistics: Grammar and Typology) in the Department of Linguistics.

For further information, you can contact Prof. María Sol Sansiñena.

Starting date: November 2018

End date: October 2020

Brussels Conference on Generative Linguistics

10-Dec-2018 – 11-Dec-2018, Brussels, Belgium

CRISSP is proud to present the eleventh installment of the Brussels Conference on Generative Linguistics (BCGL), devoted to the syntax and semantics of aspect.

We are pleased to announce that the following invited speakers have agreed to give a talk at BCGL 11:

Berit Gehrke (Humboldt Universität, Berlin)
Roumyana Pancheva (University of Southern California)
Gillian Ramchand (The Arctic University of Norway, Tromsø)

Workshop Description:

The properties and representations of aspect have been studied extensively from both syntactic and semantic perspectives, as well as their interfaces. As for the syntax, a central question is how aspectual notions such as telicity, duration, cause and change are represented in syntax. Approaches range from the minimalist structure of Erteschik-Shir & Rapoport (2005), to a more fine-grained functional structure as proposed by Ramchand (2008), or with a clear differentiation between outer (external, presentational) and inner (internal, Aktionsart) aspect, as proposed by Travis (2010). The semantics of aspect has also been widely studied. As in the syntax, a distinction is often made between outer and inner aspect, with tense scoping over grammatical (outer) aspect, and grammatical aspect scoping over aspectual class (inner aspect). This layered structure makes it possible to investigate (cross-linguistic variation in) the interaction between the lexical features of the verb, the semantics of the predicate-argument structure, the expression of progressive and perfective/imperfective aspect, and other elements in the sentence which can carry aspectual information (e.g. certain adverbs/adverbial phrases, negation). The aim of this workshop is to explore these and related issues.

2nd Call for Papers:

The submission deadline for abstracts is September 15, 2018.

Abstract Guidelines:

Abstracts should not exceed two pages, including data, references and diagrams. Abstracts should be typed in at least 11-point font, with one-inch margins (letter-size; 8½ inch by 11 inch or A4) and a maximum of 50 lines of text per page. Abstracts must be anonymous and submissions are limited to 2 per author, at least one of which is co-authored. Only electronic submissions will be accepted.

Please submit your abstract using the EasyChair link for BCGL11:

Web Site:

PhD fellowship in Corpus Linguistics and Second Language Acquisition at UCLouvain

The Centre for English Corpus Linguistics has an opening for a PhD fellowship for a total period of four years, starting as of October 2018 (later is also a possibility).

The position is part of the UCLouvain FSR-funded research project Particle placement and genitive alternations in EFL learner spoken syntax: core probabilistic grammar and/or L1-specific preferences? (Promotor: Dr. Magali Paquot). The project stems from collaborative work between the promotor, Prof. B. Szmrecsanyi (KU Leuven) and Dr. J. Grafmiller (University of Birmingham) (e.g. Paquot, Grafmiller & Szmrecsanyi (2017)).

The PhD student will investigate the extent to which English as a Foreign Language (EFL) learners share a core probabilistic grammar (cf. Bresnan, 2007) with users of first and second language varieties of English by analyzing variation in grammatical constraints on the particle placement alternation (for transitive phrasal verbs) and the genitive alternation in corpora of EFL learner spoken language. Methodologically, the candidate will build on annotation guidelines developed by Szmrecsanyi, Grafmiller and colleagues to describe the predictors that may influence speakers’ choice governing the alternations; s/he will also be expected to use a range of variationist analysis techniques.

Job description:

The research project is a joint venture between the Centre for English Corpus Linguistics (CECL) at UCLouvain and the Quantitative Lexicology and Variational Linguistics (QLVL) group at KU Leuven. The candidate will be affiliated to the Institut Langage et Communication (ILC, UCLouvain) and will also prepare a joint UCLouvain-KU Leuven PhD in Linguistics.

Activities that the candidate will perform include:

  • develop and implement (i) theoretical concepts in line with the focus of the research project and (ii) appropriate methodological procedures for investigating these concepts;
  • conduct corpus-based analyses of L1 and L2 written and spoken samples;
  • interpret the results of the analyses and report on the project in conference presentations and academic publications;
  • carry out a research stay at the University of Birmingham (to work in close collaboration with Dr. J. Grafmiller);
  • by the end of the four-year term, submit and defend a PhD dissertation based on the project.

Requirements and profile:

  • Master’s degree in Linguistics, Applied Linguistics, Language & Literature, Natural Language Processing or in Language Learning and Teaching;
  • excellent record of BA and MA level study;
  • excellent command of English;
  • excellent and demonstrated analytic skills;
  • ability to work with common software packages (including MS Word, Excel and PowerPoint);
  • basic knowledge of corpus-linguistic techniques is a requirement;
  • knowledge of statistics and statistical software is an asset;
  • programming skills in Perl, Python or R are also an asset;
  • excellent and demonstrated self-management skills, ability and willingness to work in a team;
  • willingness to live in or near Louvain-la-Neuve and to travel abroad (for short-term research stays and to attend international academic conferences).


Terms of employment:

  • The contract will initially be for one year, renewable three times, for a total of four years.
  • The candidate receives a doctoral fellowship grant (starting at approx. EUR 1900 net per month) and full medical insurance.
  • The candidate will be expected to apply for an FNRS position after the first year.
  • The position requires residence in Belgium.
  • Applicants from outside the EU are responsible for obtaining the necessary visa or permits, with the assistance of the UCLouvain staff department.

Application Deadline: Review of applications will begin on 20 August 2018, and continue until the position is filled

Please include with your application:

  • a cover letter in English, in which you specify why you are interested in this position and how you meet the job requirements outlined above;
  • a curriculum vitae in English;
  • a concise academic statement in English in which you outline your expectations about and plans for graduate study and career goals;
  • a copy of BA and MA diplomas and degrees;
  • a copy of your master’s thesis and academic publications (if applicable);
  • the names and full contact details of two academic referees.

Shortlisted candidates will be invited for an interview (in situ or via video conferencing) in September 2018 (or later).

Applications (as an email attachment) and inquiries should be addressed to:

Dr. Magali Paquot

Centre for English Corpus Linguistics

Université Catholique de Louvain



PhD-Fellowship in historical linguistics (Spanish) at the University of Leuven

A new research project at the Department of Linguistics of the University of Leuven is looking for applicants for a fully funded four-year PhD fellowship in Spanish historical linguistics (starting date: October/November 2018).

The PhD fellowship is part of a larger collaborative project entitled “Beyond the clause: Encoding and inference in clause combining”, funded by the Research Council of the University of Leuven. The team heading the project is composed of Bert Cornillie, Kristin Davidse, Elwys De Stefani and Jean-Christophe Verstraete. The supervisors of this fellowship are Bert Cornillie and Malte Rosemeyer.
The aim of the PhD project is to analyse the diachronic development of que-deletion in Spanish complement constructions, as in, e.g., Por ende vos rogamos le dedes entera fee y creencia (anon. 1497) lit.: ‘therefore we ask you you give him total faith and belief’. The project will focus on the origin, extension and demise of que-deletion, in interaction with developments in clause-internal marking (especially mood marking and subject expression), as well as contact influence (contact with Latin, Discourse Traditions). The project will examine whether que-deletion is part of a more general process of restructuring of the complementation system from medieval Castilian to modern Spanish. More information about the project can be requested by e-mail to one of the supervisors (see below).

Applications are invited from candidates with the following qualifications:
– An MA in linguistics or philology (or equivalent, e.g. BA Hons), with academic distinction
– Strong analytical skills and the motivation to pursue creative work in linguistics
– A (near-)native command of Spanish and a good command of English
– Experience with historical corpus linguistics, preferably in Spanish
– Willingness to acquire the necessary statistical know-how for the analysis of the data
– A cooperative attitude and the capacity to actively participate in project activities (e.g. in data sessions) and in events of the research units involved in the project (including their seminar series)
– The ability to present research results at international conferences, and to publish in peer-reviewed journals

Motivated candidates meeting these criteria are invited to apply online at the application web address below.

Please include a cover letter, a CV, the names of up to three referees we may contact, and a sample piece of academic writing. The deadline for applications is 1 July 2018.
Interviews will be held with shortlisted candidates on 9-10 July (in person or online for overseas applicants). A decision will be communicated by mid-July.

The fellow will receive a salary of approximately 2000 EUR/month (after taxes) for a period of four years and all regular provisions for PhD fellows at the University of Leuven. In addition, s/he will be allocated office space and a laptop, and receive funding for research activities like attending conferences abroad and organizing workshops and conferences in Leuven or elsewhere. For more information on the application process, working conditions and career opportunities as a PhD fellow at the University of Leuven see
The successful candidate may be asked to provide (limited) assistance with teaching, student supervision and data management. S/he will join a dynamic research team (faculty members, postdoctoral fellows and PhD students).

For further information, please contact one of the supervisors using the contact information below.
Starting date: October/November 2018
End date: September 2022

Applications Deadline: 01-Jul-2018

Web Address for Applications:

Contact Information:
Cornillie, Bert

Rosemeyer, Malte

PhD Position “Analogy in language change”, KU Leuven

A fully funded four-year PhD position is offered as part of the project “Making the invisible uninvisible: the role of analogy in language change”. The project seeks to use corpus evidence to gain better insight into the workings of analogy as a mechanism of grammatical change. To this end, it will focus specifically on the history of a small subsystem of the grammar of English, adnumeral markers. The project will be supervised by Prof. Hendrik De Smet, in collaboration with Graeme Trousdale (University of Edinburgh) and Peter Petré (University of Antwerp). The project is to be carried out within the Leuven-based research group ‘Functional and Cognitive Linguistics: Grammar & Typology’ (FunC), which is part of the Department of Linguistics. More information on the project and eligibility requirements can be found through the link below.

Applications Deadline: 01-Jul-2018

Web Address for Applications:

Contact Information:
Prof. Hendrik De Smet
Phone: +32 16 32 47 72