2 PhD positions in Multilingual Natural Language Processing

Katholieke Universiteit Leuven
September 20, 2023
(ref. BAP-2023-534)

Last modification: Monday, July 31, 2023

The HCI unit within the department of Computer Science at KU Leuven welcomes applications for two fully funded, 4-year PhD positions in Natural Language Processing. The HCI unit focuses on how people interact with information, relying on language technology (including text and multimedia mining), visualisation and computer graphics. Within HCI, the NLP group focuses on multilingual NLP and interpretability.

Multilingual NLP is booming. This is due in no small part to language models trained on large amounts of multilingual data (such as mBERT and XLM-R), which have been found to have surprising cross-lingual transfer capabilities even though they receive no cross-lingual supervision. Despite this progress, these models cover only a fraction of the world's languages, with large inequalities in performance.

The project in which the two PhD students will be working aims to address this inequality. We view it as a fairness issue: different languages are not treated equally. We will look at the field of fairness in AI to identify the biases that are responsible for this inequality. We will develop methods to address these biases, combining ideas from algorithmic fairness, neuro-symbolic AI and computer vision, in order to make NLP fairer with respect to typological diversity. Example approaches to be further developed:

  • Algorithmic fairness: in de Lhoneux et al. 2022 (https://aclanthology.org/2022.acl-short.64/), we used a worst-case-aware sampling method to balance training data from typologically diverse languages.

  • Neuro-symbolic AI: Üstün et al. 2020 (https://aclanthology.org/2020.emnlp-main.180/) use typological features to inform a dependency parser.

  • Computer vision: in Rust et al. 2023 (https://openreview.net/forum?id=FkSp8VW8RjH), we treat language modelling as reconstructing pixels (the model is known as PIXEL).

The PhD candidates will collaborate with the group of Desmond Elliott (University of Copenhagen) on further developing PIXEL and evaluating it in multilingual settings.

  • The ideal candidate has a master's degree in Computer Science or Artificial Intelligence with experience in NLP. Experience with multilingual NLP or NLP for non-English or non-Indo-European languages is a plus.

  • Candidates must have good programming skills, ideally in Python. Familiarity with PyTorch is a plus.

  • Candidates must be proficient in oral and written English.

  • The PhD students must be willing to work independently as well as in a team. They will work on research projects that involve frequent interactions with both internal and external researchers.

Offer

  • A high-level and exciting international research environment

  • The opportunity to build up research and innovation skills that are essential for a future career in research and development, both in an industrial and academic context

  • A competitive salary (see working conditions → salary below) and travel funding

  • As a PhD student in the Department of Computer Science, you will be involved in teaching assistance for bachelor- or master-level courses, as well as smaller duties such as proctoring exams and helping out with small PR events. These duties are shared among all academic staff of the department and will not exceed 20% of your working time.

    All applications submitted before the 8th of September 2023 will be given full consideration. After that, the positions will remain open until filled.

    The starting date will be agreed upon between candidate and supervisor, but the aim is to start as soon as possible after the 1st of October 2023.


    For more information, please contact Prof. dr. Miryam de Lhoneux, mail: [email protected], website: https://people.cs.kuleuven.be/~miryam.delhoneux/

    Note that she is on parental leave until the deadline and may be slow to reply, but she will make sure all emails are answered before the deadline.

    Your application should include:

  • a detailed CV,

  • a letter of motivation with reference to your interest in multilingual NLP,

  • transcripts with grades of MSc/BSc courses,

  • names and contact details of two references.

KU Leuven seeks to foster an environment where all talents can flourish, regardless of gender, age, cultural background, nationality or impairments. If you have any questions relating to accessibility or support, please contact us at [email protected].

