The recent introduction of large transformer language models, such as GPT-3 and LaMDA, has pushed natural language processing (NLP), and the field of artificial intelligence (AI) more broadly, to impressive heights. While architectures vary, increasing the number of model parameters has repeatedly been shown to lead to unprecedented performance and abilities. However, training these enormous networks is a delicate and expensive endeavor. Stabilizing training often requires extremely large batch sizes, which are both computationally expensive and data inefficient.
Another active research topic is generative adversarial networks (GANs), which are known to handle data inefficiency in interesting ways. Training transformer networks with a GAN setup has previously been disregarded, as it further increases the computational complexity, but thanks to recent theoretical discoveries a resource-efficient GAN-transformer training regime has now been proposed. In theory, such a setup could stabilize transformer training and be both more data efficient and more cost efficient.
The intent of this thesis project is to investigate the possibility of training transformer networks using a novel GAN setup. Starting at a small scale, we hope to map the potential differences between this setup and the regular training loop, and subsequently scale up to larger and more interesting language tasks.
This thesis will be at the forefront of AI research, covering three concepts:
Familiarity with one or more of these concepts is an advantage.
Location: Kista
Start: 2023-01-16
We welcome your application! If this sounds interesting and you want to know more, please contact Joakim Nivre, +46 10 228 44 44. The last day to apply is 18 November.
Our union representatives are Lazaros Tsantaridis, SACO, 010-516 62 21, and Bertil Svensson, Unionen, 010-516 53 56.
About the job
Fixed-term employment, 3-6 months
Job type: Student - thesis/internship
Contact person: Joakim Nivre, +46 10 228 44 44