Generative AI hit the public imagination in full force in 2022. Perhaps the biggest splash was made by OpenAI’s launch of the text-to-image generator DALL-E 2 with its stunning illustrations.
Alongside generative AI art, code-completion programs are boosting developer productivity by automating repetitive and mundane programming tasks. GitHub, the world’s largest source code host, released its code-completion tool Copilot in June 2022. It is trained on 45 terabytes of coding data from the GitHub code repository and runs on OpenAI’s Codex model. The company claims the tool can automate around 40 per cent of a developer’s tasks.
‘Copilot has a lot of potential, but also a risk of creating over-reliance on AI-generated code if the user doesn’t check its outputs properly,’ says Juho Leinonen, postdoctoral researcher at Aalto University’s Department of Computer Science, whose research focuses on educational technology and AI in education. ‘The risk is particularly high for students and beginners.’
GitHub made the tool freely available for students from the get-go, in contrast to the 10-euro monthly subscription price for developers. The decision raised questions about whether students could use the tool for cheating.
‘The question we should be asking is how do we rethink computing education to incorporate these tools?’ says Leinonen. ‘This is just the beginning.’
Leinonen and his colleagues took up the challenge and studied the question from a teacher’s perspective in their recent research article on Codex.
‘We examined how teachers can use Codex to automatically create new programming exercises and natural language explanations for code,’ says Arto Hellas, senior university lecturer at the Department of Computer Science. ‘Despite some small issues, our exploration with Codex showed remarkable results in creating novel exercises and code explanations. We were even able to generate exercises around a specific topic, like basketball.’
Among beginners and students, the need for programming exercises is huge. Being able to generate sufficiently accurate exercises and natural language explanations would be a major help for teachers. The research article received the best paper award at the ICER 2022 conference, which is the main research conference for the computing education community.
‘Although the generated exercises and explanations need to be verified by a human, the performance surpassed the research community’s expectations,’ says Leinonen. ‘The researchers and teachers in our field whom we’ve spoken with are excited about the opportunities yet anxious about challenges such as over-reliance.’
The results didn’t go unnoticed by GitHub either. The company decided to make Copilot free for teachers shortly after the ICER conference, referencing the team’s research in its announcement.
The impact of large language models on computing education is still an emerging topic, and research can be hard to come by. Leinonen, Hellas and their colleague Sami Sarsa have been busy answering the collaboration requests that their award-winning article generated. Meanwhile, the trio is digging deeper into the possibilities that the new frontier may hold.
‘Due to the rapid development of the field, it is hard to predict exactly how large language models will reshape computing education,’ says Leinonen. ‘Two things are for certain – AI-assisted coding is here to stay and we’re here to figure it out.’