Automated Code Contract Repair: An LLM-Based Approach

Context

Design by Contract is an established, lightweight paradigm for engineering reliable and robust software systems. It works by specifying verifiable expectations and obligations between software components, notably objects in object-oriented programming. One tool that supports this paradigm is OpenJML, an open-source tool for specifying and verifying properties of Java programs using the Java Modeling Language (JML).
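
To make the idea concrete, here is a minimal sketch of what a JML contract can look like. The class BankAccount and its members are hypothetical and chosen only for illustration; they do not come from the project. JML clauses are written in comments starting with //@, so the file remains ordinary Java that compiles without any JML tooling.

```java
// Hypothetical example class; names are illustrative, not taken from the project.
class BankAccount {
    // spec_public lets specifications refer to this private field.
    private /*@ spec_public @*/ int balance;

    // Class invariant: must hold before and after every method call.
    //@ public invariant balance >= 0;

    //@ requires initial >= 0;
    //@ ensures balance == initial;
    BankAccount(int initial) {
        this.balance = initial;
    }

    // Precondition: the obligation of the caller.
    //@ requires amount > 0;
    //@ requires amount <= balance;
    // Postcondition: the obligation of the method; \old refers to the pre-state.
    //@ ensures balance == \old(balance) - amount;
    void withdraw(int amount) {
        balance -= amount;
    }

    // A pure method has no side effects; \result denotes the return value.
    //@ ensures \result == balance;
    /*@ pure @*/ int getBalance() {
        return balance;
    }
}
```

The requires clauses state what a caller must guarantee, and the ensures clauses state what the method guarantees in return. A tool such as OpenJML can then attempt to check these clauses against the implementation.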

Motivation

To mitigate the burden of specifying contracts manually, we recently proposed an AI-based approach to statically generate contracts from source code without any upfront investment or auxiliary data. The results are remarkable, though still far from perfect and not yet actionable in practice. For example, we found that our fine-tuned model produced a substantial number of code contracts containing subtle syntactic and semantic errors. Of course, such errors may also occur in hand-crafted contracts.
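
The two kinds of error can be illustrated with a small sketch. The class Counter and both broken clauses are hypothetical and written for this illustration only; they are not taken from the generated contracts or the published data set. The broken variants appear in ordinary comments, and each is followed by a repaired JML clause.

```java
// Hypothetical example; the broken clauses below are invented for illustration.
class Counter {
    private /*@ spec_public @*/ int count;

    // Syntactic error: the backslash before "old" and the final semicolon are missing.
    // Broken:   ensures count == old(count) + 1
    // Repaired:
    //@ requires count < Integer.MAX_VALUE;
    //@ ensures count == \old(count) + 1;
    void increment() {
        count++;
    }

    // Semantic error: the clause is well-formed but does not match the method's behavior.
    // Broken:   ensures \result == count + 1;
    // Repaired:
    //@ ensures \result == count;
    /*@ pure @*/ int get() {
        return count;
    }
}
```

A syntactic error of this kind would typically be rejected when the contract is parsed, whereas a semantic error only shows up once the contract is checked against the code.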

Goal

The goal of this project is to study how accurately an LLM-based approach to automated code repair can fix ill-formed code contracts written in JML for OpenJML. Within the project, you will fine-tune and evaluate a set of publicly available LLMs, benchmarking their performance on the task of code contract repair.
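
One way such a benchmark could be organized is sketched below. Everything in it is an assumption made for illustration: the class RepairBenchmark, the repairRate metric, and the toy repairer and checker are hypothetical stand-ins for a fine-tuned LLM and an OpenJML run, and the actual project may measure accuracy differently.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.UnaryOperator;

// Hypothetical harness; none of these names come from the project or from OpenJML.
final class RepairBenchmark {

    // Returns the fraction of broken contracts that the checker accepts after repair.
    // repairer:     stand-in for a fine-tuned LLM that rewrites a contract
    // isWellFormed: stand-in for a well-formedness check, e.g. an OpenJML run
    static double repairRate(List<String> brokenContracts,
                             UnaryOperator<String> repairer,
                             Predicate<String> isWellFormed) {
        if (brokenContracts.isEmpty()) {
            return 0.0;
        }
        long repaired = brokenContracts.stream()
                .map(repairer)
                .filter(isWellFormed)
                .count();
        return (double) repaired / brokenContracts.size();
    }

    public static void main(String[] args) {
        // Toy inputs: two clauses lack a semicolon, one has a misspelled keyword.
        List<String> broken = List.of(
                "//@ requires x > 0",
                "//@ ensures y >= 0",
                "//@ ensurs y == 1;");

        // Toy repairer: only appends a missing semicolon.
        UnaryOperator<String> toyRepairer = s -> s.endsWith(";") ? s : s + ";";

        // Toy checker: accepts a known keyword followed by a terminated clause.
        Predicate<String> toyChecker = s ->
                (s.startsWith("//@ requires ") || s.startsWith("//@ ensures "))
                        && s.endsWith(";");

        System.out.println(repairRate(broken, toyRepairer, toyChecker));
    }
}
```

Passing the repairer and the checker as parameters keeps the metric independent of any particular model or tool, so several LLMs could be compared on the same set of broken contracts.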

Requirements

Nice-to-have: Some experience in playing with Large Language Models

Essential: You feel confident programming in both Java and Python, and you have heard about the principle of Design by Contract.

Pointers

OpenJML: https://www.openjml.org/

Hugging Face: https://huggingface.co/

Automatic Program Repair: https://www.duo.uio.no/handle/10852/112424?show=full

Greiner, S., Bühlmann, N., Ohrndorf, M., Tsigkanos, C., Nierstrasz, O., and Kehrer, T. (2024). Automated Generation of Code Contracts - Generative AI to the Rescue? ACM SIGPLAN International Conference on Generative Programming: Concepts & Experiences (GPCE 2024)

Greiner, S., Bühlmann, N., Ohrndorf, M., Tsigkanos, C., Nierstrasz, O., and Kehrer, T. (2024). Automated Generation of Code Contracts - Generative AI to the Rescue? [Data set]. ACM. https://doi.org/10.5281/ZENODO.13351003

Contact

Manuel Ohrndorf, manuel.ohrndorf@unibe.ch

Roman Machacek, roman.machacek@unibe.ch