Exploration of Self-Reflective LLMs for Code
- Contact: Roman Macháček
Context
Large language models (LLMs) have become popular over the last few years, one reason being the quality of the outputs these models generate. Recent advances try to make models "think" more, either through simple prompting techniques or by training them to self-reflect via reinforcement learning.
Motivation
Models such as o1 and DeepSeek-R1 are recent reasoning models. By spending more time thinking at inference, these models achieve better performance on many tasks involving logical reasoning, including coding.
Goal
The student will follow three steps:
- Literature review of reasoning models
- Choice of a practical direction: code repair, vulnerability repair, or code generation
- Experiments and analysis of results, including comparison with existing models; the exact direction from the second step is to be discussed further
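To give a feel for the practical direction, the sketch below shows a minimal self-reflection loop for code repair: a model proposes a fix, the candidate is run against tests, and any failure is fed back into the prompt for the next attempt. The function `query_model` is a hypothetical stand-in for a call to a reasoning model (the real project would use an actual LLM); it is stubbed here so the loop is runnable end-to-end.

```python
def query_model(prompt: str) -> str:
    """Stub for an LLM call: returns a buggy candidate first, then a
    repaired one once the prompt contains feedback from a failed run."""
    if "Previous attempt failed" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b):\n    return a - b\n"

def run_tests(code: str):
    """Execute the candidate; return an error message, or None on success."""
    namespace: dict = {}
    try:
        exec(code, namespace)
        assert namespace["add"](2, 3) == 5
    except Exception as exc:
        return repr(exc)
    return None

def self_reflective_repair(task: str, max_rounds: int = 3) -> str:
    prompt = task
    for _ in range(max_rounds):
        candidate = query_model(prompt)
        error = run_tests(candidate)
        if error is None:
            return candidate  # tests pass: accept this repair
        # Reflection step: append the failure as feedback for the next round
        prompt = f"{task}\nPrevious attempt failed with: {error}\nPlease fix it."
    raise RuntimeError("no passing candidate found")

repaired = self_reflective_repair("Write add(a, b) returning the sum.")
```

The same loop structure applies to vulnerability repair (replace the test harness with a security scanner) and to code generation (start from a specification instead of buggy code).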
Requirements
- Knowledge of machine learning and experience with PyTorch or TensorFlow
- Passion for learning about state-of-the-art methods and models