Exploration of Self-Reflective LLMs for Code

Context

Large language models (LLMs) have become popular over the last few years, in part because of the quality of the outputs these models generate. Recent advances try to make models think more, either through simple prompting techniques or by training them to self-reflect via reinforcement learning.

Motivation

Models such as OpenAI's o1 or DeepSeek-R1 are recent reasoning models. By spending more time thinking at inference, these models achieve better performance on many tasks involving logical reasoning, including coding.

Goal

The student will follow 3 steps:

  1. Literature review of reasoning models
  2. Practical direction: code repair, vulnerability repair, or code generation
  3. Experiments and analysis of results, including a comparison with existing models; the exact practical direction from step 2 is to be discussed further

Requirements

Pointers