
πŸ’‘ $\varphi$-Code: A Python Agentic Competitive Programmer Fueled by (tiny) LLMs

$\varphi$-Code (phi-code) is an open-source, agent-based system designed to tackle competitive programming problems. Inspired by projects like AlphaCode, $\varphi$-Code's core philosophy is accessibility: it aims to achieve strong performance using small-to-medium (tiny) Language Models (LLMs), making powerful coding agents runnable even on a consumer-level laptop or desktop PC.

πŸ“° News

The $\varphi$-Code agentic system hit a significant milestone: it answered every question in LeetCode's Weekly Contest 476 (the latest contest as of publication, 11-17-2025) correctly on the first attempt!

This remarkable feat highlights the effectiveness of $\varphi$-Code's full pipeline, where the Remote Solution Generation produces candidates, the Intelligent Ranker Agent selects the most promising one, and Automated Testing ensures the final submission is robust.

Using gpt-oss-20B quantized to Q4 as the underlying model, $\varphi$-Code achieved a final contest score of 18, tying the Python scores obtained by much larger models such as Gemini 2.5 Pro, GPT-5, DeepSeek v3.2, Qwen 3, and Grok 4 in the same competition.

You can check the outputs generated by $\varphi$-Code here and view the official LeetCode contest rankings for the LLMs here.

Web Interface

Web-Based Screenshot

Curses Interface

Curses Screenshot

Terminal Mode

Terminal Screenshot

πŸ’» Designed for Consumer Hardware

$\varphi$-Code is built around the principle of resource efficiency.

  • Tiny LLM Focus: The system leverages compact models like Gemma 3-4B for solution generation, which are manageable on standard consumer GPUs or even modern CPUs via quantization.
  • LLaMA Server Integration: By using llama.cpp's llama-server, $\varphi$-Code can efficiently offload the computationally intensive LLM inference to the best available local hardware (a minimal request sketch follows this list).
  • Efficient Ranker: The ranking component, built on the sentence-transformers library, uses highly efficient embedding models that require minimal resources compared to the generative LLMs.
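
To make the server integration concrete, the sketch below asks a llama.cpp llama-server for one candidate solution through its OpenAI-compatible /v1/chat/completions endpoint. This is a minimal sketch, not the actual request code in solver/: the server address, prompt, and timeout are assumptions, and the sampling values mirror the settings recommended later in this README.

    import requests

    SERVER = "http://127.0.0.1:8080"  # assumed llama-server address

    def generate_candidate(statement: str) -> str:
        """Request one candidate Python solution from a llama.cpp server."""
        response = requests.post(
            f"{SERVER}/v1/chat/completions",
            json={
                "messages": [
                    {"role": "system",
                     "content": "You are a competitive programmer. Reply with Python code only."},
                    {"role": "user", "content": statement},
                ],
                "temperature": 0.95,  # recommended setting from this README
                "top_k": 300,         # llama.cpp accepts extra sampling keys
            },
            timeout=600,
        )
        response.raise_for_status()
        return response.json()["choices"][0]["message"]["content"]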

✨ Features

  • Accessible & Open-Source: Built with a focus on running powerful agents with fewer computational resources.
  • Web-Based Interface: A user-friendly Gradio web application for submitting problem statements and viewing generated solutions.
  • Curses-Based Interface: A user-friendly curses interface for generating and viewing solutions from the terminal with vim-mode support.
  • Terminal Mode: Run the tool as a shell command.
  • Remote Solution Generation: Connects to a remote LLM API (like a llama.cpp server) to generate multiple candidate Python solutions.
  • Intelligent Ranking (Ranker Agent): Utilizes an embedding model from the sentence-transformers ecosystem to evaluate the feasibility of generated solutions (samples) against the problem statement (anchor); a minimal sketch follows this list.
  • Automated Testing: Parses example tests from the problem statement, runs the candidate solutions, and sorts them by tests passed and the ranker's confidence score.
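
To illustrate how the last two features fit together, here is a minimal sketch of the anchor/sample ranking idea combined with the final sort: candidates are scored by cosine similarity to the problem statement, then ordered by tests passed first and ranker score second. The model name and function signature are illustrative, not the actual code in solver/.

    from sentence_transformers import SentenceTransformer, util

    ranker = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder embedding model

    def sort_candidates(statement: str, candidates: list[str],
                        tests_passed: list[int]) -> list[str]:
        """Order candidates by tests passed, breaking ties with ranker similarity."""
        anchor = ranker.encode(statement, convert_to_tensor=True)
        samples = ranker.encode(candidates, convert_to_tensor=True)
        scores = util.cos_sim(anchor, samples)[0]  # one score per candidate
        order = sorted(
            range(len(candidates)),
            key=lambda i: (tests_passed[i], float(scores[i])),
            reverse=True,
        )
        return [candidates[i] for i in order]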

πŸ“‚ Project Structure

.
β”œβ”€β”€ contests_results/                     # Results of phi-code for different contests.
β”œβ”€β”€ datasets/                             # Competitive programming datasets for ranker training.
β”‚   β”œβ”€β”€ atcoder.jsonl
β”‚   β”œβ”€β”€ codechef.jsonl
β”‚   └── ...
β”œβ”€β”€ LICENSE
β”œβ”€β”€ ranker/                               # Code for training and managing the sentence-transformers ranker.
β”‚   β”œβ”€β”€ check_datasets.py
β”‚   β”œβ”€β”€ filter_datasets.py
β”‚   β”œβ”€β”€ sample_dataset.py
β”‚   └── train.py                          # Main ranker training script.
└── solver/                               # Solver coding agent.
    β”œβ”€β”€ general_prompt.txt
    β”œβ”€β”€ leetcode_prompt.txt
    β”œβ”€β”€ leetcode.py                       # LeetCode module.
    β”œβ”€β”€ utils.py
    β”œβ”€β”€ main.py                           # The main module.
    β”œβ”€β”€ web_ui.py                         # Web Interface.
    β”œβ”€β”€ curses_ui.py                      # Curses Interface.
    β”œβ”€β”€ terminal.py                       # Run the tool as a shell command.
    └── requirements.txt

πŸš€ Getting Started

This project is structured into two main components: the solver for running the coding agent and the ranker for training the model that sorts the solutions.

1. Running the Web UI

The core agent functionality lives in solver/main.py; the web interface it exposes is a Gradio application.

Prerequisites

  • A Python environment.
  • A running LLaMA server (e.g., using llama.cpp's llama-server tool) hosting a non-reasoning LLM.
    • Recommended Model: Gemma 3-4B or a similar compact, code-centric model.
    • Recommended Settings: Temperature of 0.95 and Top-K of 300 (see the example launch command below).
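
For reference, a llama-server launch along these lines matches the settings above; the GGUF file name is a placeholder, and the flag names follow llama.cpp's common options (check llama-server --help for your build):

    llama-server -m gemma-3-4b-it-Q4_K_M.gguf --port 8080 --temp 0.95 --top-k 300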

Installation and Execution

  1. Navigate to the solver directory.

  2. Install dependencies:

    pip install -r requirements.txt
  3. From the repository root, run the main application, providing the necessary server details:

    python solver/main.py \
      --server <YOUR_SERVER_ADDRESS> \
      --port <YOUR_SERVER_PORT> \
      --site leetcode

    Note: The current version focuses on LeetCode problems, using leetcode_prompt.txt; problem statements from other sites are not yet supported. Broader site coverage is planned for future updates.

main.py Command Line Options

| Option | Description | Example |
| --- | --- | --- |
| -r, --ranker | Path or Hugging Face link to the ranker model (a sentence-transformers model). | Salesforce/SFR-Embedding-Code-2B_R |
| -s, --server | Address of the llama.cpp server hosting the LLM. | http://127.0.0.1 |
| -p, --port | Port of the llama.cpp server. | 8080 |
| -m, --site | Site the problem statements come from. | leetcode |
| -i, --interface | Interface to use (terminal, web, or curses). | web |
| -f, --statement | Text file containing the problem statement. | statement.txt |
| -n, --number | Number of solutions to generate. | 10 |
| -o, --output_file | File to store the generated solutions in JSONL format. | solutions.jsonl |
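
For example, a full invocation combining these options might look like this (values are illustrative):

    python solver/main.py \
      --ranker Salesforce/SFR-Embedding-Code-2B_R \
      --server http://127.0.0.1 \
      --port 8080 \
      --site leetcode \
      --interface web \
      --number 10 \
      --output_file solutions.jsonl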

🧠 Training the Ranker Agent with sentence-transformers

The ranker is a crucial component that scores candidate solutions. It is trained as an embedding model to determine how relevant a generated solution is to a given problem statement.

⚠️ Work in Progress: Training a state-of-the-art ranker still requires significant resources. The current solver uses a pre-trained model, but the tools below are provided for those who wish, and have the means, to train their own.

The ranker Folder

The ranker folder contains the code for fine-tuning the ranker model with the sentence-transformers library, typically using a Siamese-network architecture for contrastive learning (e.g., Multiple Negatives Ranking Loss or Triplet Loss).

  • train.py: The main script for fine-tuning a ranker model.
  • check_datasets.py, filter_datasets.py, sample_dataset.py: Utilities for preparing and managing the training data in datasets/.

Training the Ranker

The train.py script allows you to fine-tune an embedding model on competitive programming datasets.

python ranker/train.py \
  --model coldchair16/CPRetriever-Code \
  --epochs 2 \
  --batch-size 8 \
  --leetcode datasets/leetcode.jsonl \
  --codeforces datasets/codeforces.jsonl \
  --output-dir my_trained_ranker
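
Under the hood, this kind of fine-tuning can be expressed in a few lines of sentence-transformers code. The sketch below is a rough illustration of the contrastive setup described above, not necessarily what train.py does internally; the base model and the data pairs are placeholders:

    from sentence_transformers import SentenceTransformer, InputExample, losses
    from torch.utils.data import DataLoader

    # Placeholder (statement, solution) pairs; in practice these come from
    # the JSONL files in datasets/. In-batch solutions act as negatives.
    pairs = [
        ("Given an array of integers, return the indices of two numbers "
         "that add up to a target.",
         "def two_sum(nums, target):\n    seen = {}\n    ..."),
    ]

    model = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder base model
    examples = [InputExample(texts=[stmt, sol]) for stmt, sol in pairs]
    loader = DataLoader(examples, shuffle=True, batch_size=8)
    loss = losses.MultipleNegativesRankingLoss(model)
    model.fit(train_objectives=[(loader, loss)], epochs=2,
              output_path="my_trained_ranker")

Once training finishes, the directory passed as --output-dir can be handed to the solver through the -r/--ranker option.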

🀝 Contributing

$\varphi$-Code is an open-source effort. We welcome contributions to:

  • Expand the datasets for ranker training.
  • Improve the prompt templates (e.g., creating one for Codeforces).
  • Enhance the problem parsing to extract tests more reliably.

Feel free to open an issue or submit a pull request!
