Self-Harness: Harnesses That Improve Themselves

📅 2026-06-08

📈 Citations: 0

✨ Influential: 0

🤖 AI Summary

LLM agents depend heavily on the harnesses that mediate their interaction with the environment, yet these harnesses are typically handcrafted and struggle to keep pace with the rapid evolution and diverse behaviors of modern LLMs. The paper introduces Self-Harness, a paradigm in which an LLM agent improves its own harness without relying on human engineers or stronger external agents. The method runs as a closed iterative loop of weakness mining, harness proposal, and proposal validation: it analyzes execution traces to identify model-specific failure patterns, generates minimal harness edits tied to those failures, and accepts an edit only after regression testing. On Terminal-Bench-2.0, held-out pass rates for MiniMax M2.5, Qwen3.5-35B-A3B, and GLM-5 rise from 40.5% to 61.9%, 23.8% to 38.1%, and 42.9% to 57.1%, respectively. The improvements target each model's specific weaknesses rather than adding generic instructions.
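To put the reported numbers side by side, the short sketch below computes the absolute gain (in percentage points) and the relative gain for each model. The before/after figures are the ones quoted above; the `gains` helper and the dictionary layout are illustrative only and not part of the paper.

```python
# Pass rates (%) quoted in the summary: (before, after).
rates = {
    "MiniMax": (40.5, 61.9),
    "Qwen": (23.8, 38.1),
    "GLM": (42.9, 57.1),
}


def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, relative gain in %)."""
    absolute = after - before
    relative = 100.0 * absolute / before
    return round(absolute, 1), round(relative, 1)


for name, (before, after) in rates.items():
    absolute, relative = gains(before, after)
    print(f"{name}: +{absolute} points ({relative}% relative)")
```

The absolute gains come out to 21.4, 14.3, and 14.2 percentage points, so the model with the lowest starting rate (Qwen) shows the largest relative improvement even though its absolute gain is not the largest.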

📝 Abstract

The performance of LLM-based agents is jointly shaped by their base models and the harnesses that mediate their interaction with the environment. Because different models exhibit distinct behaviors, effective harness design is inherently model-specific. Yet agent harnesses are still largely engineered by human experts, a paradigm that scales poorly as modern LLMs become increasingly diverse and rapidly evolving. In this paper, we introduce Self-Harness, a new paradigm in which an LLM-based agent improves its own operating harness, without relying on human engineers or stronger external agents. We operationalize Self-Harness as an iterative loop with three stages: Weakness Mining, which identifies model-specific failure patterns from execution traces; Harness Proposal, which generates diverse yet minimal harness modifications tied to these failures; and Proposal Validation, which accepts candidate edits only after regression testing. We instantiate Self-Harness on Terminal-Bench-2.0 using a minimal initial harness and three base models from diverse families: MiniMax M2.5, Qwen3.5-35B-A3B, and GLM-5. Across all three models, Self-Harness consistently improves performance, with held-out pass rates increasing from 40.5% to 61.9%, 23.8% to 38.1%, and 42.9% to 57.1%, respectively. Qualitative analyses further show that Self-Harness does not simply add generic instructions, but effectively turns model-specific weaknesses into concrete, executable harness changes. These results suggest a path toward LLM-based agents that are not merely shaped by their harnesses, but can also participate in reshaping them.
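The three-stage loop described in the abstract can be sketched roughly as follows. This is a toy illustration under strong assumptions, not the authors' implementation: a harness is modeled as a set of rule names, the "agent" is a deterministic stand-in function (`toy_run`), weakness mining simply lists failed tasks, and every name here (`REQUIRES`, `BREAKS`, `FIX_FOR`, `self_harness`) is hypothetical. In the paper these stages are carried out by the LLM agent itself over execution traces.

```python
from typing import Callable, Dict, FrozenSet, List

Harness = FrozenSet[str]                 # hypothetical: a harness as a set of rule names
Runner = Callable[[Harness, str], bool]  # (harness, task) -> did the task pass?

# Toy environment (all assumed): rules each task needs, and a rule that breaks a task.
REQUIRES = {"build": set(), "timeout": {"retry"}, "paths": {"abs_paths"}, "flaky": {"verbose"}}
BREAKS = {"verbose": "build"}
FIX_FOR = {"timeout": "retry", "paths": "abs_paths", "flaky": "verbose"}


def toy_run(harness: Harness, task: str) -> bool:
    """Stand-in for running the agent on one task under a given harness."""
    if any(BREAKS.get(rule) == task for rule in harness):
        return False
    return REQUIRES[task] <= harness


def run_all(run: Runner, harness: Harness, tasks: List[str]) -> Dict[str, bool]:
    return {task: run(harness, task) for task in tasks}


def mine_weaknesses(results: Dict[str, bool]) -> List[str]:
    """Weakness Mining (simplified): the tasks that currently fail."""
    return sorted(task for task, ok in results.items() if not ok)


def propose_edits(harness: Harness, weaknesses: List[str]) -> List[Harness]:
    """Harness Proposal (simplified): one single-rule edit per known weakness."""
    return [harness | {FIX_FOR[w]} for w in weaknesses
            if w in FIX_FOR and FIX_FOR[w] not in harness]


def validate(run: Runner, current: Harness, candidate: Harness, tasks: List[str]) -> bool:
    """Proposal Validation: reject regressions, require a net improvement."""
    before = run_all(run, current, tasks)
    after = run_all(run, candidate, tasks)
    no_regression = all(after[t] for t in tasks if before[t])
    return no_regression and sum(after.values()) > sum(before.values())


def self_harness(run: Runner, harness: Harness, tasks: List[str], max_rounds: int = 5) -> Harness:
    for _ in range(max_rounds):
        weaknesses = mine_weaknesses(run_all(run, harness, tasks))
        accepted = next((c for c in propose_edits(harness, weaknesses)
                         if validate(run, harness, c, tasks)), None)
        if accepted is None:  # no candidate survived regression testing
            break
        harness = accepted
    return harness


final = self_harness(toy_run, frozenset(), list(REQUIRES))
print(sorted(final))
```

In this toy setup the `verbose` rule would fix the `flaky` task but breaks `build`, so the validation step rejects it on every round while the other two edits are accepted one at a time. That is the role regression testing plays in the loop: an edit tied to a real failure is still discarded if it costs a task that already passed.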

Problem

Research questions and friction points this paper is trying to address.

LLM-based agents

harness design

model-specific behavior

scalability

human engineering

Innovation

Methods, ideas, or system contributions that make the work stand out.

Self-Harness

LLM-based agents

harness optimization