🤖 AI Summary
This study addresses the lack of transparency and auditability in prompt usage when large language models are deployed in the public sector, a gap that existing governance mechanisms (model and dataset documentation, organisation-level policies, post-training alignment) rarely cover. The work proposes Prompt Commons, a framework that treats prompts as governable public resources: a versioned, community-maintained repository of prompt templates with provenance metadata, licensing terms, and moderation logs. It illustrates three governance states: open, curated, and veto-enabled. A negotiation-oriented ensemble method aggregates prompts from diverse stakeholders into compromise recommendations, supported by collaborative data collection, prompt augmentation, and metadata annotation. The approach is illustrated with a pilot dataset collected with community partners in a large North American city, comprising 443 human-written prompts and 3,317 prompts after augmentation. The paper closes with falsifiable implications for governance and an evaluation agenda for prompt-layer governance.
📝 Abstract
This paper argues that prompts used to deploy large language models (LLMs) in public-sector settings should be treated as governed artefacts rather than private, transient inputs. Prompts encode role instructions, decision framings, and value claims; prompt choice can materially shift outputs even when model weights and input records are held fixed. Existing governance tools, including model and dataset documentation, organisation-level policies, and post-training alignment, rarely make the local prompt collections used in deployment transparent, contestable, or auditable. We propose Prompt Commons: a versioned, community-maintained repository of prompt templates with provenance metadata, licensing, and moderation logs. Using a pilot dataset collected with community partners in a large North American city (443 human prompts; 3,317 after augmentation), we illustrate three governance states (open, curated, veto-enabled) and a negotiation-oriented ensemble method that aggregates stakeholder prompts into compromise recommendations. We close with falsifiable implications and an evaluation agenda for prompt-layer governance.
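To make the repository idea concrete, the following is a minimal sketch of how one entry in such a commons might be represented. It is not the authors' implementation: the class names, field names, and the exact acceptance rules attached to each governance state (open accepts any contribution, curated requires a curator's approval, veto-enabled accepts unless a designated party objects) are assumptions made for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class GovernanceState(Enum):
    # The three state names come from the abstract; their semantics
    # below are assumed for this sketch.
    OPEN = "open"
    CURATED = "curated"
    VETO_ENABLED = "veto-enabled"


@dataclass(frozen=True)
class PromptVersion:
    text: str
    author: str    # provenance: who contributed this version
    license: str   # licensing terms, e.g. "CC-BY-4.0" (illustrative value)
    version: int   # monotonically increasing version number


@dataclass
class PromptEntry:
    prompt_id: str
    state: GovernanceState
    versions: List[PromptVersion] = field(default_factory=list)
    moderation_log: List[str] = field(default_factory=list)

    def propose(
        self,
        text: str,
        author: str,
        license: str,
        approved_by: Optional[str] = None,
        vetoed_by: Optional[str] = None,
    ) -> bool:
        """Try to add a new version; every outcome is written to the log."""
        if self.state is GovernanceState.CURATED and approved_by is None:
            self.moderation_log.append(f"rejected: {author} (no curator approval)")
            return False
        if self.state is GovernanceState.VETO_ENABLED and vetoed_by is not None:
            self.moderation_log.append(f"vetoed: {author} by {vetoed_by}")
            return False
        new_version = PromptVersion(text, author, license, len(self.versions) + 1)
        self.versions.append(new_version)
        self.moderation_log.append(f"accepted: v{new_version.version} from {author}")
        return True
```

In this sketch the moderation log records rejections and vetoes as well as acceptances, so that a later audit can see which contributions were turned down and by whom, not only what was published.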