White Paper: Repo Release Bot

Repo Release Bot Architecture White Paper

Dec 29, 2025

Design Theorems

0) Design Theorems (Fix These First)

These four points are not “recommendations” but provable constraints that allow RRB (Repo Release Bot) to survive for ten years. They confine model uncertainty to the Proposal Zone, lock the system’s sovereign writes into the Deterministic Zone, and contain side effects within the Ticket Zone. If any one of them is violated, RRB degrades into ordinary CI/CD: it can run, but ten years later it cannot answer “why.”

0.1 RRB Does Not Judge, and Is Not “Smart”: It Only Translates and Orchestrates

(RRB = Translation + Orchestration, not Judgment)

RRB’s sole mission is to translate human / institutional responsibility declarations (Intent) into auditable, recomputable, executable structures, and to bind execution strictly to tickets produced by institutional judgment.

It does not decide whether to release, nor does it optimize release strategies. It only takes an action request that has an explicitly declared responsibility, passes it through a deterministic compilation pipeline and institutional gates, turns it into a DecisionRecord that can enter history, and executes side effects only when permitted.

Core implications:

Judgment authority belongs to institutions (Policy Gate), not to the bot. The bot only delivers inputs into the institution and persists outputs according to protocol.
Intelligence is allowed only in the Proposal Zone: explanations, drafts, candidate changes may be smart. Once inside the sovereign zone, only deterministic flows are permitted (validate / canonicalize / hash / policy / evidence / write).
RRB’s correctness is not measured by “release success rate,” but by:
1. Every historical write has a responsible subject;
2. Every decision is recomputable;
3. Every side effect is traceable to a ticket.

In short: RRB’s value lies not in automation, but in accountable automation.

0.2 Triggers Must Be Intent: Intent ≠ Event

(Trigger Is Responsibility, Not Activity)

RRB must be triggered by Intent (a responsibility declaration), not by any form of Event (an occurrence).

Events can be endless: push, merge, CI green, tests passing, issue closed, time elapsed… These can only serve as evidence inputs, never as sovereignty advancement.

Why this is mandatory:

Events are facts, not responsibility; they say “something happened,” not “it should happen,” “who bears the consequences,” or “how failure is explained.”
Intent is an institutional statement: it upgrades “I want to do this” into “I am willing to take responsibility for advancing history,” explicitly specifying the responsible subject, scope, risk acceptance, rollback plan, and so on.
Once events are allowed to trigger sovereign writes, the system enters a dangerous state: it advances its own history silently (no declaration, no responsibility, no explanation).

Hard constraints:

Intent must originate from humans or institutions, never from agents or LLMs.
All events may only populate the evidence field or act as optional policy inputs, but must never become triggers.
“All green” is not a prerequisite for release; at most, it is one execution condition, not a justification for advancing history.

In one sentence: Events explain the world; Intent changes the world. RRB recognizes only the latter as a legitimate starting point.

0.3 No LLMs in the Sovereign Zone:

Policy / Scheduler / Ledger / Canonical Memory Must Be Model-Free and Recomputable

(Sovereign Zone = Deterministic)

RRB must enforce a strict system boundary:

LLMs may exist only in the Proposal Zone. Any component that results in historical facts or execution permission must be deterministic, reproducible, and recomputable.

The Sovereign Zone includes:

Policy Gate — institutional judgments (ALLOW / DENY / REQUIRE_OVERRIDE)
Scheduler — timing, execution windows, retry strategies, freeze policies
Ledger / DecisionRecord / OverrideRecord — append-only institutional facts
Canonical Memory — canonicalized source inputs for recomputation

All of these must satisfy a single invariant:

Given the same canonical inputs and the same policy_version (+ hash), the system must produce the same decision, and must be able to reproduce the full evidence chain for that decision in the future.
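The invariant above can be sketched as a pure function plus a policy fingerprint. This is a minimal illustration, not RRB’s actual policy engine: the request fields (`branch`, `risk_level`) and the policy keys (`allowed_branches`, `max_risk_without_override`) are hypothetical names chosen for the example.

```python
import hashlib
import json


def policy_hash(policy: dict) -> str:
    """Fingerprint a policy definition so a decision can name the exact rules used."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def decide(canonical_request: dict, policy: dict) -> str:
    """Pure function: reads only its arguments (no clock, env, network, or model)."""
    if canonical_request["branch"] not in policy["allowed_branches"]:
        return "DENY"
    if canonical_request["risk_level"] > policy["max_risk_without_override"]:
        return "REQUIRE_OVERRIDE"
    return "ALLOW"


# Hypothetical policy and request, for illustration only.
policy_v1 = {"version": "v1", "allowed_branches": ["main"], "max_risk_without_override": 2}
request = {"branch": "main", "risk_level": 3, "version": "1.4.0"}

# Same inputs (even with a different key order) must give the same decision.
first = decide(request, policy_v1)
second = decide(dict(reversed(list(request.items()))), policy_v1)
```

Because `decide` depends on nothing but its arguments, storing the canonical request together with `policy_version` and `policy_hash` is enough to recompute the decision later.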

Why LLMs must never enter:

LLM outputs are inherently non-deterministic: the same prompt can yield different outputs; more critically, they introduce unprovable external state (model versions, temperature, vendor updates, implicit context).
Once an LLM enters Policy / Scheduler / Ledger, replay collapses: you can only narrate what happened, not recompute why that judgment was made.
The goal of the sovereign zone is not to be smarter, but to be more accountable. Intelligence is a variable; institutions are constants.

Mandatory recording rules:

Every ledger entry must include origin and replayable=true/false.
All LLM outputs default to replayable=false and may only be proposals or annotations, never gate inputs.
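One way to make the recording rules hard to violate is to enforce them when an entry is constructed. The sketch below is an assumption about shape, not a documented RRB schema: `LedgerEntry`, `entry_id`, `payload`, and `gate_inputs` are hypothetical names, and the sketch is slightly stricter than the text in that it rejects a model-origin entry marked replayable instead of merely defaulting it to false.

```python
from dataclasses import dataclass

ALLOWED_ORIGINS = {"human", "institution", "model"}


@dataclass(frozen=True)
class LedgerEntry:
    entry_id: str
    origin: str        # who produced the content
    replayable: bool   # can it be recomputed from canonical inputs?
    payload: dict

    def __post_init__(self):
        if self.origin not in ALLOWED_ORIGINS:
            raise ValueError(f"unknown origin: {self.origin}")
        if self.origin == "model" and self.replayable:
            raise ValueError("model output must be recorded with replayable=False")


def gate_inputs(entries):
    """Only replayable, non-model entries may be handed to the Policy Gate."""
    return [e for e in entries if e.replayable and e.origin != "model"]
```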

In one sentence: The sovereign zone should behave like a compiler, not a chatbot.

0.4 Execution Must Be Ticketized:

The Executor Recognizes Only DecisionTickets

(Ticket-Gated Execution)

RRB’s execution layer must be designed so that without a ticket, nothing can be executed.

The Executor must not accept “raw inputs” (such as tag names, version numbers, or commands). It accepts only a DecisionTicket (or a decision_id → ticket lookup), and performs strict consistency checks before execution.

What a Ticket Is:

A Ticket is an institution-issued portable authorization, binding execution side effects to:
- decision == ALLOW
- request_hash (hash of the canonical request)
- policy_version + policy_hash
- decision_id (pointer to the persisted record)
Tickets enforce a critical property:
Institutional facts are written before side effects are allowed
(record-before-side-effects)
Thus, even failures do not “evaporate”: DENY and OVERRIDE become auditable history.

Execution-layer hard constraints (must be enforced in code, not by convention):

no ticket → no exec
ticket.decision != ALLOW → no exec
ticket.request_hash != computed_request_hash → no exec
ticket.policy_ref != current_policy_ref → no exec
Optional but recommended: revoked / expired → no exec
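The constraints above translate almost line for line into a guard function. The following is a minimal sketch under stated assumptions: the `DecisionTicket` dataclass layout, the single `policy_ref` string (assumed here to combine policy version and hash), and the `revoked` flag are illustrative choices, not a specified format.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionTicket:
    decision_id: str
    decision: str
    request_hash: str
    policy_ref: str       # assumed form: "<policy_version>:<policy_hash>"
    revoked: bool = False


def refusal_reason(ticket: Optional[DecisionTicket],
                   computed_request_hash: str,
                   current_policy_ref: str) -> Optional[str]:
    """Return why execution must be refused, or None when every check passes."""
    if ticket is None:
        return "no ticket"
    if ticket.decision != "ALLOW":
        return "decision is not ALLOW"
    if ticket.request_hash != computed_request_hash:
        return "request_hash mismatch"
    if ticket.policy_ref != current_policy_ref:
        return "policy mismatch"
    if ticket.revoked:
        return "ticket revoked"
    return None
```

Returning a reason instead of a bare boolean keeps refusals explainable, which fits the requirement that failures become auditable history.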

Institutional meaning:

Side effects are not a “program capability,” but an institutional authorization.
Ticketization fully separates “can it be done” from “should it be done”:
- Policy decides should
- Tickets carry proof of authorization
- Executors merely act on proof

In one sentence: Build the execution layer like access-control hardware, not like an obedient person.

Summary: The System Properties of the Four Theorems Together

If these four constraints are locked in, RRB gains a hard, provable long-term property:

The advancement of system history can be triggered only by responsible subjects; institutional judgments are recomputable; side effects are traceable; failures do not evaporate.

This is the engineering form of the following statement:

A mature intelligent system is not mature because it is smart, but because ten years later it can still answer: “Why did we do this back then?”

Data Objects

2) Data Objects (The Four-Piece Set)

This four-piece set is RRB’s minimal closed loop for the institutional pipeline. It upgrades a release from an “operation” to an accountable institutional event:
Intent provides the responsible subject, Request provides adjudicable input, Ticket provides execution authorization, and Record provides historical fact and the true source for replay.
Missing any one of them causes the system to degenerate at some stage into an unauditable script.

2.1 ReleaseIntent v1 (Responsibility Declaration, Legitimate Trigger)

Definition: A ReleaseIntent is a declaration of responsibility to “advance the world into a new state.” It is not a requirement, not a suggestion, and not an event description. It is a subject’s commitment to future side effects: I request the system to perform a release, and I accept responsibility for explaining its consequences.

Hard constraints:

Allowed origins only: human | institution (must be identifiable and accountable)
Forbidden origins: agent | LLM (models may propose, but may not trigger history)
Write mode: append-only JSONL — every intent is an immutable fact (even revocations must be expressed via a new record)

Artifacts:

runtime_data/release_intents.jsonl
One line per intent, serving as the true trigger source. Any subsequent governance flow related to a release must be traceable back to a specific intent.

Semantic essentials:

The core fields of an intent are not “what to do,” but:
1) Who declares it; 2) Why; 3) How risk is accepted; 4) Rollback / loss-containment strategy; 5) Scope boundaries.
An intent may be brief, but it must satisfy ten-year accountability: it must be able to answer who advanced history at the time and on what basis they accepted risk.
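As a rough illustration of the constraints in this section, an intent can be validated and serialized to a single JSONL line before being appended. The field names below (`intent_id`, `declared_by`, `reason`, `risk_acceptance`, `rollback_plan`, `scope`) are hypothetical; the paper fixes the semantics of an intent, not its schema.

```python
import json

ALLOWED_ORIGINS = {"human", "institution"}
REQUIRED_FIELDS = ("intent_id", "origin", "declared_by", "reason",
                   "risk_acceptance", "rollback_plan", "scope")


def intent_to_jsonl(intent: dict) -> str:
    """Validate an intent and render it as one JSONL line."""
    missing = [f for f in REQUIRED_FIELDS if not intent.get(f)]
    if missing:
        raise ValueError(f"intent is missing: {missing}")
    if intent["origin"] not in ALLOWED_ORIGINS:
        raise ValueError(f"origin {intent['origin']!r} may propose but may not trigger")
    return json.dumps(intent, sort_keys=True, ensure_ascii=False)


def append_intent(path: str, intent: dict) -> None:
    """Append-only write: mode 'a' never truncates or rewrites existing lines."""
    line = intent_to_jsonl(intent)
    with open(path, "a", encoding="utf-8") as f:
        f.write(line + "\n")
```

Opening the file in append mode only prevents this function from overwriting history; protecting the file against other writers (permissions, signing, hash chaining) is outside the scope of the sketch.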

2.2 CapabilityRequest: `repo.release` (Adjudicable Request)

Definition: A CapabilityRequest is the product of turning an Intent into adjudicable input. It does not carry responsibility (that resides in the Intent); it carries a structured request that the Policy Gate can deterministically judge.

Key property: It must be canonicalizable. The same request—regardless of field order or formatting noise—must yield the same canonical representation and the same request_hash.

Hard constraints:

Must pass a deterministic pipeline:
schema_validate → canonicalize → request_hash
The Policy Gate may accept only the canonical form (or its hash plus evidence references). Passing raw high-entropy text directly to the gate is prohibited.

Artifacts:

Enters the Capability Bus (the single legal entry for capability=repo.release)
request_hash becomes the primary key across the entire pipeline:
- Anchor for DecisionRecord
- Binding point for Ticket
- Validation base for Executor
- Stable input fingerprint for Replay
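The canonicalization property can be demonstrated with sorted-key JSON and SHA-256. This is a simplified sketch: it does not implement a full canonicalization standard such as RFC 8785 (number formatting and Unicode normalization are not addressed), and the request fields shown are hypothetical.

```python
import hashlib
import json


def canonicalize(request: dict) -> str:
    """Sorted keys, fixed separators, no insignificant whitespace."""
    return json.dumps(request, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def request_hash(request: dict) -> str:
    """SHA-256 over the canonical form only."""
    return hashlib.sha256(canonicalize(request).encode("utf-8")).hexdigest()


# The same request with different field order and formatting noise.
a = json.loads('{"capability": "repo.release", "version": "1.4.0", "intent_ref": "intent-001"}')
b = json.loads('{ "intent_ref":"intent-001",\n  "version":"1.4.0", "capability":"repo.release" }')
same_fingerprint = request_hash(a) == request_hash(b)
```

Both inputs reduce to the same canonical string, so `same_fingerprint` is true and either form can anchor the same DecisionRecord.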

Semantic essentials:

A Request is institutional language, not human language: fields must be directly referencable, comparable, and decidable by policy.
A Request must reference its Intent (e.g., intent_id or intent_hash); otherwise, adjudication would occur without a responsible subject.

2.3 DecisionTicket (Execution Pass)

Definition: A DecisionTicket is the Policy Gate’s authorization credential, turning the result of an adjudicable request into a portable execution permit. Its existence guarantees that side effects can occur only after institutional facts have been written to history.

Minimal field semantics (the core skeleton):

decision_id: pointer to the unique institutional judgment record
decision: ALLOW | DENY | REQUIRE_OVERRIDE
request_hash: binds to the adjudicated canonical request
policy_version(+hash): binds to the policy version that produced the judgment (without it, replay fails)
timestamp: issuance time (temporal anchor of the institutional fact)
actor: who issued / endorsed it (human or institutional subject)

Hard constraints:

Executor recognizes tickets only: no raw parameters, no CLI flags, no environment variables.
Tickets must pass consistency checks:
decision == ALLOW AND request_hash matches AND policy_ref matches
(optionally also revoked / expiry)
Tickets are not about “execution convenience,” but about blocking all governance bypass paths.

Artifact relationship:

A Ticket is essentially a portable execution proof derived from a DecisionRecord, but it cannot replace the Record:
The Record is the institutional fact; the Ticket is its execution projection.
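The Record-to-Ticket relationship can be sketched as a projection that copies only the binding fields out of an already persisted record. The dict layout is an assumption for illustration, and the choice to issue no ticket at all for non-ALLOW decisions is one possible reading of the text.

```python
from typing import Optional

TICKET_FIELDS = ("decision_id", "decision", "request_hash",
                 "policy_version", "policy_hash", "timestamp", "actor")


def derive_ticket(record: dict) -> Optional[dict]:
    """Project a persisted DecisionRecord into a ticket; only ALLOW yields one."""
    if record["decision"] != "ALLOW":
        return None
    # Evidence and other record-only fields stay in the ledger, not in the ticket.
    return {name: record[name] for name in TICKET_FIELDS}
```

Because the ticket is derived from the record and never the other way round, losing a ticket loses nothing: it can be regenerated from the ledger.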

2.4 DecisionRecord / OverrideRecord

(Institutional Facts, Append-Only, True Source for Replay)

Definition: A Record is an institutional fact the system has taken responsibility for over time. It is not a log, but a constitution-level historical entry. All RRB replay, auditing, and regression are anchored to it.

DecisionRecord (judgment fact) must include:

decision_id (primary key)
request_hash (fingerprint of the adjudicated object)
policy_version + policy_hash (policy basis of the judgment)
decision (ALLOW / DENY / REQUIRE_OVERRIDE)
timestamp
actor (institutional subject)
(strongly recommended) evidence_refs: references to evidence inputs (CI results, diff summaries, risk flags, etc.). These are references—not live external state fetching by the gate.

OverrideRecord (override fact) must express:

An override does not rewrite old records; it appends a higher-authority institutional fact:
override_id / target_decision_id / by / reason / risk_acceptance / timestamp / scope
Overrides are expensive actions and must be able to trigger subsequent policy review (an institutional self-correction loop).

Hard constraints:

Append-only: never overwrite, never rewrite. Revocation or correction must be expressed via new records (revoke / replace / invalidate).
Record-before-side-effects: write the DecisionRecord (or override fact) first; only then may the executor perform any side effects.
Replay definition: replay = recompute the original judgment using Records + canonical inputs + policy_ref; it is not rerunning the workflow and not refetching external state.
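To illustrate append-only overrides, the current standing of a decision can be computed by reading both ledgers without mutating either. The resolution rule here is an assumption made for the example: an override turns a REQUIRE_OVERRIDE decision into an effective ALLOW and leaves DENY untouched. The paper does not specify which decisions an override may lift.

```python
def effective_decision(decision_id: str, decisions: list, overrides: list) -> dict:
    """Resolve the current standing of a decision without mutating any record."""
    record = next(d for d in decisions if d["decision_id"] == decision_id)
    applied = [o for o in overrides if o["target_decision_id"] == decision_id]
    if record["decision"] == "REQUIRE_OVERRIDE" and applied:
        # The latest appended override wins; the original record is untouched.
        return {"decision": "ALLOW",
                "via": applied[-1]["override_id"],
                "original": record["decision"]}
    return {"decision": record["decision"],
            "via": decision_id,
            "original": record["decision"]}
```

The original judgment stays visible in the `original` field, so an audit can always distinguish "allowed by policy" from "allowed by override".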

Artifacts:

runtime_data/decisions.jsonl, runtime_data/overrides.jsonl
(or unified under runtime_data/governance/*.jsonl)
Together, they constitute RRB’s ten-year explainability.

The Causal Closed Loop of the Four Pieces (One Sentence)

Intent: who takes responsibility for advancing history
Request: convert responsibility into adjudicable input (canonical + hash)
Record: write the judgment as an institutional fact (append-only, replayable)
Ticket: project the fact into execution authorization (ticket-gated, bypass-proof)

Main Flow

3) Main Flow (From Intent to History, Then Optional Execution)

The goal of the main flow is not “to get a release done,” but to turn a release into an institutionally explainable historical entry: before any side effects occur, a recomputable adjudicative fact must be produced; every adjudicative fact must point back to a responsible subject; every execution must be ticketed, revocable, and replayable. Intelligence is allowed in the flow—but it must never cross the sovereign-zone firewall.

3.1 Phase Structure: Three Zones, Five Steps

(Proposal / Sovereign / Execution)

Proposal Zone (LLM allowed): transform Intent into candidate plans and candidate structures (drafts, explanations, diff summaries, risk lists)
Sovereign Zone (LLM forbidden): deterministic compilation + institutional adjudication + append-only fact writing
Execution Zone (LLM forbidden): ticket-gated side-effect execution (tag / release / deploy), optionally writing execution results to an event ledger

3.2 End-to-End Flow (High-Density Version)

(1) Intent Declared: Responsibility Enters the System (The Only Legitimate Trigger)

Input: ReleaseIntent v1 (human / institution)

Actions (deterministic):

schema validation
canonicalization (optional but recommended: canonicalize intent + compute intent_hash)
append-only write to runtime_data/release_intents.jsonl

Invariants:

Without intent, no adjudication chain may proceed (no intent → no history)
Intents are immutable; revocation must be expressed via a new intent or record

(2) Proposal Build: Generate Candidate Requests (LLM Only Here)

Input: intent_id / intent_hash + read-only snapshot of current repo state

Actions (LLM allowed; outputs are always “proposals”):

generate a draft CapabilityRequest (repo.release)
generate explanatory materials (why release, risks, rollback, change summary)
mark outputs with origin=model, replayable=false

Artifacts:

proposal artifacts (may be persisted to runtime_data/proposals/*.json or printed to stdout)
Note: proposals never enter the ledger or policy inputs unless extracted into structured fields by the deterministic compiler

Invariants:

LLM outputs are candidates only; they must never directly trigger the bus, the ledger, or the execution gate
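A small sketch of how proposal output might be tagged and then filtered by a deterministic extractor. The wrapper shape and the whitelisted field names (`capability`, `version`, `target_ref`) are hypothetical; only the `origin=model`, `replayable=false` marking comes from the text.

```python
STRUCTURED_FIELDS = ("capability", "version", "target_ref")


def wrap_proposal(draft: dict) -> dict:
    """Tag model output so it can never be mistaken for a sovereign-zone fact."""
    return {"origin": "model", "replayable": False, "draft": draft}


def extract_structured(proposal: dict) -> dict:
    """Deterministic extraction: keep whitelisted fields, drop free text."""
    draft = proposal["draft"]
    return {k: draft[k] for k in STRUCTURED_FIELDS if k in draft}
```

Free-text fields such as a rationale remain in the proposal artifact for human readers; only the whitelisted structure moves on to compilation.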

(3) Compile & Validate: Deterministic Compilation

(Turning Proposals into Adjudicable Requests)

Input: proposal drafts (or structured input provided directly by a human)

Actions (deterministic; the “compiler” before entering the sovereign zone):

schema validation: repo_release_request_v1.json
canonicalization: produce canonical request
compute request_hash = sha256(canonical_request)
construct CapabilityRequest (including intent_ref + request_hash + evidence_refs)

Artifacts:

canonical request (optional persistence: runtime_data/canonical/requests/<request_hash>.json)
CapabilityRequest enters the Capability Bus

Invariants:

request_hash must depend solely on the canonical request (no implicit external state)
Recompiling the same request at any time must yield the same hash (hash stability)
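The compile step might look roughly like the following. Several things here are assumptions: the hand-written field check stands in for validation against `repo_release_request_v1.json` (whose contents are not given), the field names are hypothetical, and including `intent_ref` in the hashed body while leaving `evidence_refs` outside it is one possible design choice.

```python
import hashlib
import json

REQUIRED = {"capability": str, "version": str, "target_ref": str}


def compile_request(draft: dict, intent_ref: str, evidence_refs: list) -> dict:
    """validate -> canonicalize -> hash -> CapabilityRequest"""
    for name, typ in REQUIRED.items():
        if not isinstance(draft.get(name), typ):
            raise ValueError(f"invalid or missing field: {name}")
    if draft["capability"] != "repo.release":
        raise ValueError("unsupported capability")
    # Only whitelisted fields survive; extra free-text keys are dropped.
    body = {name: draft[name] for name in REQUIRED}
    body["intent_ref"] = intent_ref
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return {
        "canonical_request": canonical,
        "request_hash": digest,
        "intent_ref": intent_ref,
        "evidence_refs": list(evidence_refs),
    }
```

Since the hash is computed from the canonical string alone, recompiling the same draft on another machine or at another time yields the same `request_hash`.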

(4) Capability Bus → Policy Gate: Single Legal Entry for Adjudication

(Institutional Judgment)

Input: CapabilityRequest (repo.release) + policy_ref (version + hash)

Actions (deterministic):

Capability Bus: normalize entry, package context, invoke policy gate
Policy Gate: produce a three-state judgment:
- ALLOW: institution permits issuing an execution ticket
- DENY: institution explicitly refuses (refusal is also history)
- REQUIRE_OVERRIDE: escalation to higher-authority signing / override flow

Artifacts:

DecisionRecord (append-only, and written before any side effects)
Optional: derived DecisionTicket (only when decision == ALLOW, unless non-executing tickets are permitted)

Invariants:

Policies must be versioned (policy_version + policy_hash), or replay is invalid
The gate must not read implicit external state (time, env vars, network) unless explicitly included as inputs
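The rule about implicit external state can be made concrete: anything the gate needs, including the date, arrives as an explicit input. The rules and key names below (`freeze_dates`, `blocked_versions`, `declared_date`, `ci_passed`) are hypothetical examples, not RRB policy.

```python
def policy_gate(request: dict, policy: dict, inputs: dict) -> str:
    """Three-state judgment; everything it reads arrives as an argument."""
    # The date is a recorded input, never a call to the system clock.
    if inputs["declared_date"] in policy["freeze_dates"]:
        return "DENY"
    if request["version"] in policy["blocked_versions"]:
        return "DENY"
    # A red CI run is evidence, so it escalates instead of deciding outright.
    if not inputs["ci_passed"]:
        return "REQUIRE_OVERRIDE"
    return "ALLOW"
```

If the gate called the clock directly, replaying a decision a year later would evaluate the freeze rule against the wrong date; passing the date in keeps the judgment recomputable.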

(5) Ledger Append: Write Institutional Facts First, Then Allow Side Effects

(Record-Before-Effects)

Input: adjudication result + request_hash + policy_ref + actor + evidence_refs

Actions (deterministic):

append to runtime_data/decisions.jsonl
if overridden: append to runtime_data/overrides.jsonl (never overwrite prior decisions)

Artifacts:

DecisionRecord / OverrideRecord (true source for replay)
(if ALLOW) DecisionTicket (execution projection)

Invariants:

Without a record, execution is forbidden (no record → no side effects)
Overrides must not rewrite old records; only append new facts (append-only governance)

(6) Ticket-Gated Execution: Authorized Execution

(Side-Effect Zone)

Input: DecisionTicket or decision_id → ticket lookup

Actions (deterministic):

verify decision == ALLOW
verify ticket.request_hash == recompute_hash(canonical_request)
verify ticket.policy_ref == current_policy_ref (or record.policy_ref)
verify ticket not revoked / not expired (optional)
execute side effects: git tag / GitHub release / deploy (plugin-based)

Artifacts:

execution result events (recommended separate ledger: runtime_data/executions.jsonl)
failures must also be recorded (failure does not evaporate; it is an asset)

Invariants:

Executors never accept raw inputs (parameter bypass equals governance bypass)
Tickets are the sole execution authorization (ticket-gated execution)
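A sketch of an executor that takes a ticket, re-verifies it, and appends an execution event whether the side effect was refused, succeeded, or failed. The dict-based ticket, the `side_effect` callable standing in for a plugin, and the in-memory `executions` list standing in for `runtime_data/executions.jsonl` are all simplifying assumptions.

```python
import hashlib


def execute(ticket, canonical_request: str, current_policy_ref: str,
            side_effect, executions: list) -> dict:
    """Run side_effect only for a valid ticket; append the outcome either way."""
    computed = hashlib.sha256(canonical_request.encode("utf-8")).hexdigest()
    if ticket is None:
        refusal = "no ticket"
    elif ticket["decision"] != "ALLOW":
        refusal = "decision is not ALLOW"
    elif ticket["request_hash"] != computed:
        refusal = "request_hash mismatch"
    elif ticket["policy_ref"] != current_policy_ref:
        refusal = "policy mismatch"
    else:
        refusal = None

    decision_id = ticket["decision_id"] if ticket else None
    if refusal is not None:
        event = {"decision_id": decision_id, "status": "refused", "detail": refusal}
    else:
        try:
            side_effect(canonical_request)
            event = {"decision_id": decision_id, "status": "succeeded", "detail": ""}
        except Exception as exc:  # a failed side effect is recorded, not lost
            event = {"decision_id": decision_id, "status": "failed", "detail": str(exc)}
    executions.append(event)
    return event
```

Note that the function has no parameter for a tag name or version: the only way to influence what runs is to present a ticket whose hash matches the canonical request.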

(7) Replay / Regression: Recompute Judgments from Records

(Not Re-running the Workflow)

Input: DecisionRecord + corresponding canonical inputs + policy_ref

Actions (deterministic):

recompute the policy decision
compare recomputed result with the recorded decision
produce regression reports (for institutional evolution and anti-regression)

Invariants:

Replay targets “recomputing the judgment made at the time,” not “rerunning past side effects”
Policy changes must imply version changes; otherwise differences cannot be explained
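
Replay as described here can be sketched as a pure comparison loop. The record layout and the two policy functions are hypothetical; the point is that replay looks up the policy version named in each record, recomputes the decision, and never touches an executor.

```python
from typing import Callable


def replay(records: list,
           canonical_inputs: dict,
           policies: dict) -> list:
    """Recompute each recorded decision and report mismatches. No side effects."""
    report = []
    for rec in records:
        request = canonical_inputs[rec["request_hash"]]
        gate: Callable[[dict], str] = policies[rec["policy_ref"]]  # policy in force at the time
        recomputed = gate(request)
        report.append({
            "decision_id": rec["decision_id"],
            "recorded": rec["decision"],
            "recomputed": recomputed,
            "match": recomputed == rec["decision"],
        })
    return report


# Hypothetical policy versions, keyed by policy_ref.
def policy_v1(request: dict) -> str:
    return "ALLOW" if request["ci_green"] else "DENY"


def policy_v2(request: dict) -> str:
    if not request["ci_green"]:
        return "DENY"
    return "REQUIRE_OVERRIDE" if request["risk"] == "high" else "ALLOW"


POLICIES = {"v1": policy_v1, "v2": policy_v2}
INPUTS = {"h1": {"ci_green": True, "risk": "high"}}
RECORDS = [{"decision_id": "d-1", "request_hash": "h1", "policy_ref": "v1", "decision": "ALLOW"}]

report = replay(RECORDS, INPUTS, POLICIES)
```

Replaying the same record against `v2` yields a mismatch, and because the policy reference differs, the difference is explainable as a policy change and not as nondeterminism.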

3.3 Two Critical “Institutional Locks”

(What Makes the Main Flow Non-Bypassable)

Single-Entry Lock: all repo.release adjudication requests must pass through the Capability Bus (single choke point).
Side-Effect Lock: all side effects must pass ticket verification (single gate).

With these two locks plus an append-only ledger, RRB truly transitions from a “bot” into an organ.

3.4 Minimal v0 Execution Strategy

(Ship Immediately Without Taking Risks)

v0: implement Intent → DecisionRecord (no automatic tagging, no deployment)
v0.1: issue tickets only on ALLOW + ticket-gated tagging
v0.2: plugin-based release/deploy + execution events ledger
At every step, replay must remain functional: stabilize the sovereign zone first, then grow features

(1) Intent declared  ───────────────┐
                                    ▼
(2) Orchestrator builds Proposal (optional LLM drafting allowed here only)
                                    ▼
(3) Compiler/Validator (deterministic)
    - schema validate
    - canonicalize
    - request_hash
                                    ▼
(4) MCP Capability Bus (single legal entry)
                                    ▼
(5) Policy Gate (deterministic)
    → ALLOW / DENY / REQUIRE_OVERRIDE
                                    ▼
(6) DecisionLedger.append(record)  (append-only, before side effects)
                                    ▼
(7) Execution (optional, ticket-gated)
    - only if decision==ALLOW
    - tag/release/deploy plugins optional
                                    ▼
(8) Replay/Regression
    - same canonical inputs + same policy_version => same decision

Triggering Mechanism (触发机制)

4) Triggering Mechanism

The triggering mechanism is RRB’s sovereign gate. It does not answer the question “when should a workflow run,” but a far more fundamental one:
Who has the authority to advance system history, and in what form responsibility is borne.
Therefore, triggering is not a technical event—it is an institutional statement.

4.1 The Only Legitimate Trigger: ReleaseIntent

Restated definition:

A ReleaseIntent is a Responsibility Declaration.

Its meaning is not “the system should release,” but:

“An accountable subject explicitly requests the system to perform a release, and is willing to bear responsibility for that decision in the future.”

Only once this semantic condition is satisfied does RRB allow the governance process to begin; otherwise, the system must remain still.

Core principles:

Trigger ≠ Event
Trigger = Intent
Intent = accountable subject + explicit intention + risk acceptance

4.2 Legitimate Trigger Entry Points (v0: Most Stable)

✅ CLI: `rrb intent create ...` (Human Explicit)

This is the only recommended trigger implementation in v0.

Why CLI is the most stable starting point:

Clear human-in-the-loop presence
Cannot happen “silently”
Easy to record actor.id / actor.kind
Naturally trains users into a mindset of declaring responsibility

Institutional meaning:

Every release is an explicit act
There is no gray zone where “the system decided to release by itself”

CLI is chosen not for technical convenience, but because it forces slowness at both the psychological and institutional levels.

🔜 Extensible Entry Points (Semantics Must Be Equivalent)

These entry points are not new trigger types, but merely different input adapters for ReleaseIntent.

PR Comment: `/release`

Must satisfy:

comment author = human / institution
comment → compiled into ReleaseIntent v1
the comment itself is not the trigger — the Intent is the trigger

All comment-based triggers must ultimately:

be written to release_intents.jsonl
enter the same main governance flow

UI Button (Approval Desk / Console)

Essentially a “visual CLI”
UI actions must generate structured Intent
UI must not bypass:
- intent schema
- append-only writing
- actor recording

Unified invariant:

Regardless of entry form, the system ultimately sees only ReleaseIntent.
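
One way to sketch the adapter idea: every entry point funnels into a single `declare_intent` function, so the rest of the system only ever sees a `ReleaseIntent`. The field names, the in-memory `INTENT_LOG` (standing in for `release_intents.jsonl`), and the `/release <reason>` comment syntax are assumptions for illustration; the paper only names `/release`.

```python
import json
import re
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class ReleaseIntent:
    actor_id: str
    actor_kind: str      # "human" or "institution"
    repo: str
    reason: str
    source: str          # which adapter produced it: "cli", "pr_comment", "ui"


INTENT_LOG: list = []  # stand-in for release_intents.jsonl


def declare_intent(intent: ReleaseIntent) -> ReleaseIntent:
    # Single entry point shared by every adapter.
    if intent.actor_kind not in ("human", "institution"):
        raise ValueError("only humans or institutions may declare intent")
    INTENT_LOG.append(json.dumps(asdict(intent), sort_keys=True))
    return intent


def from_cli(actor_id: str, repo: str, reason: str) -> ReleaseIntent:
    return declare_intent(ReleaseIntent(actor_id, "human", repo, reason, "cli"))


def from_pr_comment(author_id: str, author_kind: str, repo: str,
                    body: str) -> Optional[ReleaseIntent]:
    # The comment is only raw input; it becomes a trigger once compiled to an intent.
    match = re.fullmatch(r"/release\s+(.+)", body.strip())
    if match is None:
        return None
    return declare_intent(
        ReleaseIntent(author_id, author_kind, repo, match.group(1), "pr_comment")
    )
```

A UI adapter would follow the same shape: collect structured fields, then call `declare_intent`, with no separate path into the governance flow.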

4.3 Explicitly Forbidden Triggers (Red Lines)

These prohibitions are not “temporarily unsupported features,” but structural sources of danger.

🚫 Automatic Release on Merge

Why this is dangerous:

A merge is a collaboration event, not a responsibility declaration
merge author ≠ release responsibility holder
It easily leads to:
- “I just merged the code—who knew the system would release it?”

The only exception (strict conditions):

The merge rule itself is institutionalized
The merge action explicitly generates Institutional Intent
That rule is:
- versioned
- auditable
- replayable

Otherwise, it is strictly forbidden.

🚫 Automatic Trigger on CI Green

The fundamental issue:

CI answers “can it run?”
Release answers “should it happen?”

Treating CI green as a trigger is equivalent to saying:

“As long as nothing breaks, the system may advance history on its own.”

This is precisely the condition you are trying to avoid:

a system advancing its own history with no declaration and no responsible party.

CI results may only serve as:

evidence
one of the policy inputs

They must never be triggers.

🚫 Agent-Automated Triggering

This is the most critical red line.

The reason is not “models aren’t good enough yet,” but structural:

agents ≠ responsible subjects
agent objective functions ≠ institutional responsibility
agents cannot be held accountable ten years later

No matter how intelligent agents become in the future, they may only:

propose options
generate proposals
annotate risks

They must never declare Intent.

Once agents are allowed to trigger Intent, the system loses:

its human responsibility anchor
institutional legitimacy
long-term explainability

4.4 A Critical Structural Judgment (Unique to This System)

Your trigger design operationalizes a rare but crucial principle in engineering:

History is not advanced by “conditions being satisfied,” but by responsibility being declared.

This means:

Release is no longer a natural outcome of CI/CD
It is an institutional act
The system upgrades from an “automation tool” to a responsibility-executing entity

4.5 One-Sentence Summary (Constitution-Ready)

RRB responds only to Intent, not to events.
Without a responsibility declaration, the system would rather do nothing.
This guarantees that every advance of history has someone who can step forward in the future and say: this was my decision at the time.

Executor (执行器)

5) Executor Design Principles

The Executor is the most dangerous component in the entire RRB architecture, and therefore the one most in need of domestication.
Once it oversteps its authority, it can change the world with no institutional trace whatsoever.
Therefore, the Executor is not “code that gets things done,” but a side-effect channel fully domesticated by institutions.

5.1 The Executor’s Identity: A Side-Effect Port, Not a Decision-Maker

Definition:

The Executor has exactly one responsibility: to perform side effects (tag / release / deploy / publish) if and only if they are explicitly authorized by institutional judgment.

It does not understand Intent.

It does not understand Policy.

It does not judge whether something is “reasonable.”

It answers only one question:

“Is this ticket sufficient for me to execute?”

If the answer is not a definitive YES, it must refuse.

5.2 Function Signature Red Line: No Raw Inputs

This is the single most critical engineering red line in Executor design.

Forbidden function shapes (any one of these constitutes governance bypass):

def execute_release(tag: str): ...
def deploy(version: str, env: str): ...
def run(cmd: str): ...

These interfaces conflate whether execution is allowed with how execution is performed, implicitly trusting the caller and eliminating institutional constraints at the engineering level.

The only allowed function signatures:

def execute(decision_id: str): ...
# or
def execute_with_ticket(ticket: DecisionTicket): ...

The Executor may accept only:

a decision_id (which it then uses to look up the ledger / ticket itself), or
a complete DecisionTicket

Institutional meaning:

The Executor never knows “who provided the business parameters”
It knows only this: I am executing side effects on behalf of an institutional fact

5.3 Mandatory Pre-Execution Validation Chain

(Gate Before Effects)

Before any side effect occurs, the Executor must complete a full, non-skippable validation chain.

This chain must be a hard failure chain (any failure raises immediately and terminates execution).

(1) Decision Must Be `ALLOW`

ticket.decision == ALLOW

DENY and REQUIRE_OVERRIDE are explicitly forbidden execution states
An override does not “modify the decision”; it produces a new ALLOW ticket

(2) `request_hash` Must Match

ticket.request_hash == sha256(canonical_request)

Prevents “execution object substitution”
Prevents “using ticket A to execute side effects for B”
request_hash is the cryptographic anchor between execution and institutional judgment

(3) `policy_version (+ hash)` Must Match

ticket.policy_ref == decision_record.policy_ref

or (more strictly):

ticket.policy_ref == current_policy_ref

Meaning:

Ensures execution occurs under the same institutional context
Prevents “allowed under old policy, secretly executed under new policy”

(4) Ticket Not Revoked / Not Expired

(Optional but strongly recommended)

ticket.revoked == false
now < ticket.expires_at

Purpose:

Supports emergency human intervention
Supports execution time windows (e.g. “must execute within 24 hours”)
Converts execution authority from “forever valid” into a controlled resource
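
The four checks can be sketched as a single hard-failure chain. The ticket layout, the `ExecutionDenied` exception, and the canonical form (JSON with sorted keys) are assumptions for illustration. The current time is passed in explicitly so the check itself stays deterministic and testable.

```python
import hashlib
import json
from datetime import datetime, timezone


class ExecutionDenied(Exception):
    """Raised before any side effect when a ticket fails validation."""


def request_hash(canonical_request: dict) -> str:
    # Assumes canonical form = JSON with sorted keys and compact separators.
    data = json.dumps(canonical_request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def validate_ticket(ticket: dict, canonical_request: dict,
                    record_policy_ref: str, now: datetime) -> None:
    # (1) decision must be ALLOW
    if ticket["decision"] != "ALLOW":
        raise ExecutionDenied("decision is not ALLOW")
    # (2) request_hash must match the recomputed hash
    if ticket["request_hash"] != request_hash(canonical_request):
        raise ExecutionDenied("request_hash mismatch")
    # (3) policy_ref must match the decision record
    if ticket["policy_ref"] != record_policy_ref:
        raise ExecutionDenied("policy_ref mismatch")
    # (4) ticket must not be revoked or expired
    if ticket["revoked"]:
        raise ExecutionDenied("ticket revoked")
    if now >= datetime.fromisoformat(ticket["expires_at"]):
        raise ExecutionDenied("ticket expired")


# Hypothetical request and ticket, for illustration only.
REQUEST = {"capability": "repo.release", "repo": "org/app", "tag": "v1.2.0"}
TICKET = {
    "decision": "ALLOW",
    "request_hash": request_hash(REQUEST),
    "policy_ref": "v1",
    "revoked": False,
    "expires_at": "2026-01-02T00:00:00+00:00",
}
NOW = datetime(2026, 1, 1, 12, 0, tzinfo=timezone.utc)
```

`validate_ticket` returns nothing on success and raises on the first failed check, so a caller cannot proceed to side effects by ignoring a return value.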

5.4 Failure Semantics: Better to Do Nothing Than to Do Too Much

The Executor’s failure philosophy must be:

“Refusing to execute is a successful failure; unauthorized execution is a catastrophe.”

Therefore:

Validation failure must:
- produce no side effects
- raise a clear error
- optionally append an execution-denied event (append-only)
The Executor must never attempt self-recovery (e.g. “retry automatically with different parameters”)

5.5 Temporal Discipline Between Execution and Recording

The Executor must never be the first component to write history.

Correct order:

Policy Gate adjudication
DecisionRecord append (institutional fact)
DecisionTicket issuance
Executor ticket validation
Side effects occur
Execution Event recorded (append-only)

Forbidden order:

Execute first, write records later
Execute and record nothing on failure
Treat execution logs as “historical facts”

5.6 The Executor Must Be the “Dumbest” Component

This is counter-intuitive, but absolutely essential:

The dumber the Executor, the safer the system.

The Executor must not:

understand business semantics
dynamically construct commands
decide execution paths
“guess” parameters from the environment

It should only:

validate tickets
invoke explicitly declared side-effect plugins
record results

5.7 A Constitution-Ready Summary Sentence

The Executor is not a “program that executes,” but a port authorized by institutions.
All of its power comes from a ticket; without a ticket, it can do nothing.

5.8 Minimal Engineering Principles for Immediate Implementation (v0)

If you start coding right now, remember just these three rules:

Function signatures accept only ticket or decision_id
All validation happens before any side effect
Better to reject 100 legitimate executions than to allow 1 unauthorized execution

LLM Boundary (LLM 边界)

1) LLM Boundary (Model-to-Executable Firewall)

This section defines RRB’s civilizational firewall.
It is not about limiting model capability, but about fully isolating uncertainty from historical sovereignty.
In one sentence: LLMs may participate in thinking, but must never directly touch history or execution.

1.1 The Core Purpose of the Firewall: Confining “Uncertainty” to a Controllable Zone

The essential characteristics of LLMs are not that they are “not smart,” but that they are:

Non-deterministic (the same input does not guarantee the same output)
Not fully replayable (model versions, context, and vendor state drift over time)
Opaque internal state (not auditable)

These properties are assets in the cognitive / proposal phase, but once they enter institutional judgment, historical writing, or execution authorization, they become systemic risks.

Therefore, RRB must establish a hard firewall:

LLM outputs may influence how humans think, but must never directly influence how the system acts.

1.2 The Only Legitimate Habitat for LLMs: The Proposal Zone

The role of the Proposal Zone:

It is a cognitive augmentation zone, not an institutional zone.

Here, LLMs may—and are well suited to—do the following:

Draft candidate CapabilityRequests
Summarize diffs and change impact
Generate risk checklists
Propose rollback strategies
Explain to humans “what might happen if this is released”

But the legal status of all their outputs is exactly one thing:

Proposal

—not:

Judgment
Fact
Authorization
Instruction

1.3 Explicitly Forbidden Zones (Sovereign Red Lines)

LLM outputs are always, unconditionally, and without exception forbidden from entering the following areas:

🚫 Policy (Institutional Judgment)

The Policy Gate must be a fully deterministic function
Same canonical input + same policy_version → same decision
Once LLMs participate in policy:
- replay fails immediately
- decisions become irreproducible
- institutions lose credibility

🚫 Scheduler (Temporal and Execution Scheduling)

The Scheduler determines when something may happen
This is temporal control over real-world side effects
Any LLM-based scheduling introduces:
- implicit state
- inexplicable delays or accelerations
- causality errors that are hard to reconstruct

🚫 Ledger (Historical Fact Writing)

The Ledger is institutional memory
It is not a log, but “the set of responsibilities the system has taken on over time”
LLMs must not:
- write
- modify
- synthesize
- summarize and then write (summaries are also forbidden)

🚫 Canonical Memory (Replay Source of Truth)

Canonical Memory is the input anchor for replay
It must satisfy:
same input → same output
LLM outputs inherently fail this property, and therefore may only serve as “reference,” never as “truth”

🚫 Execution Gate (Execution Authorization)

Execution authorization is the key to changing the world
If LLM outputs can directly affect the execution gate, it implies:
- agents autonomously advancing history
- no human responsibility anchor
- systems that cannot be governed long-term

1.4 Mandatory Metadata: `origin` and `replayable`

To turn “boundaries” from philosophical agreements into engineering facts, RRB requires that all data objects entering the system explicitly declare their origin and replayability.

`origin`

Answers the question: “Who produced this?”

Recommended enum values:

human
institution
model
system

Rules:

Objects with origin=model are automatically downgraded to Proposals
Objects with origin=model must not be read by policy / ledger / executor

`replayable`

Answers the question: “Can this be recomputed ten years later?”

Recommended semantics:

replayable=true → may serve as replay source of truth (canonical inputs / policy / records)
replayable=false → may only serve as explanation, annotation, reference, or proposal

Critical red lines:

All objects entering DecisionRecord / Canonical Memory / Ticket validation chains must be replayable=true
LLM outputs default to replayable=false, unless explicitly extracted and reconstructed into structured fields by a deterministic compiler
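
A sketch of how the two tags could be enforced in code. The `Tagged` wrapper, the `admit_to_sovereign_zone` guard, and the choice to attribute compiled output to `origin=system` are assumptions for illustration; the paper fixes only the enum values and the rules.

```python
from dataclasses import dataclass
from enum import Enum


class Origin(str, Enum):
    HUMAN = "human"
    INSTITUTION = "institution"
    MODEL = "model"
    SYSTEM = "system"


@dataclass(frozen=True)
class Tagged:
    payload: dict
    origin: Origin
    replayable: bool

    def __post_init__(self):
        # Model output can never be constructed as replayable.
        if self.origin is Origin.MODEL and self.replayable:
            raise ValueError("origin=model objects must be replayable=false")


class FirewallViolation(Exception):
    pass


def admit_to_sovereign_zone(obj: Tagged) -> dict:
    # Guard placed in front of policy / ledger / executor reads.
    if obj.origin is Origin.MODEL:
        raise FirewallViolation("origin=model is proposal-only")
    if not obj.replayable:
        raise FirewallViolation("non-replayable object cannot be a source of truth")
    return obj.payload


def compile_proposal(proposal: Tagged, allowed_fields: tuple) -> Tagged:
    # Deterministic extraction: keep only whitelisted structured fields.
    # Assumption: the result is attributed to the system compiler, not the model.
    fields = {k: proposal.payload[k] for k in allowed_fields if k in proposal.payload}
    return Tagged(payload=fields, origin=Origin.SYSTEM, replayable=True)
```

Free-text fields from the model (such as a rationale) are dropped by `compile_proposal` unless they are explicitly whitelisted, which keeps explanation material out of canonical inputs.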

1.5 The Firewall Is Not Distrust of Models, but Distrust of Historical Contamination

This point is often misunderstood and must be made explicit:

The Model-to-Executable Firewall is not about “LLMs being untrustworthy,” but about history needing to be more trustworthy than models.

You allow models to:

help you think
help you explain
help you propose

But you do not allow models to:

bear responsibility on your behalf
write history for you
decide execution for you

1.6 A Constitution-Ready Definition Sentence

LLMs are cognitive collaborators, not institutional subjects.
They may participate in forming judgments, but must never become the judgments themselves.

1.7 Minimal Engineering Checklist (v0)

If you start coding now, enforcing just these four points is enough to put the firewall in place:

All LLM outputs must carry origin=model
All origin=model objects default to replayable=false
Policy / Ledger / Executor layers must not read origin=model data
Only deterministic compilers may transform proposals into canonical inputs

2) Intent Is Not an Event (Intent ≠ Event)

This section defines the most important semantic assertion in RRB.
If you were allowed to lock in only one sentence across the entire system, it would be this:
“Events describe the world; Intent changes the world.”

2.1 The Ontological Difference Between Event and Intent

(They Are Not the Same Kind of Thing)

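Event and Intent belong to two fundamentally different semantic categories: an Event describes something that has already happened in the world, while an Intent is an accountable subject's declaration that the system should advance to a new state.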

Core conclusion:

Events may be recorded, referenced, and evaluated; but only Intent is qualified to move the system into a new historical state.

2.2 Why Events Must Not Advance History

If a system allows Events to advance history, three irreparable problems immediately emerge:

(1) Responsibility Evaporation

When merges, CI green signals, or cron schedules directly trigger releases:

The system can explain what happened
But it cannot explain who decided it should happen

Ten years later, during an audit, all you see is a chain like:

test passed → release happened

But you cannot answer:

Who, at the time, took responsibility for judging that releasing was appropriate?

(2) Unconscious History Advancement

Event-driven systems have a dangerous property:

As long as conditions are met, history moves forward by itself.

This means:

No human needs to be present
No explicit declaration is required
No responsibility anchor exists

This is exactly the state you have repeatedly emphasized must be avoided:

the system unconsciously advancing its own history.

(3) Replay and Audit Collapse

Events often depend on:

current time
current environment
current network state

All of these are implicit states that cannot be replayed.

Once Events become triggers:

replay can only describe what happened back then
but cannot recompute why it happened that way

This destroys the foundation of the entire replay / regression system.

2.3 The Necessary Components of Intent: Declaring a Responsible Subject

A legitimate Intent must answer at least four questions:

Who is requesting the system to advance history? (actor)
What is being requested? (scope / capability)
Why now? (reason)
Who bears the consequences if things go wrong? (risk acceptance / rollback)

Therefore, Intent can only originate from:

human
institution

Because only these subjects:

can be named
can be held accountable
can explain their judgment in the future

2.4 The Role of Agents: Proposal Only, Never Intent

This red line is easily challenged by technical intuition—but it must be fixed firmly.

What agents may do:

observe events
analyze risk
generate suggestions
draft proposals
alert humans that “this may be a release window”

What agents must never do:

declare Intent
trigger historical advancement
assume institutional responsibility on behalf of humans

The reason is not that agents are “not smart enough,” but that:

Agents have no institutional identity.

They cannot be held accountable ten years later:

“Why did you think releasing was appropriate at the time?”
“What risk assessment did you base that on?”
“If the consequences were severe, who would take responsibility?”

2.5 A Critical but Subtle Point

What your system is actually doing is a civilizational-level engineering move:

It completely separates “what happened” from “what should happen.”

Most automation systems conflate these two:

once conditions are satisfied,
they automatically assume the system should advance state

RRB’s strict Intent/Event separation is a deliberate engineering refusal of that shortcut.

2.6 Hard Engineering Constraints (Not Verbal Agreements)

To ensure that Intent ≠ Event is more than rhetoric, the following hard constraints must exist:

All trigger APIs may accept only Intent schemas
Event streams (CI, GitHub webhooks, cron):
- may write evidence only
- must never call the Capability Bus
The Capability Bus must reject:
- actor.kind == agent
- origin == model
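
These constraints can be sketched as two separate code paths that never meet. The request layout (`actor.kind`, `origin`, `intent_ref`) and the in-memory stores are assumptions for illustration.

```python
EVIDENCE: list = []     # stand-in for an evidence store
ADJUDICATED: list = []  # requests accepted by the bus


class BusRejected(Exception):
    pass


def on_event(event: dict) -> None:
    # CI results, webhooks, and cron ticks end here: evidence only.
    # This handler never calls the bus, so it cannot trigger adjudication.
    EVIDENCE.append({"kind": event["kind"], "data": event["data"]})


def capability_bus_submit(request: dict) -> None:
    # Hypothetical field layout: request["actor"]["kind"], request["origin"].
    if request.get("intent_ref") is None:
        raise BusRejected("no intent -> no adjudication")
    if request["actor"]["kind"] == "agent":
        raise BusRejected("actor.kind == agent")
    if request["origin"] == "model":
        raise BusRejected("origin == model")
    ADJUDICATED.append(request)
```

Evidence recorded by `on_event` can later be referenced from a request as `evidence_refs`, but only a request that carries an intent reference from a human or institution gets past the bus.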

2.7 A Constitution-Ready Summary Sentence

Events tell us what happened in the world; Intent decides whether we allow the world to change.
RRB recognizes only the latter.

“All Green” Is Not a Precondition for Release (“全绿”不是 release 的前提)

“All Green” Is Not a Precondition for Release

This section dismantles the most dangerous sleight of hand in engineering intuition:
“If everything is green, we should release.”
In the RRB system, this statement is semantically wrong, institutionally dangerous, and unacceptable for long-lived systems.

3.1 A Fundamental Distinction: Can It Run ≠ Should It Happen

“All green” answers the question:

Can this system run, technically?

A release, however, answers a very different question:

Under the current institutional rules, risk profile, and responsibility commitments, should this world state be advanced?

These belong to entirely different layers of judgment.

Conclusion:

“All green” can be at most a necessary condition for release—but it can never be a sufficient condition.

3.2 Release Is an Institutional Judgment, Not an Engineering Outcome

In RRB, the output of a release is not:

“Version X has been published”

but rather one of:

ALLOW — the institution permits entry into the execution-authorization phase
DENY — the institution explicitly refuses this historical advancement
REQUIRE_OVERRIDE — higher authority must assume risk to proceed

This is a constitutional-style judgment, not the natural output of a build system.

Institutional judgments may consider, but are not limited to:

Whether the responsible subject is clearly identified
Whether the scope of change aligns with current policy
Whether risk lies within acceptable bounds
Whether the system is in a freeze or sensitive window
Whether unresolved institutional risks remain

Many of these factors are:

entirely unrelated to CI being green
yet crucial to whether the decision can be explained ten years later

3.3 The Correct Place for “All Green”: The Execution Permission Layer

In the RRB architecture, “all green” has a place—but it is not a trigger.

The correct hierarchy is:

Intent (responsibility declaration)
   ↓
Policy Gate (institutional judgment)
   ↓
DecisionRecord (historical fact)
   ↓
Execution Permission (side-effect authorization)
   ├─ CI green?
   ├─ environment ready?
   └─ ticket valid?
   ↓
Execution

That is:

The Policy Gate decides whether execution should be allowed
The Execution Permission layer decides whether side effects can be safely executed now

“All green” belongs to the latter.

3.4 Why Misplacing “All Green” Is Fatal

(1) “All Green” Is a Momentary State, Not an Institutional Fact

CI green only says “no errors detected at this moment”
It does not guarantee:
- sound risk assessment
- appropriate timing
- informed and acknowledged responsibility

Treating it as a release gate effectively asserts:

“As long as nothing fails right now, the world may be permanently changed.”

(2) “All Green” Inherently Depends on Implicit External State

CI results depend on:

runner environments
timing
network conditions
third-party service availability

None of these can be replayed reliably.

Once CI green becomes a gate:

replay can only recount that it was green back then
it cannot recompute whether green should have meant release

(3) It Quietly Transfers Responsibility from Humans to the System

This is the most dangerous effect.

Once “all green = release” becomes habitual, team psychology subtly shifts to:

“The system released it—not me.”

And that is precisely what the entire RRB architecture is designed to structurally prevent.

3.5 The Proper Role of Regression: Evidence, Not Adjudication

In RRB, the institutional status of regression, replay, and test results is:

Evidence

They may:

support or weaken a release Intent
serve as one of the policy input variables
trigger a REQUIRE_OVERRIDE (e.g., inconsistent regression)

But they must never:

trigger a release on their own
directly produce a DecisionTicket
bypass the Policy Gate
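
As an illustration, a hypothetical Policy Gate might consume regression results purely as evidence fields. The three `Decision` values come from section 3.2; the request field names (`responsible_actor`, `in_freeze_window`, `evidence.regression_consistent`) and the order of the rules are assumptions made for this sketch, not RRB's actual policy.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    REQUIRE_OVERRIDE = "REQUIRE_OVERRIDE"


def policy_gate(request: dict) -> Decision:
    """Evidence can weaken an Intent; green evidence alone never yields ALLOW."""
    if not request.get("responsible_actor"):
        return Decision.DENY  # no responsible subject, no historical advancement
    if request.get("in_freeze_window", False):
        return Decision.DENY
    evidence = request.get("evidence", {})
    if not evidence.get("regression_consistent", False):
        return Decision.REQUIRE_OVERRIDE  # inconsistent regression escalates
    return Decision.ALLOW
```

Note that a request carrying perfectly green evidence but no responsible actor is still denied: evidence is one input variable among several.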

3.6 Hard Engineering Constraints (Disabling the Intuition)

To prevent the “all green means release” reflex from creeping back in, the system must hard-code the following:

The Capability Bus rejects any:
- trigger == ci_green
- actor.kind == system
The Policy Gate input schema:
- contains no shortcut like “ci_green == true → auto allow”
During ticket validation, the Executor:
- treats CI status as an execution precondition
- not as authorization
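
The Executor rule can be sketched as follows. The `DecisionTicket` shape and the `execution_permitted` function are hypothetical; the only point the sketch makes is that authorization comes from the ticket, while CI status and environment readiness are checked afterwards as preconditions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DecisionTicket:
    decision_id: str
    decision: str  # assumption: "ALLOW" is the only value that authorizes side effects


def execution_permitted(ticket: Optional[DecisionTicket],
                        ci_green: bool, env_ready: bool) -> bool:
    """Authorization comes from the ticket; CI and environment are only preconditions."""
    authorized = ticket is not None and ticket.decision == "ALLOW"
    if not authorized:
        return False  # no amount of green can substitute for a ticket
    return ci_green and env_ready
```

With this shape there is no code path in which `ci_green=True` turns a missing or non-ALLOW ticket into an execution.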

3.7 A Constitution-Ready Summary Sentence

“All green” tells us a system can run—but only institutional judgment tells us a system should run.
RRB cares not about “can it,” but about “should it.”

“全绿”不是 release 的前提

这一节的作用，是把工程直觉里最危险的一个偷换彻底拆掉：
“只要都绿了，就该发布。”
在 RRB 体系里，这句话在语义上是错误的，在制度上是危险的，在长期系统中是不可接受的。

3.1 一个根本性的区分：能不能跑 ≠ 应不应该发生

“全绿”回答的问题是：

这个系统在技术上能不能运行？

而 release 回答的问题是：

在当前制度、风险、责任承担条件下，这个世界状态应不应该被推进？

这两者属于完全不同的判断层级。

结论：

全绿最多是 release 的必要条件之一，但绝不可能是充分条件。

3.2 Release 是制度判断，不是工程结论

在 RRB 中，release 的输出不是：

“版本 X 已发布”

而是：

ALLOW：制度允许进入执行许可流程
DENY：制度明确拒绝这次历史推进
REQUIRE_OVERRIDE：制度要求更高权限承担风险

这是一个宪法式判断，而不是 build system 的自然结果。

制度判断考虑的因素包括但不限于：

责任主体是否明确
变更范围是否符合当前政策
风险是否在可接受区间
是否处在冻结期/敏感窗口
是否存在尚未缓解的制度性风险

其中很多因素：

与 CI 是否全绿 完全无关
却与“十年后能否解释这次决定”高度相关

3.3 全绿的正确位置：Execution Permission（副作用许可）层

在 RRB 架构中，“全绿”有位置，但不是触发器。

正确的层级关系是：

Intent (责任声明)
   ↓
Policy Gate (制度判断)
   ↓
DecisionRecord (历史事实)
   ↓
Execution Permission (副作用许可)
   ├─ CI green?
   ├─ env ready?
   └─ ticket valid?
   ↓
Execution

也就是说：

Policy Gate 决定：应不应该允许执行
Execution Permission 决定：现在能不能安全执行副作用

全绿属于后者。

3.4 为什么把“全绿”放错层级是致命的

(1) 全绿是瞬时状态，不是制度事实

CI green 只说明“这一刻没发现错误”
它不保证：
- 风险评估合理
- 发布时机合适
- 责任主体知情并认可

如果把它当成 release gate，本质是在说：

“只要系统这一刻没报错，就可以永久改变世界状态。”

(2) 全绿天然依赖隐式外部状态

CI 结果依赖：

runner 环境
时间
网络
第三方服务状态

这些都不可稳定重放。

一旦 CI green 成为 gate：

replay 只能“复述当年绿过”
不能“重算当年是否应当绿了就发布”

(3) 它会悄悄把责任从人类挪给系统

这是最危险的一点。

当“全绿 = release”成为习惯后，团队心智会悄然转变为：

“系统发布了，不是我发布的。”

而这正是你整个 RRB 架构要结构性防止的事情。

3.5 Regression 的正确角色：证据，不是裁决

Regression / replay / test 结果，在 RRB 中的制度地位是：

Evidence（证据输入）

它们可以：

支持或削弱某个 release Intent
作为 policy 的输入变量之一
触发 REQUIRE_OVERRIDE（例如回归不一致）

但它们永远不能：

单独触发 release
直接产出 DecisionTicket
绕过 Policy Gate

3.6 工程层面的强制约束（让直觉失效）

为了防止“全绿即发布”的直觉回潮，系统必须硬编码以下规则：

Capability Bus 拒绝任何：
- trigger == ci_green
- actor.kind == system
Policy Gate 的输入 schema：
- 不包含“ci_green == true → auto allow”这样的捷径
Executor 在校验 ticket 时：
- 把 CI 状态当成 execution precondition
- 而不是 authorization

3.7 一个可以写进宪法的总结句

全绿说明系统能运行，但只有制度判断才能说明系统应该运行。
RRB 关心的不是“能不能”，而是“应不应该”。

DecisionRecord Must Be an Institutional Fact (Not a Log), DecisionRecord 必须是制度事实（不是日志）

DecisionRecord Must Be an Institutional Fact (Not a Log)

This section establishes the constitutional view of history for RRB.
If a DecisionRecord is treated as a log, the system is merely “keeping notes.”
Only when it is treated as an institutional fact is the system truly taking responsibility over time.

6.1 The Ontological Difference Between Logs and DecisionRecords

(They Are Fundamentally Different Things)

Many systems fail not because they lack features, but because they mistake logs for history.

Conclusion:

Logs answer “what the system was doing at the time.”

DecisionRecords answer “why the system chose to do (or not do) that at the time.”

6.2 Append-Only: History Can Only Be Added, Never Modified

Append-only is not a storage choice—it is an institutional choice.

Each DecisionRecord represents:

“At a specific moment, under a specific policy version, the system made an explicit judgment on a specific request.”

Once written, it must:

not be modified
not be overwritten
not be deleted
not be “corrected and replaced”

If the original judgment was wrong, the correct institutional action is:

append a new DecisionRecord or OverrideRecord
explicitly state:
- what the original judgment was
- why it is being overridden or corrected
- what additional risks the new judgment assumes

This is the mark of institutional maturity:

errors are preserved, not erased.

6.3 Overwrite Prohibition: Otherwise Replay and Accountability Collapse

The moment overwriting DecisionRecords is allowed, two things happen:

Replay breaks
- You can no longer recompute why something was allowed or denied at the time
- Because the original inputs and policy context have been polluted
Responsibility evaporates
- No one can be held accountable for the original decision
- Because that decision no longer “exists”

This immediately drags RRB back to the level of traditional CI/CD:

outcomes without a responsibility chain.

6.4 `policy_version` and `request_hash` Are Mandatory

These are the minimum sufficient conditions for a DecisionRecord to qualify as an institutional fact.

`request_hash`

Binds: what was adjudicated
Prevents:
- substitution of the adjudicated target
- post-hoc reinterpretation of a decision onto a different request
Serves as one of the replay input anchors

`policy_version` (+ hash)

Binds: what institutional rules were applied
Explicitly answers:

“Under which version of policy was this judgment made?”

Without it, you can only say “the system allowed it back then,” but not which rules justified that decision

Key conclusion:

A record without request_hash and policy_version is not a DecisionRecord—it is just a log.
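
A minimal sketch of such a record, assuming SHA-256 over a sorted-key JSON encoding as the canonical hash. The hashing scheme, the `canonical_hash` helper, and the exact field set are assumptions for illustration; the text only fixes that `request_hash` and `policy_version` (+ hash) are mandatory.

```python
import hashlib
import json
from dataclasses import dataclass


def canonical_hash(obj: dict) -> str:
    """SHA-256 over a canonical JSON encoding (sorted keys, fixed separators)."""
    encoded = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class DecisionRecord:
    decision_id: str
    request_hash: str    # binds what was adjudicated
    policy_version: str  # binds which rules were applied
    policy_hash: str
    decision: str        # "ALLOW" | "DENY" | "REQUIRE_OVERRIDE"

    def __post_init__(self) -> None:
        # A record missing either anchor is only a log line, so refuse to build it.
        if not self.request_hash or not self.policy_version:
            raise ValueError("request_hash and policy_version are mandatory")
```

Because keys are sorted before hashing, two requests that differ only in key order map to the same `request_hash`.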

6.5 Write the Record First, Then Execute Side Effects

(Order Is Constitutional)

This is the single most non-negotiable temporal discipline in the entire system:

Record-before-side-effects

Correct order:

Policy Gate produces a judgment
DecisionRecord is appended (institutional fact is established)
(Optional) DecisionTicket is issued
Executor validates the ticket
Side effects occur (tag / release / deploy)

Why this order is constitutional:

Even if execution fails:
- you still know what the system intended to do at the time
- failure does not “evaporate”; it becomes analyzable history
If execution succeeds but the record was never written:
- you create an untraceable historical mutation

In one sentence:

The world may fail, but history must never be blank.
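
The ordering can be sketched in a few lines. The function name `decide_then_execute` and the dict-shaped record are hypothetical, and ticket issuance and validation are omitted for brevity; what the sketch shows is that the append completes (including a flush to disk) before any side effect is attempted.

```python
import json
import os
from pathlib import Path
from typing import Callable


def decide_then_execute(ledger: Path, record: dict,
                        side_effect: Callable[[], None]) -> None:
    # 1. Establish the institutional fact before anything touches the world.
    with ledger.open("a", encoding="utf-8") as f:
        f.write(json.dumps(record, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())
    # 2. Only an ALLOW judgment may lead to side effects.
    if record["decision"] == "ALLOW":
        side_effect()  # may raise; the record is already part of history
```

If `side_effect` raises, the ledger still contains the line describing what the system intended to do, which is exactly the property this section requires.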

6.6 OverrideRecord: Overrides Do Not Modify History, They Append Higher-Level Facts

Overrides are expensive and dangerous institutional actions, and therefore must be expressed with extreme rigor.

Mandatory fields (none may be omitted):

by: who performed the override (human / institution)
reason: why the original institutional judgment no longer applies
timestamp: when the override occurred
risk_acceptance: explicit statement of:
- which known risks are being accepted
- why those risks are acceptable now
- how the institution will later review or correct the policy

Critical red lines:

Overrides must not modify the original DecisionRecord
An override may only:
- reference the original decision_id
- append a new OverrideRecord
- and (typically) issue a new ALLOW ticket

This guarantees that:

the original institutional judgment still exists
override behavior is fully visible
institutional evolution has an evidence chain
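
One way to make "none may be omitted" mechanical is to refuse construction of an incomplete record. The field names `by`, `reason`, `timestamp`, `risk_acceptance`, and `decision_id` come from the text; the `RiskAcceptance` sub-structure and its attribute names are assumptions that mirror the three bullet points above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskAcceptance:
    accepted_risks: tuple  # which known risks are being accepted
    justification: str     # why they are acceptable now
    review_plan: str       # how the policy will later be reviewed or corrected


@dataclass(frozen=True)
class OverrideRecord:
    decision_id: str       # references the original decision; never edits it
    by: str                # human or institution performing the override
    reason: str
    timestamp: str         # supplied by the caller, e.g. an ISO 8601 string
    risk_acceptance: RiskAcceptance

    def __post_init__(self) -> None:
        ra = self.risk_acceptance
        required = [self.decision_id, self.by, self.reason, self.timestamp,
                    ra.accepted_risks, ra.justification, ra.review_plan]
        if not all(required):
            raise ValueError("every OverrideRecord field is mandatory")
```

The record is frozen and carries only a reference to the original `decision_id`, so creating it cannot alter the original DecisionRecord.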

6.7 Why DENY and OVERRIDE Are Assets

In many systems:

only successes are recorded
failures are treated as noise

RRB takes the opposite approach:

DENY is a historical record of explicit institutional refusal
OVERRIDE exposes points where the institution was under pressure

They are:

raw material for policy improvement
real data about risk patterns
signals of system maturity

A system with no DENY or OVERRIDE records

is not a perfect system—it is a system afraid to record its mistakes.

6.8 A Constitution-Ready Summary Sentence

A DecisionRecord is not a record of “what happened,” but an institutional fact of why the system judged the way it did at the time.
History may only be appended, never rewritten—otherwise the system loses responsibility over time.

DecisionRecord 必须是制度事实（不是日志）

这一节是在给 RRB 的“历史观”立宪。
如果 DecisionRecord 被当成日志（log），系统只是在“记事”；
只有当它被当成制度事实（institutional fact），系统才是在“对时间承担责任”。

6.1 Log 与 DecisionRecord 的本体差异（这是两种完全不同的东西）

很多系统失败，不是因为功能不够，而是因为把日志当成历史。

结论：

日志回答的是 “系统当时在做什么”；

DecisionRecord 回答的是 “系统当时为什么这样做（或不做）”。

6.2 append-only：历史只能被追加，不能被修改

append-only 不是存储选择，而是制度选择。

每一条 DecisionRecord 都代表：

“在某一时刻，系统在某一制度版本下，对某一请求做出了明确判断。”

一旦写入：

不得修改
不得覆盖
不得删除
不得“修正后替换”

如果当初判断错了，正确的制度动作是：

追加一条新的 DecisionRecord 或 OverrideRecord
明确说明：
- 原判断是什么
- 为什么要越权或纠正
- 新判断承担了什么额外风险

这正是制度成熟的标志：

错误被保留下来，而不是被抹去。

6.3 不可覆盖：否则 replay 与问责同时失效

一旦允许覆盖 DecisionRecord，就会发生两件事：

Replay 失效
- 你无法重算“当时为什么允许/拒绝”
- 因为“当时的输入与制度”已被污染
责任蒸发
- 没有人能为当初的决定负责
- 因为当初的决定已经“不存在”了

这会直接把 RRB 拉回到传统 CI/CD 的水平：

只有结果，没有责任链。

6.4 必须包含 `policy_version` / `request_hash`

这是 DecisionRecord 成为“制度事实”的最小充分条件。

`request_hash`

绑定：裁决对象是什么
防止：
- 偷换裁决目标
- “事后解释”把判断套到别的请求上
是 replay 的输入锚点之一

`policy_version`（+ hash）

绑定：裁决依据是什么
明确回答：

“这是在哪一版制度下做出的判断？”

没有它，你只能说“系统当年允许了”，却说不清楚依据的规则是什么

关键结论：

没有 request_hash 和 policy_version 的记录，不是 DecisionRecord，只是日志。

6.5 先写 record，再执行 side-effects（顺序即宪法）

这是整个系统里最不可妥协的一条时序纪律：

Record-before-side-effects

正确顺序：

Policy Gate 得出判断
DecisionRecord append（制度事实成立）
（可选）DecisionTicket 签发
Executor 校验票据
副作用发生（tag / release / deploy）

为什么这条顺序是宪法级的：

即使执行失败：
- 你仍然知道系统当年打算做什么
- 失败不是“蒸发”，而是可分析的历史
如果执行成功但 record 没写：
- 你制造了不可追溯的历史突变

一句话：

世界可以失败，但历史不能空白。

6.6 OverrideRecord：越权不是修改历史，而是追加更高层级事实

Override 是 昂贵且危险的制度动作，因此它的表达必须极其严谨。

硬约束字段（缺一不可）：

by：谁在越权（human / institution）
reason：为什么原制度判断不再适用
timestamp：越权发生的时间
risk_acceptance：明确说明：
- 接受哪些已知风险
- 为什么这些风险现在可以接受
- 后续如何复盘 / 修正制度

关键红线：

override 不能修改原 DecisionRecord
override 只能：
- 指向原 decision_id
- 追加一条新的 OverrideRecord
- 并（通常）签发一张新的 ALLOW 票据

这保证了：

原制度判断仍然存在
越权行为清晰可见
制度演化有证据链

6.7 为什么“DENY / OVERRIDE 也是资产”

在很多系统里：

成功才被记录
失败被当作噪声

RRB 反其道而行：

DENY 是制度明确拒绝的历史
OVERRIDE 是制度承压点的暴露

它们是：

policy 改进的原材料
风险模式的真实数据
系统成熟度的信号

一个没有 DENY / OVERRIDE 的系统，

不是完美系统，而是不敢记录错误的系统。

6.8 一个可以直接写进宪法的总结句

DecisionRecord 不是系统“发生过什么”的记录，而是系统“当时为什么这样判断”的制度事实。
历史只能被追加，不能被重写；否则，系统将失去对时间的责任。

Replay (deliberately left untranslated; translating the term would introduce ambiguity)

Replay Must Be Defined Correctly

Replay is the temporal self-verification mechanism of the entire RRB architecture.
If replay is misunderstood, the system may still appear to be running—but it has already lost ten-year-scale explainability.
Therefore, this section has a single goal: to decisively fix replay as “recomputing judgments,” not “rerunning the system.”

7.1 The Correct Definition of Replay: Recomputing the Original Judgment (Re-decision)

In RRB, replay ≠ re-run.

The strict definition of replay is:

Without introducing any new information,
using the policy version in effect at the time and the canonicalized inputs used at the time,
recompute the judgment that the Policy Gate should have produced,
and compare it for consistency with the historical DecisionRecord.

In other words, replay does not answer:

“How did the system run back then?”

It answers:

“Under the rules and inputs of that time, was this judgment inevitable?”

7.2 Why Replay Is Not Re-running the Workflow (Re-run Is Wrong)

Interpreting replay as “rerunning the workflow” immediately introduces three fatal problems:

(1) Side-Effect Contamination

Workflows include:

git operations
releases
deployments
network calls

These cannot—and must not—be replayed.

If replay required re-executing them, you would effectively be asking:

“Should we release again to prove that we released back then?”

This is logically absurd.

(2) Implicit State Drift

Workflows depend on implicit states such as:

current time
current environment
current external service state
current repo HEAD

These states:

existed back then
no longer exist, or are different today

Once replay depends on them, you can never prove:

“Was the original judgment necessarily correct at the time?”

(3) Replay Becomes Storytelling, Not Verification

Rerunning workflows can only tell you:

roughly what happened

But it cannot answer:

whether it should have happened

And that second question is exactly what RRB cares about.

7.3 The Sole Source of Truth for Replay: DecisionRecord

In RRB, DecisionRecord is the constitutional-level input to replay.

Replay inputs must be strictly limited to:

the DecisionRecord
the canonical inputs referenced at decision time (located via request_hash)
the policy_version + policy_hash used at decision time

Explicitly forbidden:

using current system state
using current policy
using current repo state
using any unrecorded historical context

Otherwise, you are not replaying—you are post-hoc rationalizing.

7.4 The Minimal Replay Input Set (Why It Must Be This Small)

The more inputs replay uses, the larger the explanation surface—and the weaker accountability becomes.

The minimal replay input set maps exactly to the key fields of a DecisionRecord:

request_hash
→ answers: what object was adjudicated at the time?
policy_version (+ hash)
→ answers: which institutional rules were applied at the time?
decision
→ answers: what judgment did the system actually render?

Replay’s task is to verify:

(request_hash, policy_version) ⇒ decision still holds

7.5 Policy Must Be Versioned, or Replay Is Meaningless

This is a necessary condition for replay—not a best practice.

If policy:

has no version
or has a version but mutable content
or cannot be hashed

Then replay collapses into a meaningless statement:

“That’s just how the system judged back then.”

You will never be able to answer:

What were the rules at the time?
Did the rules change later?
Did the difference come from input changes or policy changes?

Minimum requirements for policy versioning:

Every policy change produces:
- a policy_version
- a policy_hash
Every DecisionRecord must record:
- exactly which policy version was used

Without this, replay cannot be defined.
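
A small sketch of what "versioned, immutable, hashable" could mean in practice. The `PolicyRegistry` class and the SHA-256-over-sorted-JSON hashing are assumptions for illustration; the requirement taken from the text is only that each version has a stable hash and that a published version never changes content.

```python
import hashlib
import json


def policy_hash(policy_config: dict) -> str:
    encoded = json.dumps(policy_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()


class PolicyRegistry:
    """A version, once published, can never be republished with different content."""

    def __init__(self) -> None:
        self._by_version: dict = {}

    def publish(self, version: str, policy_config: dict) -> str:
        digest = policy_hash(policy_config)
        existing = self._by_version.get(version)
        if existing is not None and existing[0] != digest:
            raise ValueError(f"policy_version {version!r} already published with different content")
        # Store a deep copy so later mutation of the caller's dict cannot leak in.
        self._by_version[version] = (digest, json.loads(json.dumps(policy_config)))
        return digest

    def get(self, version: str, expected_hash: str) -> dict:
        digest, config = self._by_version[version]
        if digest != expected_hash:
            raise ValueError("policy_hash mismatch")
        return config
```

A DecisionRecord would then store both the version string and the digest returned by `publish`, so replay can detect a silently edited policy.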

7.6 Replay Output Is Not “Pass/Fail,” but a Consistency State

Replay should output:

MATCH
→ recomputation under original inputs and policy yields the same decision
DRIFT
→ recomputation differs, indicating:
- unintended policy modification
- unstable canonicalization
- or an institutional gap in the original system

DRIFT is not an error—it is an institutional signal.

It means:

policy needs review
the canonical pipeline needs inspection
the impact of institutional evolution on historical judgments must be understood
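
Replay as re-decision can be sketched as a pure lookup-and-recompute step. The `replay` function, the dict-shaped record, and the two lookup tables (`canonical_requests` keyed by `request_hash`, `policies` keyed by version and hash) are hypothetical; the sketch only uses inputs that section 7.3 permits.

```python
from typing import Callable, Mapping

Policy = Callable[[dict], str]


def replay(record: dict,
           canonical_requests: Mapping[str, dict],
           policies: Mapping[tuple, Policy]) -> str:
    """Recompute the judgment from recorded inputs only and compare it with history."""
    request = canonical_requests[record["request_hash"]]
    policy = policies[(record["policy_version"], record["policy_hash"])]
    recomputed = policy(request)
    return "MATCH" if recomputed == record["decision"] else "DRIFT"
```

Nothing in the signature gives the function access to the current policy, the current repository, or the clock, which is what keeps it a re-decision rather than a re-run.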

7.7 The True Value of Replay: Making Institutions Accountable to Themselves

In many systems, history can only be narrated, not verified.

RRB’s replay mechanism introduces a rare capability:

The system can prove to itself, in the future, that its past judgments were not accidental—but institutionally necessary.

This means judgments were not:

emotional
lucky
or the result of “how the code happened to be written”

But instead:

under the rules and inputs of the time
any rational system would have reached the same conclusion

7.8 A Constitution-Ready Summary Sentence

Replay is not about rewinding time to rerun the system, but about taking past judgments and proving them again.
If institutions cannot be recomputed, history cannot be explained.

Replay 的定义必须正确

Replay 是整个 RRB 架构的“时间自证机制”。
如果 replay 被理解错了，系统看起来还在运行，但已经失去十年尺度的可解释性。
所以这一节的目标只有一个：把 replay 从“重跑系统”彻底钉死为“重算判断”。

7.1 Replay 的正确定义：重算当时的判断（Re-decision）

在 RRB 中，replay ≠ re-run。

Replay 的严格定义是：

在不引入任何新信息的前提下，
使用当时的制度版本与当时的规范化输入，
重新计算当时 Policy Gate 应该给出的判断，
并与历史中的 DecisionRecord 做一致性对比。

也就是说，replay 回答的不是：

“系统当年是怎么跑的？”

而是：

“在当年的规则与输入下，这个判断是否必然如此？”

7.2 为什么 replay 不是重跑流程（Re-run 是错的）

把 replay 理解成“重跑流程”，会立刻引入三个致命问题：

(1) 副作用污染

流程里包含：

git 操作
发布
部署
网络调用

这些不可、也不应该被重放。

如果 replay 要求“重跑执行”，你等于在问：

“我们要不要再发布一次，来证明当年发布过？”

这在逻辑上是荒谬的。

(2) 隐式状态漂移

流程依赖的隐式状态包括：

当前时间
当前环境
当前外部服务状态
当前 repo HEAD

这些状态：

当年存在
今天已经不存在或不同

一旦 replay 依赖它们，你就永远无法证明：

“当年的判断是否必然成立”

(3) replay 变成“历史复述”，而非“历史验证”

重跑流程只能告诉你：

当年大概发生了什么

却无法回答：

当年是否应该发生

而后者，才是 RRB 真正关心的。

7.3 Replay 的唯一真源：DecisionRecord

在 RRB 中，DecisionRecord 是 replay 的宪法级输入源。

Replay 的输入必须严格限定为：

DecisionRecord
决策时引用的 canonical inputs（通过 request_hash 定位）
决策时使用的 policy_version + policy_hash

明确禁止：

使用当前系统状态
使用当前 policy
使用当前 repo 状态
使用“当年没被记录的上下文”

否则你不是 replay，而是在事后合理化。

7.4 Replay 的最小输入集（为什么必须如此少）

Replay 的输入越多，解释空间就越大，责任就越模糊。

最小 replay 输入集恰好对应 DecisionRecord 的关键字段：

request_hash
→ 回答：当年裁决的是哪个对象？
policy_version (+ hash)
→ 回答：当年依据的是哪一套制度？
decision
→ 回答：当年系统实际给出了什么判断？

Replay 的任务是验证：

(request_hash, policy_version) ⇒ decision 是否仍然成立

7.5 Policy 必须版本化，否则 replay 没意义

这是 replay 成立的必要条件，不是最佳实践。

如果 policy：

没有版本号
或版本号但内容可变
或无法 hash

那么 replay 就会退化成一句空话：

“系统当年就是这么判断的。”

你永远无法回答：

当年规则是什么？
规则后来有没有变？
判断差异来自输入变化，还是规则变化？

版本化 policy 的最低要求：

每一次 policy 修改都生成：
- policy_version
- policy_hash
DecisionRecord 必须记录：
- 具体使用了哪一版 policy

否则，replay 不可定义。

7.6 Replay 的输出不是“成功/失败”，而是一致性状态

Replay 的输出应当是：

MATCH
→ 使用当年输入与制度，重算得到相同判断
DRIFT
→ 重算结果不同，说明：
- policy 被意外修改
- canonicalization 不稳定
- 或当年存在制度漏洞

DRIFT 不是错误，而是制度信号。

它意味着：

需要 review policy
需要检查 canonical pipeline
需要理解制度演化对历史判断的影响

7.7 Replay 的真正价值：让制度对自己负责

很多系统的历史只能被“讲述”，不能被“验证”。

RRB 的 replay 机制，带来一个非常罕见的能力：

系统可以在未来，向自己证明：当年的判断并非偶然，而是制度必然。

这意味着：

判断不是情绪
不是运气
不是当时“刚好这么写了代码”

而是：

在当时的制度与输入下
任何理性系统都会给出同样结论

7.8 一个可以直接写进宪法的总结句

Replay 不是把时间倒回去重跑系统，而是把判断拿出来重新证明。
如果制度无法被重算，历史就无法被解释。

禁止“隐式外部状态”污染主权判断

Prohibiting “Implicit External State” from Contaminating Sovereign Judgments

This section establishes the physical laws of sovereign judgment.
The one-sentence principle is:
Any state not explicitly written into the inputs must be treated as nonexistent.
Otherwise, a Decision ceases to be an institutional judgment and becomes an irreproducible accident.

8.1 What Is “Implicit External State”

Implicit external state refers to any information that is “conveniently read” in code but not declared as a formal input.

Typical examples include:

Current time (now(), datetime.utcnow())
Environment variables (ENV=prod, REGION=us-east-1)
Network state (HTTP calls, API availability, feature-flag services)
Local machine state (hostname, CPU cores, disk paths)
Runtime context (current branch, HEAD pointer, working directory)

They share one defining characteristic:

They existed at the time, but cannot be guaranteed to exist in the same way in the future.

8.2 Why Implicit External State “Contaminates” Sovereign Judgment

Sovereign judgment (the Policy Gate) must satisfy one core property:

Same inputs + same policy version ⇒ same judgment

The moment the Policy Gate reads implicit external state, this property is broken.

(1) Replay Immediately Fails

You can no longer answer:

What was ENV at the time of judgment?
What did the network call return at the time?
Which time window was active at the time?

As a result:

replay can only restate the outcome
but cannot recompute the judgment logic

(2) Responsibility Is Shifted to “the Environment”

When judgments depend on implicit state, responsibility subtly migrates:

“It wasn’t institutional judgment—it was just how the environment happened to be.”

This is institutionally unacceptable.

(3) Judgments Degenerate into Probabilistic Events

Implicit state is often:

volatile
unstable
unauditable

Once it enters the Policy Gate,

institutional judgment collapses from a deterministic function into probabilistic sampling.

8.3 Three Categories of State Explicitly Forbidden

(Unless Explicitly Included as Inputs)

🚫 Time

Forbidden:

if now() > freeze_window_end:
    allow()

Why:

Time is the archetypal non-recomputable implicit state
If “what time it was” is not written into inputs, it is lost forever

Correct approach:

{
  "request": {...},
  "context": {
    "decision_time": "2025-12-27T21:00:00Z"
  }
}

Time may be an input—but only if explicitly declared and recorded.
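
For contrast with the forbidden snippet, a sketch of the same freeze check written against the recorded `decision_time`. Placing `freeze_window_end` in `policy_config` and the function name `freeze_policy` are assumptions for illustration.

```python
from datetime import datetime


def _parse_utc(ts: str) -> datetime:
    # fromisoformat does not accept a trailing "Z" on older Python versions.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def freeze_policy(canonical_request: dict, policy_config: dict) -> str:
    """Pure function: the only notion of 'now' is the recorded decision_time."""
    decision_time = _parse_utc(canonical_request["context"]["decision_time"])
    freeze_end = _parse_utc(policy_config["freeze_window_end"])
    return "ALLOW" if decision_time > freeze_end else "DENY"
```

Calling this function ten years later with the same request and policy configuration yields the same answer, because no clock is consulted.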

🚫 Environment Variables

Forbidden:

if os.getenv("ENV") == "prod":
    deny()

Why:

Environment variables are often deployment accidents
They are not part of institutional rules, yet they influence judgments

Correct approach:

If “environment” is an institutional factor (e.g., prod freeze):
- include it as a request/context field
- record it in the DecisionRecord

🚫 Network / External Services

Forbidden:

if github_api.is_available():
    allow()

Why:

Network state cannot be guaranteed
API responses cannot be preserved long-term
Replay cannot reproduce them

Correct approach:

Network calls may occur only in:
- the proposal phase (as advisory input)
- or the execution phase (as a precondition for side effects)
They must never occur in the Policy Gate

8.4 A Crucial Distinction: Reading ≠ Referencing

RRB does not require the Policy Gate to be ignorant of the world,

but rather that it:

understands the world only through recorded references, not by “reading the world live.”

Allowed:

The Policy Gate may read:
- fields in the request
- evidence_refs pointing to already-recorded evidence (e.g., a CI result ID)

Not allowed:

making live requests
querying live system state

The difference:

referencing → replayable
reading → not replayable
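
The difference can be made concrete with a small helper that resolves `evidence_refs` against already-recorded evidence. The `evidence_store` mapping, the evidence fields `kind` and `status`, and the function name are hypothetical; the sketch assumes evidence was written by the event stream before the decision and is immutable afterwards.

```python
from typing import Mapping


def ci_evidence_is_green(canonical_request: dict,
                         evidence_store: Mapping[str, dict]) -> bool:
    """Resolve recorded CI evidence by ID; never query a live CI service."""
    statuses = []
    for ref in canonical_request.get("evidence_refs", []):
        evidence = evidence_store[ref]  # KeyError means the evidence was never recorded
        if evidence.get("kind") == "ci_result":
            statuses.append(evidence.get("status"))
    # No recorded CI evidence at all is not the same as green.
    return bool(statuses) and all(s == "green" for s in statuses)
```

The returned boolean is an input variable for the policy, not a decision: the same reference resolves to the same recorded result on every replay.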

8.5 Mandatory Engineering Measures

(Otherwise This Will Be Bypassed)

To ensure “no implicit state” is enforced in practice, not just in principle:

(1) Pure-Function Policy Gates

Policy functions must accept only:
- canonical_request
- policy_config
They must not access:
- filesystem
- environment variables
- network
- system clock

(2) Runtime Isolation (Strongly Recommended)

Run policy evaluation in a dedicated sandbox or container
Explicitly prohibit:
- outbound network access
- environment-variable access
Use tests to ensure:
- policies run correctly in a zero-environment context

(3) Replay Tests as “State-Contamination Detectors”

During replay:
- provide no external state
- if policy crashes or outputs differ → contamination is immediately exposed

Replay itself is the strongest tool for detecting implicit state dependencies.
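
A sketch of such a detector, limited to environment variables for brevity. It uses `unittest.mock.patch.dict` from the standard library to evaluate a policy under two different environments; the helper name `is_env_contaminated` and the policy signature are assumptions. Clock and network dependencies would need analogous checks.

```python
import os
from typing import Callable
from unittest import mock


def is_env_contaminated(policy: Callable[[dict, dict], str],
                        canonical_request: dict, policy_config: dict) -> bool:
    """Evaluate the policy under two environments; any difference is contamination."""
    results = []
    for env in ({}, {"ENV": "prod", "REGION": "us-east-1"}):
        # clear=True removes every other variable for the duration of the block.
        with mock.patch.dict(os.environ, env, clear=True):
            try:
                results.append(policy(canonical_request, policy_config))
            except Exception as exc:  # a crash in one environment also counts
                results.append(f"crash:{type(exc).__name__}")
    return results[0] != results[1]
```

A policy that reads `os.getenv("ENV")` is flagged, while one that reads the same information from the request is not.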

8.6 A Counterintuitive but Critical Judgment

What this section does is essentially declare:

Institutional judgment must live in a closed world.

This is not to simplify the system,

but to ensure the system remains explainable and accountable over time.

8.7 A Constitution-Ready Summary Sentence

The Policy Gate may see only the world explicitly written into its inputs.
Any state not recorded must be treated as nonexistent in institutional judgment.

禁止“隐式外部状态”污染主权判断

这一节是在给主权判断设“物理定律”。
一句话原则是：
凡是没有被显式写进输入的状态，都必须被视为不存在。
否则，Decision 就不再是制度判断，而是一次不可复现的偶然事件。

8.1 什么是“隐式外部状态”（Implicit External State）

隐式外部状态指的是：

在代码中被“顺手读取”、却没有作为正式输入声明的任何信息。

典型例子包括：

当前时间（now(), datetime.utcnow()）
环境变量（ENV=prod, REGION=us-east-1）
网络状态（HTTP 调用、API availability、feature flag 服务）
机器本地状态（hostname、CPU 核数、磁盘路径）
运行上下文（当前 branch、HEAD 指针、cwd）

它们有一个共同特征：

当年存在，但未来无法保证以同样方式存在。

8.2 为什么隐式外部状态会“污染”主权判断

主权判断（Policy Gate）必须满足一个核心性质：

同一组输入 + 同一版制度 ⇒ 同一判断

一旦 Policy Gate 读取了隐式外部状态，这个性质立刻被破坏。

(1) Replay 直接失效

你无法回答：

当年 policy 判断时，ENV 是什么？
当年网络请求返回了什么？
当年时间窗口处于哪个区间？

结果是：

replay 只能“复述判断结果”
却无法“重算判断逻辑”

(2) 决策责任被转移给“环境”

当判断依赖隐式状态时，责任会悄然转移：

“不是制度判断允许的，是当时环境刚好那样。”

这在制度语义上是不可接受的。

(3) 判断变成概率事件

隐式状态往往具有：

波动性
不稳定性
不可审计性

当它们进入 Policy Gate，

制度判断就从确定性函数，退化成概率采样。

8.3 明确禁止读取的三类状态（除非显式纳入输入）

🚫 当前时间（Time）

禁止：

if now() > freeze_window_end:
    allow()

为什么：

时间是最典型的不可重算隐式状态
“当时几点”如果没写进输入，就永远丢失

正确做法：

{
  "request": {...},
  "context": {
    "decision_time": "2025-12-27T21:00:00Z"
  }
}

时间可以作为输入，但必须被显式声明并记录。

🚫 环境变量（Environment）

禁止：

if os.getenv("ENV") == "prod":
    deny()

为什么：

环境变量往往是部署偶然性
它不是制度的一部分，却影响制度判断

正确做法：

如果“环境”是制度因素（例如 prod freeze）：
- 把它作为 request/context 字段
- 并写入 DecisionRecord

🚫 网络状态（Network / External Services）

禁止：

if github_api.is_available():
    allow()

为什么：

网络状态不可保证
API 响应不可长期保存
replay 时无法复现

正确做法：

网络调用只能发生在：
- proposal 阶段（作为建议）
- 或 execution 阶段（作为副作用前置条件）
绝不能发生在 Policy Gate

8.4 一个非常关键的区分：读取 ≠ 引用

RRB 并不是要求 Policy Gate 对世界一无所知，

而是要求它：

只能通过“被记录的引用”了解世界，而不能“现场读取世界”。

允许的方式：

Policy Gate 读取：
- request 中的字段
- evidence_refs 指向的已记录证据（如某次 CI 结果 ID）
不允许：
- 当场发请求
- 当场查系统状态

区别在于：

引用 → 可重放
读取 → 不可重放

8.5 工程层面的强制措施（否则一定会被绕过）

为了确保“禁止隐式状态”不是口号，必须在工程上做到：

(1) Policy Gate 纯函数化

Policy 函数签名只接受：
- canonical_request
- policy_config
不允许访问：
- filesystem
- environment
- network
- system clock

(2) 运行时隔离（强烈建议）

在单独的 sandbox / container 中运行 policy
明确禁止：
- outbound network
- env var access
用测试保证：
- policy 在无环境条件下仍能运行

(3) Replay 测试即“状态污染探测器”

在 replay 时：
- 不提供任何外部状态
- 若 policy 崩溃或输出变化 → 直接暴露污染点

Replay 本身，就是检测隐式状态依赖的最强工具。

8.6 一个反直觉但极其重要的判断

你现在做的事情，本质上是在说：

制度判断必须生活在一个“封闭世界”里。

这不是为了简化系统，

而是为了让系统在时间中保持可解释性与可追责性。

8.7 一个可以直接写进宪法的总结句

Policy Gate 只能看见被明确写进输入的世界。
凡是没有被记录的状态，在制度判断中都必须视为不存在。

文件写入纪律：覆盖 vs 追加

File Write Discipline: Overwrite vs. Append

This section legislates how the system writes time.
Write strategy is not an engineering detail—it is a direct expression of how you understand history, responsibility, and reconstructability.
One-sentence principle: Facts may only be appended; state may be overwritten, but must always be reconstructable from facts.

9.1 Two File Types, Two Views of Time (Must Be Strictly Distinguished)

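In RRB (and across the entire ADK / MCP system), every persistent write must first be classified as one of two types: Ledger files (append-only facts) or State / Snapshot files (overwritable views that must be reconstructable from the ledger).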

Core conclusion:

If you are unsure whether a file should be overwritten—it must be append-only.

9.2 Ledger Files: Append-Only (JSONL)

Definition of a Ledger:

A ledger records the responsibilities and judgments the system has taken on over time,
not the system’s current state.

Typical ledger files include:

runtime_data/release_intents.jsonl
runtime_data/decisions.jsonl
runtime_data/overrides.jsonl
(optional) runtime_data/executions.jsonl

Hard rules:

Write mode: append-only
Format: JSONL (one fact per line, inherently time-ordered)
Forbidden actions:
- rewrite
- truncate
- in-place update
- “fixing old lines”

Why JSONL + append-only are mandatory:

Time is encoded naturally in order
No extra fields are needed to express “before vs. after”
Partial corruption is recoverable
Even if the last line is damaged, the core history remains intact
Minimal write primitive
Append is the least error-prone write operation
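
A minimal sketch of the append primitive and a reader that tolerates a damaged final line. The function names are hypothetical, and the reader assumes that only the trailing line can be torn by a crash.

```python
import json
import os
from pathlib import Path
from typing import Iterator


def append_fact(ledger: Path, fact: dict) -> None:
    """Append exactly one JSON object as one line; the ledger is never opened in 'w' mode."""
    ledger.parent.mkdir(parents=True, exist_ok=True)
    with ledger.open("a", encoding="utf-8") as f:
        f.write(json.dumps(fact, sort_keys=True) + "\n")
        f.flush()
        os.fsync(f.fileno())


def read_facts(ledger: Path) -> Iterator[dict]:
    """Yield facts in write order, stopping at a damaged line."""
    with ledger.open("r", encoding="utf-8") as f:
        for line in f:
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                break  # assumption: only the final line can be torn by a crash
```

Line order is the time order, so the reader needs no extra sequence field to know which fact came first.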

9.3 What If the Ledger Is Written Wrong?

Never “Edit”—Only “Supplement”

This is where many systems fail.

Incorrect practices:

Overwriting an old DecisionRecord
Deleting a DENY that “shouldn’t have happened”
Rewriting override reasons

Correct practice:

Append a new record that explicitly states:
- the original record ID
- why a correction is required
- what the new institutional judgment is
- how risk is newly accepted

This is not verbosity—it is institutional maturity.

9.4 State / Snapshot Files: Overwrite Allowed, but Must Be Reconstructable

The role of state / snapshot files is:

To give the system a fast, usable picture of “what things look like now” for startup, querying, and interaction.

Typical examples include:

runtime_data/memory_store.json
runtime_data/current_state.json
runtime_data/cache/*.json

Allowed actions:

overwrite
rewrite
compact
clean up

Non-negotiable prerequisites:

The state must be fully reconstructable from the ledger
The reconstruction path must be:
- defined
- executable
- testable
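
A sketch of a reconstruction path that is defined, executable, and testable. The view chosen here (latest decision per `request_hash`) and the function name are assumptions for illustration; a real snapshot would fold whatever fields the system needs.

```python
import json
import os
from pathlib import Path


def rebuild_current_state(decisions_ledger: Path, snapshot: Path) -> dict:
    """Fold the ledger into a 'latest decision per request' view and overwrite the snapshot."""
    state: dict = {}
    with decisions_ledger.open("r", encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            state[record["request_hash"]] = record["decision"]  # later facts win
    tmp = snapshot.with_suffix(".tmp")
    tmp.write_text(json.dumps(state, sort_keys=True), encoding="utf-8")
    os.replace(tmp, snapshot)  # overwriting is acceptable for snapshots
    return state
```

A simple test for the prerequisite is to delete the snapshot, run the rebuild, and check that the result equals what the system was serving before.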

9.5 “Snapshot ≠ Fact”: Why `memory_store.json` Should Be a Materialized View

This point is especially critical:

runtime_data/memory_store.json should be treated as a materialized view.

This means:

memory_store.json is not the memory ontology
It is:
- derived from ledgers / events / decisions
- via ETL / compaction
- into a “currently usable view”

Institutional implications:

memory_store.json may be:
- overwritten
- compacted
- rebuilt
But any important fact:
- must not exist only in memory_store
- must first be written to the ledger

Otherwise, you end up in a dangerous state:

The system “knows what it looks like now,”
but does not know how it became that way.

9.6 A Crucial Engineering Invariant (Ask Before Every Write)

Before writing any file, the system—or you—must answer this question:

“If this file were deleted, could I rebuild it using only the ledger?”

If no → this is a Ledger file, overwriting is forbidden
If yes → this is a Snapshot file, overwriting is allowed

This is the simplest and most effective classification test.

9.7 Recommended Directory and Naming Conventions

(Reducing Misuse by Structure)

To lower the chance of mistakes at the structural level:

runtime_data/
  ledger/
    release_intents.jsonl
    decisions.jsonl
    overrides.jsonl
    executions.jsonl
  snapshots/
    memory_store.json
    current_state.json
  canonical/
    requests/
    intents/

Structure is institution:

Let the path names themselves remind developers whether overwriting is allowed.
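
The same idea can be enforced in code rather than left to convention. The following sketch assumes the directory layout above and routes every write through a hypothetical `guarded_open` helper; how `canonical/` should be treated is not specified here, so unclassified paths fall back to the stricter append-only rule.

```python
from pathlib import PurePosixPath

APPEND_ONLY_DIRS = {"ledger"}      # assumption: matches the layout above
OVERWRITE_OK_DIRS = {"snapshots"}


def allowed_modes(path):
    """Return the set of open() modes this path may be written with."""
    parts = PurePosixPath(path).parts
    if any(p in APPEND_ONLY_DIRS for p in parts):
        return {"a"}
    if any(p in OVERWRITE_OK_DIRS for p in parts):
        return {"a", "w"}
    # Unclassified paths default to the stricter rule.
    return {"a"}


def guarded_open(path, mode):
    """Open a file for writing only if the path's class permits the mode."""
    if mode not in allowed_modes(path):
        raise PermissionError(f"mode {mode!r} is not allowed for {path}")
    return open(path, mode, encoding="utf-8")
```

With this in place, an attempt to open `runtime_data/ledger/decisions.jsonl` in `"w"` mode fails before any bytes are touched.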

9.8 A Counterintuitive but Critical Judgment

In long-lived systems:

Overwriting is not dangerous—non-reconstructability is.

You allow overwriting snapshots because you trust the ledger.

You forbid overwriting the ledger because you respect time.

9.9 A Constitution-Ready Summary Sentence

Facts may only be appended, never rewritten; state may be overwritten, but must be reconstructable from facts.
Whoever controls write semantics controls the system’s attitude toward time.

Permissions and Least Privilege

This section maps out who is allowed to do what in the system.
There is only one core principle:
No component should ever be trusted enough to change history on its own.
Capabilities must be split, constrained, and recomposed through institutions.

10.1 The True Meaning of Least Privilege: Distrust Paths, Not People

In the RRB system, “least privilege” is not a moral judgment about developers, bots, or services. It is a path-level constraint principle:

Any path that can directly write history or produce side effects must be intercepted by institutions.

In other words:

We do not assume any module “won’t misbehave”
We only allow it to act along explicitly permitted paths

10.2 RRB Itself Must Be a “Low-Trust Component”

This is counterintuitive but critically important:

RRB is not the system’s god—it is one of the components that must never be fully trusted.

Accordingly, RRB must be designed so that it:

❌ cannot write to the ledger directly
❌ cannot adjudicate policy on its own
❌ cannot bypass the Capability Bus / Policy Gate
❌ cannot execute side effects directly

The only things it may do are:

translate Intent → Request
call the governance kernel through standard interfaces
receive institutional outputs (DecisionRecord / Ticket)

Institutional implication:

Even if RRB is buggy, compromised, or abused, it cannot become a single point of historical failure.

10.3 Power Must Be Split (Separation of Powers)

To make least privilege real, critical capabilities must be horizontally separated:

No single module ever holds the complete chain of:

judgment + historical write + execution

This is the most important safety structure in long-lived systems.

10.4 Why RRB Must Never Write Directly to the Ledger

Allowing RRB to write directly to the ledger would mean:

Allowing an “orchestrator” to declare something a historical fact without institutional adjudication.

This would immediately break:

the meaning of append-only
the credibility of replay
the human responsibility anchor

The only correct path is:

RRB → Capability Bus → Policy Gate → Ledger

—not:

RRB → Ledger
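
The difference between the two paths can be made concrete with a small object-wiring sketch. All class names and the toy policy rule below are assumptions for illustration; Python cannot truly enforce privacy, so a real system would also need the module-level constraints described later in this section.

```python
class Ledger:
    def __init__(self):
        self._lines = []

    def append(self, record):
        self._lines.append(dict(record))

    def lines(self):
        return tuple(self._lines)


class PolicyGate:
    """The only component in this sketch that holds a ledger reference."""
    def __init__(self, ledger):
        self._ledger = ledger

    def adjudicate(self, request):
        # Toy rule: a request needs a declared responsible subject.
        decision = "ALLOW" if request.get("declared_by") else "DENY"
        record = {"request": request, "decision": decision}
        self._ledger.append(record)  # ALLOW and DENY are both recorded
        return record


class CapabilityBus:
    def __init__(self, gate):
        self._gate = gate

    def submit(self, request):
        return self._gate.adjudicate(request)


class RRB:
    """Holds only a bus handle; it has no reference to the ledger."""
    def __init__(self, bus):
        self._bus = bus

    def handle_intent(self, intent):
        request = {"capability": "repo.release",
                   "declared_by": intent.get("declared_by"),
                   "target": intent.get("target")}
        return self._bus.submit(request)
```

Here RRB can only translate an intent and submit it; whether anything becomes a historical fact is decided and written further down the chain.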

10.5 Why All Side-Effect Tools Must Be Centralized in the Execution Layer

Side effects (tag / deploy / publish) are the only actions that truly change the world.

Therefore, they must obey three disciplines:

Centralization
- All side-effect tools exist only in the Executor layer
- They must not be scattered across RRB, Policy, or Scheduler
Ticketization
- Every side effect must be bound to a DecisionTicket
- Executors accept no “raw calls”
Auditability
- Success or failure must leave append-only records

Institutional meaning:

At any time, you can answer:
- Who authorized this side effect?
- Under which institutional rules?
- Was the side effect executed correctly?

10.6 A Key Benefit of “Low Trust”: Local Failure Does Not Destroy History

Under a least-privilege architecture:

RRB crashes → cannot write history
Executor has a bug → cannot execute without a ticket
Policy is wrong → replay exposes it
Snapshot is corrupted → ledger can rebuild it

The system is designed so that:

Local untrustworthiness does not compromise global trustworthiness.

This is the capability you need to build a “ten-year system.”

10.7 Hard Engineering Constraints (Not Verbal Agreements)

To make least privilege non-bypassable, engineering must enforce:

Ledger write functions:
- exposed only in adk_runtime/governance/*
- no write permissions in service layers
Policy Gate:
- exposes no “ALLOW shortcut” APIs
Executor:
- all side-effect functions are private
- the only public method is execute(ticket)
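
The Executor constraint could look roughly like the sketch below. The `DecisionTicket` fields and the placeholder `_tag` side effect are assumptions; only the shape (private side effects, a single public `execute(ticket)`, a record for both success and failure) is taken from the rules above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DecisionTicket:
    ticket_id: str
    decision: str
    action: str
    target: str


class Executor:
    def __init__(self):
        self.executions = []  # stands in for an append-only executions ledger

    def execute(self, ticket):
        """The only public entry point; raw calls without a ticket are refused."""
        if not isinstance(ticket, DecisionTicket):
            raise TypeError("Executor accepts only a DecisionTicket")
        if ticket.decision != "ALLOW":
            raise PermissionError("ticket does not authorize execution")
        handler = {"tag": self._tag}.get(ticket.action)
        if handler is None:
            raise ValueError(f"unsupported action: {ticket.action}")
        try:
            result = handler(ticket.target)
            status = "success"
        except Exception as exc:
            result, status = str(exc), "failure"
        # Success and failure are both recorded.
        self.executions.append({"ticket_id": ticket.ticket_id,
                                "status": status, "result": result})
        return status

    def _tag(self, target):
        # Placeholder side effect; a real executor would create the tag here.
        return f"tagged {target}"
```

A plain dictionary or a DENY ticket never reaches `_tag`, so the only way to produce a side effect is to present a ticket that authorizes it.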

10.8 A Constitution-Ready Summary Sentence

System safety does not come from the intelligence or goodwill of any component, but from the fact that—even if it fails—it cannot change history on its own.
Least privilege is not about limiting capability; it is about protecting time.

Failure Is an Asset (DENY / OVERRIDE Are Also History)

This section overturns one of the deepest illusions in engineering culture:
Only success is worth recording.
In RRB, the opposite is true — rejection and override are the most important signals an institution has.

11.1 DENY Is Not “Nothing Happened,” but “A Rejection Explicitly Occurred”

In traditional pipelines, DENY often means:

the workflow stopped,
nothing was written,
next time we try again.

In RRB:

DENY is a complete, explicit, and accountable institutional act.

It means:

a responsible subject requested to advance history,
under the rules in effect at that time,
the institution explicitly judged that it should not happen.

This is not a blank outcome — it is the institution speaking.

11.2 Why DENY Must Be Written into History

If DENY is not recorded, three dangerous things happen:

(1) The Institution Appears “Always Correct”

If only ALLOW is recorded:

institutional failure and boundaries become invisible,
the true operating range of policy is hidden.

You lose the ability to see:

which requests were blocked,
why they were blocked,
whether policy is too strict or too permissive.

(2) Humans Will Repeatedly Make the Same Mistakes

If DENY leaves no trace:

the next responsible subject will try again,
the same request will be submitted repeatedly,
the system is forever “making the same mistake for the first time.”

DENY records are, in essence, organizational memory.

(3) Replay Loses Half of Its Reference Frame

Replay is not only about validating ALLOW decisions;

it must also validate:

whether a rejection was inevitable at the time.

Without DENY, replay is only half complete.
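
A replay that covers both halves might be sketched like this. The record fields, the `POLICIES` registry, and the toy `policy_v1` rule are hypothetical; overrides are assumed to live in their own records and are not replayed here.

```python
def policy_v1(request):
    # Toy rule standing in for a versioned, deterministic policy.
    return "ALLOW" if request["branch"] == "main" else "DENY"


POLICIES = {"v1": policy_v1}  # hypothetical registry keyed by policy version


def replay(records, policies=POLICIES):
    """Recompute each recorded judgment; return ids whose outcome differs."""
    mismatches = []
    for rec in records:
        recomputed = policies[rec["policy_version"]](rec["request"])
        if recomputed != rec["decision"]:
            mismatches.append(rec["id"])
    return mismatches
```

Because DENY records are replayed by the same function as ALLOW records, a rejection that would not be reproduced under the policy version in force at the time shows up as a mismatch just like a wrongful approval.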

11.3 The Institutional Value of DENY: Boundaries, Warnings, and Teaching

Every DENY answers a single question:

“Under the current institution, what is not allowed?”

From an institutional perspective, DENY provides:

boundary samples — where policy actually draws the line,
risk signals — which change patterns are approaching red lines,
teaching material — why this class of requests is unacceptable.

A system with no DENY records either:

is not being seriously used, or
has a policy that has already failed.

11.4 OVERRIDE: Expensive, Dangerous, and Necessary to Expose

An override is not a bug fix;

it is a deliberate act that goes beyond what the institution currently permits.

Its semantics are:

“Although the current policy judges this as not allowed,
someone or some institution chooses to accept additional risk
and force history forward.”

Therefore, overrides must be treated as expensive actions.

11.5 Why Overrides Must Be Fully Recorded

If overrides are not recorded, the consequences are catastrophic:

the original policy is negated by facts with no trace,
the responsible subject disappears from history,
institutional evolution loses its evidence chain.

A proper override record must answer four questions:

Who decided to override? (by)
Why is the original judgment no longer applicable? (reason)
When did it occur? (timestamp)
What risk was accepted? (risk_acceptance)

Missing any of these turns an override from an institutional act into a silent bypass.
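
The four-question rule lends itself to a mechanical check. The sketch below uses the field names given above (`by`, `reason`, `timestamp`, `risk_acceptance`); the function names are assumptions.

```python
REQUIRED_OVERRIDE_FIELDS = ("by", "reason", "timestamp", "risk_acceptance")


def missing_override_fields(record):
    """Return the required fields that are absent or empty."""
    return [name for name in REQUIRED_OVERRIDE_FIELDS
            if not str(record.get(name) or "").strip()]


def accept_override(record):
    """Reject an override that cannot answer all four questions."""
    missing = missing_override_fields(record)
    if missing:
        raise ValueError("override rejected, missing: " + ", ".join(missing))
    return dict(record)
```

Running this check before the override is appended means an incomplete override is refused rather than silently stored.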

11.6 Overrides Must Trigger Policy Review (Otherwise They Are Meaningless)

The real value of an override is not “making something happen,”

but exposing pressure points in the institution.

Therefore, RRB must enforce a hard coupling:

Every override must trigger a policy review.

This may take the form of:

automatic policy review tasks,
prominent flags in governance dashboards,
mandatory review outcomes (maintain / revise / retire policy).

Without this, overrides degrade into:

routine escape hatches,
silent erosion of institutional authority.

11.7 A Critical Counterintuitive Judgment

In short-term engineering culture:

failure is waste,
rejection is obstruction.

In long-lived institutional systems:

A system with no recorded failures is a system that does not learn.

The existence of DENY and OVERRIDE means:

the institution is tested by reality,
the system is friction-tested against the world,
the organization is explicitly learning its boundaries.

11.8 Hard Engineering Constraints

(So Failure Cannot Be Ignored)

To ensure that “failure is an asset,” the system must enforce:

DecisionLedger:
- equal treatment of ALLOW / DENY / OVERRIDE
UI / CLI:
- explicit visibility of rejected history
Replay / Regression:
- coverage of DENY and OVERRIDE cases
Metrics:
- monitoring DENY / OVERRIDE ratios as institutional health signals
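
For the metrics item, a minimal sketch of the ratio computation is shown below. It assumes each record in the decision ledger carries a `decision` field with one of the three outcomes; thresholds and alerting are deliberately left out.

```python
from collections import Counter


def decision_ratios(records):
    """Share of each outcome among all recorded decisions."""
    counts = Counter(r["decision"] for r in records)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: counts.get(k, 0) / total for k in ("ALLOW", "DENY", "OVERRIDE")}
```

Tracked over time, a DENY share that drops to zero or an OVERRIDE share that keeps climbing are the kinds of signals this section treats as institutional health indicators.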

11.9 A Constitution-Ready Summary Sentence

DENY is a declaration of institutional boundaries; OVERRIDE is a stress test of the institution.
A system that does not record failure will ultimately lose its history to failure.

10) Progressive Productization Path (From Demo to Organ)

This section is not about “how to ship faster,” but about how to avoid killing a long-term system in its early stages.
Your goal is not a flashy demo, but an organ-level system that can still stand ten years from now.
Therefore, productization must grow outward from the institutional core, not pile features first and retrofit governance later.

10.1 Core Principle: Make the System Responsible to Time First, Then Allow It to Act on the World

At its core, the productization order of RRB answers one question:

When does a system earn the right to change the world?

The answer is:

Only after it can correctly record, explain, and recompute its own judgments.

Phase 0 — CLI + Ledger Only (Institutional Skeleton Phase)

Goal:

Give the system a minimal institutional closed loop—valuable even if it executes nothing.

Implementation:

CLI: rrb intent create
Intent → Request → Policy → DecisionRecord
All outputs written only to:
- release_intents.jsonl
- decisions.jsonl
- overrides.jsonl
No side effects at all
- ❌ No tagging
- ❌ No publishing
- ❌ No deployment
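
The Phase 0 chain can be sketched as a pure function with no side effects. The intent fields, the `POLICY` contents, and the choice of SHA-256 over sorted-key JSON are assumptions made for this example, not the canonicalization RRB prescribes.

```python
import hashlib
import json


def canonicalize(obj):
    """Deterministic serialization: sorted keys, fixed separators."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)


def request_hash(request):
    return hashlib.sha256(canonicalize(request).encode("utf-8")).hexdigest()


POLICY = {"version": "v0", "allowed_branches": ["main"]}  # hypothetical policy


def decide(intent, policy=POLICY):
    """Intent -> Request -> Policy -> DecisionRecord, with no side effects."""
    request = {"capability": "repo.release",
               "declared_by": intent["declared_by"],
               "branch": intent["branch"],
               "version": intent["version"]}
    allowed = request["branch"] in policy["allowed_branches"]
    return {"request_hash": request_hash(request),
            "policy_version": policy["version"],
            "decision": "ALLOW" if allowed else "DENY",
            "request": request}
```

The same intent always yields the same request hash and the same decision regardless of key order, which is the property that later makes replay meaningful.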

Why this phase is critical:

It forces you to get the following right first:
- Intent schema
- canonicalization
- policy versioning
- append-only ledger

System state:

“Even if you stop here, this system is already more mature than 90% of CI/CD pipelines.”

Phase 1 — Ticket-Gated Tag (First Contact with the World)

Goal:

Introduce the smallest, reversible, lowest-risk side effect.

Implementation:

Allow only one side effect:
- git tag
Executor must:
- accept only DecisionTicket
- strictly validate request_hash / policy_ref
Execution results written to:
- executions.jsonl (append-only)
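
The validation step might look like the following sketch, which checks a ticket against the request it claims to authorize. The dictionary layout and the `sha256_of` helper are assumptions; the three checks correspond to the decision, `request_hash`, and `policy_ref` requirements.

```python
import hashlib
import json


def sha256_of(obj):
    """Hash a JSON-serializable object in a key-order-independent way."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def ticket_problems(ticket, request, policy_ref):
    """Return reasons the ticket must be refused; an empty list means valid."""
    problems = []
    if ticket.get("decision") != "ALLOW":
        problems.append("decision is not ALLOW")
    if ticket.get("request_hash") != sha256_of(request):
        problems.append("request_hash mismatch")
    if ticket.get("policy_ref") != policy_ref:
        problems.append("policy_ref mismatch")
    return problems
```

If the request is altered after adjudication, its hash no longer matches the ticket and execution is refused, so a ticket cannot be reused for something it never approved.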

Why start with tagging:

Tags are:
- reversible
- auditable
- non-disruptive to runtime systems
They are the ideal proving ground for ticket-gated execution

System state:

“The system touches the world for the first time—without becoming dangerous.”

Phase 2 — Release / Publish via Plugins (Controlled Expansion)

Goal:

Introduce more side effects without breaking the institutional core.

Implementation:

Executor becomes plugin-based:
- TagPlugin
- GitHubReleasePlugin
- PackagePublishPlugin
Each plugin must:
- explicitly declare its side-effect scope
- define clear failure semantics
- pass uniformly through the ticket gate
Plugins must not:
- read the ledger directly
- adjudicate policy on their own

Institutional requirement:

Plugins care only about:
- “Give me a ticket, I will perform one action”
Plugins do not know:
- why it was allowed
- who accepted the risk
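
A plugin contract along these lines could be sketched as below. `ExecutorPlugin`, `PluginExecutor`, and the ticket fields are hypothetical; `TagPlugin` is named above, but its body here is a placeholder.

```python
from typing import Protocol


class ExecutorPlugin(Protocol):
    scope: frozenset  # actions this plugin declares it may perform

    def run(self, job: dict) -> str: ...


class TagPlugin:
    scope = frozenset({"tag"})

    def run(self, job):
        # Placeholder for the real side effect.
        return f"tag {job['target']}"


class PluginExecutor:
    def __init__(self, plugins):
        self._by_action = {}
        for plugin in plugins:
            for action in plugin.scope:
                if action in self._by_action:
                    raise ValueError(f"duplicate plugin for action {action}")
                self._by_action[action] = plugin

    def execute(self, ticket):
        """Uniform ticket gate: check first, then hand over only the job."""
        if ticket.get("decision") != "ALLOW":
            raise PermissionError("no valid ticket")
        plugin = self._by_action.get(ticket.get("action"))
        if plugin is None:
            raise ValueError("no plugin declares this action in its scope")
        # The plugin sees what to do, not why it was allowed or by whom.
        return plugin.run({"action": ticket["action"],
                           "target": ticket["target"]})
```

Adding a new side effect then means registering one more plugin with a declared scope, while the gate in `execute` stays the single place where authorization is checked.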

System state:

“Capabilities grow, but sovereignty remains centralized.”

Phase 3 — Deploy / Infrastructure Effects (High-Risk Zone)

Goal:

Introduce irreversible side effects that impact real users.

Prerequisites (all mandatory):

Mature override mechanism
Established policy-review loop
Stable replay / regression
Humans have internalized DENY / OVERRIDE semantics

Implementation:

Deploy / infrastructure actions via plugins
Mandatory:
- pre-execution confirmation
- post-execution auditing
Explicitly defined:
- blast radius
- rollback strategy

System state:

“The system is now an organ, not a tool.”

10.2 Why Sovereign Write Entry Points Must Be Closed First

This point deserves emphasis.

Sovereign write entry points include:

ledger append
policy decision
ticket issuance

These capabilities must exist only in:

adk_runtime/governance/*

They must never be delegated to:

service layers
plugin layers
CLI layers

Otherwise, the classic disaster unfolds:

features multiply
sovereignty fragments
no one knows who has the authority to change history

10.3 A Critical Anti-Pattern (Must Be Avoided)

“Let’s build a usable demo first, and add governance later.”

This is the starting point of almost every failed long-term system.

Because:

demos become de facto standards
de facto standards become culture
culture resists any constraints added later

You are deliberately taking the opposite path:

write constraints into the system first
then gradually release capabilities

This is rare—and correct.

10.4 A Roadmap-Ready Judgment Sentence

Productization is not the linear accumulation of capabilities, but the gradual expansion of the radius of responsibility.

10.5 The Final State: What an “Organ-Level System” Means

When RRB completes this path, it exhibits the following properties:

Every release:
- has an Intent
- has a DecisionRecord
- is replayable
Every side effect:
- has a ticket
- is accountable
Every institutional change:
- has history
- can be explained

At that point, RRB is no longer a bot,

but an organ that takes responsibility for time within your system.

10.6 A Constitution-Ready Summary Sentence

Learn to record before you learn to act; take responsibility for time before you act on the world.

Susan STEM’s Entropy Control Theory
