refactor(server): scene registry skeleton + unified gate pipeline design doc

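Phase 1 P1.1-P1.2: provide declarative scene registration infrastructure for consolidating backend gating.

- Add the scenes/ directory: gate_rules (GateRule/SceneRoute enums), scene_descriptor (SceneDescriptor dataclass), scene_registry (SceneRegistry singleton)
- Migrate 3 scenes to descriptors: expense_application / reimbursement / query_travel_standard
- bootstrap_scenes in __init__.py registers scenes at import time and binds handler/builder/executor at runtime (resolves circular imports)
- The query scene has priority=50 and takes precedence over MODEL_ONLY scenes, so rule matching runs before the LLM
- Add the UNIFIED_GATE_PIPELINE.md architecture doc: target architecture / acceptance criteria (O(1) onboarding) / 3-phase migration path
- 76 passed; the scene registry does not break existing code. It coexists with intent_registry for now, and P1.3-P1.8 will migrate everything to it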
2026-06-25 15:09:16 +08:00
parent e9d7c56d5b
commit 54356ba81a
9 changed files with 684 additions and 0 deletions
--- a/document/work-log/2026-06-25.md
+++ b/document/work-log/2026-06-25.md
@@ -56,6 +56,33 @@
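 - 22:40: Two Excel files under `server/rules/finance-rules/` (transport class standards, transport cost estimates) are marked as modified. They look like container runtime artifacts rather than changes from this work, so they were left untouched.
 - 22:40: The `agent-change-log` Skill cannot be invoked in the current environment, so this log was updated incrementally by hand, following the AGENTS.md conventions.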

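+- 23:30: I shipped the conversation context retention mechanism (LLM plus deterministic double safeguard), fixing the case where a user deletes a draft, says "resubmit", and the context is lost.
+  - Git commit check: after `git fetch --all --prune`, local is in sync with origin/main (neither ahead nor behind).
+  - Background: investigation confirmed that conversation messages and steward_state are persisted in the DB, but `build_plan` in the plan endpoint never reads historical tasks, and "resubmit" is routed to the plan endpoint (not the runtime-decision endpoint, which can restore a task). As a result the system could not link "resubmit" to the previously blocked business trip application.
+  - Change ① (LLM history linking, safeguard ②): `steward.py` gains `_inject_recent_conversation_history`, which calls `AgentConversationService.list_message_history(conversation_id, limit=10)` before build_plan to read the 10 most recent messages and injects them into `context_json.recent_history`. `_build_messages` in `steward_intent_agent.py` exposes recent_history as a top-level structured field of context_payload, and the system prompt now adds the guidance: "When the user uses confirmation phrases such as 'resubmit', 'continue', or 'submit again', you must interpret them in light of the most recently mentioned business trip or reimbursement application in recent_history."
+  - Change ② (deterministic fallback, safeguard ①): added `steward_context_resume.py`. `should_resume_recent_task` detects "resubmit"-style phrases (12 keywords) and checks that `steward_state.flows` contains a resumable flow; `resume_task_from_flow` restores a StewardTask from flow.fields (reusing the runtime-decision restore logic); `attach_resumed_task` attaches the restored task back onto the plan and sets planning_source to `context_resume`. `steward.py` gains `_apply_context_resume`, which triggers the deterministic fallback after build_plan when the plan has no task. Both entry points (`/plans` and `/plans/stream`) are wired in.
+  - Verification: full backend test run, **67 passed** (including 11 new tests: context_resume 8 + intent_agent history 3). End-to-end check of a two-turn conversation, "business trip to Shanghai by train" → "resubmit": LLM history linking restored the expense_application task with complete fields. A pure-function check confirmed that the deterministic fallback restores the task from state when the model returns no task (planning_source=context_resume).
+  - Impact: conversation context is retained until the user clears the conversation; processing only looks at the 10 most recent messages, so very long conversations do not blow the token budget; "resubmit"-style phrases can now restore a previously blocked application task. Tasks produced by a normal plan are already written to `steward_state.flows` via `merge_plan`, and the duplicate check does not modify state, so a task stays alive in state until the conversation ends.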
+
+## Open Issues (Addendum)
+
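+- 23:30: The history size is fixed at 10 messages with no token-aware trimming. In extreme cases a single very long message (for example, a large pasted block of text) could exceed the token limit, although normal conversations did not trigger this in testing.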
+
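+- 00:10: I finished the architecture design doc for the unified gate pipeline, which is the single source of truth for the upcoming refactor.
+  - Doc path: `document/development/AI意图规划器/UNIFIED_GATE_PIPELINE.md`
+  - Core finding: gating is currently scattered across 7 places (7 layers of frontend if/else + 4 patches in backend endpoints + graph conditional edges + off_topic keywords + candidate flow determination). Every new scene costs O(n), and missing one place fails silently. This is the root cause of the fixes not lasting.
+  - Target architecture: the LangGraph graph becomes the sole orchestrator (load_context → gate_classify → route branches → attach_action_steps → persist_state), endpoints shrink to 3 lines of pure IO, and the frontend shrinks to pure rendering (fetchStewardPlan → renderPlanResponse).
+  - Hard acceptance criterion for O(1) onboarding: adding a scene requires only 1 new SceneDescriptor + 1 handler function + registration, without touching the graph, endpoints, frontend, or extraction.
+  - Migration in 3 phases: Phase 1 consolidates the backend (build the scenes registry + move endpoint patches into graph nodes), Phase 2 reduces the frontend to pure rendering (remove the 7 layers of if/else), Phase 3 cleans up redundancy.
+  - Git commit check: local is in sync with origin/main.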
+
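+- 00:50: I finished the scene registry skeleton for unified gate pipeline Phase 1 (P1.1-P1.2), which is the infrastructure for backend consolidation.
+  - Git commit check: local is in sync with origin/main.
+  - Changes: added the `server/src/app/services/scenes/` directory: `gate_rules.py` (GateRule/SceneRoute enums), `scene_descriptor.py` (SceneDescriptor dataclass declaring scene_id/label/signal_keywords/ontology_fields/gate/route/handler/can_resume/flow_id/prompt_fragment/priority, among others), `scene_registry.py` (SceneRegistry singleton + query methods), 3 scene files (expense_application/reimbursement/query_travel_standard), and `__init__.py` (bootstrap + runtime binding of handler/builder/executor).
+  - Verification: smoke test confirmed that all 3 scenes register, priority ordering is correct (query first, priority=50), 35 signal_keywords are aggregated, handler/builder/executor bind at runtime, and there is no circular import. Full backend run: 76 passed; adding the scene registry broke no existing code.
+  - Impact: provides declarative scene registration infrastructure for the upcoming graph topology refactor (P1.3-P1.8). For now scene_registry coexists with the existing intent_registry; P1.3-P1.7 will gradually migrate intent_registry consumers to scene_registry.
+  - Next: P1.3-P1.8 graph topology refactor (add load_context/gate_classify/resume/persist nodes, slim down endpoints, migrate registry consumers).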
+
 ## TODO

 - [ ] Set up a stable runtime environment for `quick_validate.py`, so that adding new Skills no longer depends on manual fallback. (Source: 09:18 skill validation)
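
For readers of this commit, the scene registry described in the 00:50 entry could look roughly like the sketch below. It is not the code added under `server/src/app/services/scenes/`: only a subset of the SceneDescriptor fields is shown, the `RULE_FIRST` member, the default priority of 100, the "lower value sorts first" ordering, the example keywords, and the `bind_handler` helper are all assumptions made for illustration. Only the names `GateRule`, `SceneDescriptor`, `SceneRegistry`, `bootstrap_scenes`, `MODEL_ONLY`, the three scene ids, and `priority=50` for the query scene come from the log.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List, Optional, Tuple


class GateRule(Enum):
    RULE_FIRST = "rule_first"  # hypothetical member, not named in the log
    MODEL_ONLY = "model_only"  # named in the commit message


@dataclass
class SceneDescriptor:
    scene_id: str
    label: str
    gate: GateRule
    signal_keywords: Tuple[str, ...] = ()
    priority: int = 100  # assumed default; lower values sort first in this sketch
    handler: Optional[Callable[[str], str]] = None  # bound later, at runtime


class SceneRegistry:
    """Process-wide singleton that holds scene descriptors."""

    _instance: Optional["SceneRegistry"] = None
    _scenes: Dict[str, SceneDescriptor]

    def __new__(cls) -> "SceneRegistry":
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            cls._instance._scenes = {}
        return cls._instance

    def register(self, descriptor: SceneDescriptor) -> None:
        if descriptor.scene_id in self._scenes:
            raise ValueError(f"scene already registered: {descriptor.scene_id}")
        self._scenes[descriptor.scene_id] = descriptor

    def bind_handler(self, scene_id: str, handler: Callable[[str], str]) -> None:
        # Late binding keeps handler modules out of the descriptor modules'
        # import graph, which is one way to avoid circular imports.
        self._scenes[scene_id].handler = handler

    def ordered(self) -> List[SceneDescriptor]:
        # sorted() is stable, so equal priorities keep registration order.
        return sorted(self._scenes.values(), key=lambda s: s.priority)

    def all_signal_keywords(self) -> List[str]:
        return [kw for scene in self.ordered() for kw in scene.signal_keywords]


def bootstrap_scenes() -> SceneRegistry:
    """Register the three scenes once; safe to call more than once."""
    registry = SceneRegistry()
    scenes = [
        SceneDescriptor("expense_application", "Expense application",
                        GateRule.MODEL_ONLY, ("business trip",)),
        SceneDescriptor("reimbursement", "Reimbursement",
                        GateRule.MODEL_ONLY, ("reimburse",)),
        SceneDescriptor("query_travel_standard", "Travel standard query",
                        GateRule.RULE_FIRST, ("travel standard",), priority=50),
    ]
    for scene in scenes:
        if scene.scene_id not in registry._scenes:
            registry.register(scene)
    return registry
```

With this shape, adding a scene means writing one more descriptor and binding one handler, which matches the O(1) onboarding criterion stated in the 00:10 entry. Callers that iterate `ordered()` see the query scene before the MODEL_ONLY scenes.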