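# Phase 4: Visibility and Isolated Execution (Visibility + Isolation)

Date: 2026-04-03
Status: Day 4-5 minimal closed loop completed (backend visibility API + runtime summary + Agents page first-screen integration + isolation design)

---

## 1. Phase goals

Upgrade Jarvis's multi-agent system from one that merely "runs internally" to one that can be observed, queried, debugged, and planned for isolated execution.

This phase has two parts:

- **Visibility**: stably expose the collaboration data that already exists in the runtime state
- **Isolated execution design**: define the minimal feasible approach first, without forcing a full implementation within Day 4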

---

## 2. Scope delivered in Day 4

### 2.1 Visibility data source

Day 4 established that the single data source for the visibility API is the **runtime state saved in the conversation continuity snapshot**, not database inference or secondary log parsing.

The data currently used as the basis for reads includes:

- `event_trace`: key lifecycle events
- `message_trace`: thread / message flow
- `active_tasks`: current task list
- `task_results`: task execution results
- `task_hierarchy`: parent-child task relationships
- `verification_status`
- `verification_summary`
- `verification_evidence`
- `thread_id`
- `agent_id`
- `root_agent_id`
- `spawned_agent_ids`
- `tool_outcomes`
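
The field list above can be pictured as a typed view over the snapshot. The following is only a sketch: the key names come from the list, but every value type, the `runtime_state` snapshot key, and the helper `extract_visibility_view` are assumptions made for illustration and may not match the actual Jarvis code.

```python
from typing import Any, TypedDict


class RuntimeState(TypedDict, total=False):
    """Hypothetical shape of the runtime state; value types are assumed."""

    event_trace: list[dict[str, Any]]
    message_trace: list[dict[str, Any]]
    active_tasks: list[dict[str, Any]]
    task_results: dict[str, Any]
    task_hierarchy: dict[str, list[str]]
    verification_status: str
    verification_summary: str
    verification_evidence: list[Any]
    thread_id: str
    agent_id: str
    root_agent_id: str
    spawned_agent_ids: list[str]
    tool_outcomes: list[dict[str, Any]]


# Keys that the visibility layer is allowed to read.
VISIBILITY_KEYS = tuple(RuntimeState.__annotations__)


def extract_visibility_view(snapshot: dict[str, Any]) -> dict[str, Any]:
    """Return only the visibility-related keys present in the snapshot.

    Assumes the snapshot stores the state under a "runtime_state" key.
    """
    state = snapshot.get("runtime_state", {})
    return {key: state[key] for key in VISIBILITY_KEYS if key in state}
```

Reading through a whitelist like this keeps the visibility API tied to the snapshot as its only source, which is the constraint this section states.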

### 2.2 Read-only APIs implemented

The following read-only endpoints are currently provided under `backend/app/routers/agent.py`:

- `GET /api/agents/visibility/events`
  - Supports filtering by `conversation_id`, `agent_id`, `thread_id`, `event_type`, and time range, plus pagination
- `GET /api/agents/visibility/topology`
  - Returns the current collaboration topology: nodes, edges, task summary, and task hierarchy
- `GET /api/agents/visibility/tasks/{task_id}/evidence`
  - Returns the task, the task result, the associated tool outcomes, and the verifier result
- `GET /api/agents/visibility/threads/{thread_id}/messages`
  - Returns the message history of the given thread
- `GET /api/agents/visibility/verifier`
  - Returns the verifier status, summary, and evidence for the current conversation
- `GET /api/agents/visibility/runtime-summary`
  - Returns an aggregated summary of execution mode, phase/checkpoint, verifier, isolation, cost, and recent events
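
To make the filtering and pagination behavior of the events endpoint concrete, here is a minimal sketch that operates directly on an `event_trace` list. It is not the code in `backend/app/routers/agent.py`. The event fields (`agent_id`, `thread_id`, `event_type`, and an ISO 8601 `timestamp`), the `offset` / `limit` parameter names, and the `total` / `items` response shape are all assumptions.

```python
from datetime import datetime
from typing import Any, Optional


def filter_events(
    events: list[dict[str, Any]],
    *,
    agent_id: Optional[str] = None,
    thread_id: Optional[str] = None,
    event_type: Optional[str] = None,
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
    offset: int = 0,
    limit: int = 50,
) -> dict[str, Any]:
    """Filter an event trace, then return one page of the matches."""

    def matches(event: dict[str, Any]) -> bool:
        # Each filter is applied only when the caller supplied it.
        if agent_id is not None and event.get("agent_id") != agent_id:
            return False
        if thread_id is not None and event.get("thread_id") != thread_id:
            return False
        if event_type is not None and event.get("event_type") != event_type:
            return False
        if start is not None or end is not None:
            ts = datetime.fromisoformat(event["timestamp"])
            if start is not None and ts < start:
                return False
            if end is not None and ts > end:
                return False
        return True

    matched = [event for event in events if matches(event)]
    # "total" counts every match; "items" holds only the requested page.
    return {"total": len(matched), "items": matched[offset : offset + limit]}
```

Returning the total alongside the page lets a caller tell how many pages remain without issuing a second query.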

### 2.3 Tests added

Day 4 added a backend API test file:

- `backend/tests/backend/app/agents/test_visibility_api.py`

Covered scenarios include:

- event filter + pagination
- topology construction
- task evidence lookup
- thread message reconstruction
- verifier lookup
- validation of invalid datetime parameters
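
Of the scenarios above, invalid datetime validation is the easiest to show in isolation. The sketch below assumes time-range parameters arrive as ISO 8601 strings; the helper name `parse_time_param` and the error message are hypothetical, and the real endpoint may validate differently (for example through its web framework).

```python
from datetime import datetime
from typing import Optional


def parse_time_param(name: str, value: Optional[str]) -> Optional[datetime]:
    """Parse an optional ISO 8601 query parameter.

    Returns None when the parameter is absent and raises ValueError with
    a descriptive message when it cannot be parsed.
    """
    if value is None:
        return None
    try:
        return datetime.fromisoformat(value)
    except ValueError as exc:
        raise ValueError(f"invalid datetime for '{name}': {value!r}") from exc
```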

---

## 3. Current implementation boundaries

What Day 4 **has completed is the minimal closed loop for backend visibility**. The following are intentionally out of scope:

- No frontend debugging panel UI
- No SSE / WebSocket real-time push
- No separate new visibility storage layer
- No full worktree / sandbox execution implementation
- No free-form swarm-style scheduling

Day 4 should therefore be positioned as:

> **Having a visibility query API and an isolated execution design is not the same as having a full real-time debugging platform or a full isolated execution runtime.**
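
---

## 4. Minimal isolated execution approach

### 4.1 Design goals

For more complex tasks that may pollute the main working directory or need a stronger security boundary, Day 4 first defines a clear tiered isolation strategy for later Phase 4+ implementation.

### 4.2 Isolation levels

| Level | Name | Applicable scenarios | Day 4 status |
|------|------|----------|------------|
| L0 | No isolation | General Q&A, lightweight retrieval, read-only analysis | Already exists |
| L1 | Session state isolation | Needs isolated context/memory but does not modify files | Design complete |
| L2 | Worktree isolation | Code changes, file writes, needs an independent directory | **Recommended primary approach** |
| L3 | Sandbox container | High-risk commands, needs a stronger OS-level boundary | Extension point reserved only |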
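
One way to read the table is as a decision rule that picks the weakest level satisfying a task's needs. The sketch below encodes that reading. The three task flags and the function `choose_isolation_level` are hypothetical, and because L3 is only a reserved extension point, a real runtime would need a fallback when the rule selects it.

```python
from enum import IntEnum


class IsolationLevel(IntEnum):
    """Isolation levels L0 to L3 from the table above."""

    NONE = 0      # L0: Q&A, lightweight retrieval, read-only analysis
    SESSION = 1   # L1: isolated context/memory, no file changes
    WORKTREE = 2  # L2: code changes and file writes
    SANDBOX = 3   # L3: high-risk commands, OS-level boundary


def choose_isolation_level(
    *,
    needs_isolated_context: bool = False,
    writes_files: bool = False,
    high_risk: bool = False,
) -> IsolationLevel:
    """Pick the lowest level that covers the task's requirements."""
    if high_risk:
        return IsolationLevel.SANDBOX
    if writes_files:
        return IsolationLevel.WORKTREE
    if needs_isolated_context:
        return IsolationLevel.SESSION
    return IsolationLevel.NONE
```

Using an `IntEnum` keeps the levels ordered, so a policy check such as `level >= IsolationLevel.WORKTREE` stays readable.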

### 4.3 Technology selection conclusions

The minimal technology selection for Day 4 is as follows:

#### Preferred: **Git worktree isolation**

Applies to:

- Code generation
- Bulk refactoring
- Multiple agents modifying files in parallel
- Execution-type tasks that must avoid polluting the main working directory

Rationale:

- Naturally compatible with the current git repository workflow
- Can reuse the existing branch / review / merge process
- Lower isolation cost than containers
- Better suited as the minimal feasible approach for Jarvis at its current stage

#### Second choice: **Session state isolation**

Applies to:

- Multi-turn complex analysis
- Cases that need protection from context pollution
- Read-only or low-risk tasks

Rationale:

- Low implementation cost
- Does not depend on filesystem isolation
- Can land before the full worktree runtime

#### Not implemented in Day 4: **Sandbox container**

Applies to:

- High-risk shell commands
- Potentially destructive tasks
- Cases that need system-level resource control

Conclusion:

- Phase 4 does not build a full container runtime
- Only the interface and policy hooks are reserved; the implementation comes later

---

## 5. Session state isolation strategy

Minimal principles for session state isolation:

1. Each isolated worker has its own memory / turn context
2. By default, the full message history is not inherited; only the necessary shared context is injected
3. Output returns only:
   - task result
   - evidence
   - verifier-ready summary
4. Intermediate, temporary reasoning state is not merged directly back into the main state

Suggested minimal state structure:

- `isolation_mode`: `none | session | worktree | sandbox`
- `isolation_id`
- `isolation_parent_conversation_id`
- `isolation_workspace_path` (where applicable)
- `isolation_metadata`
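
The principles and state fields above might translate into something like the following sketch. The `isolation_*` keys come from the list. The `shared_context` and `messages` keys, the `RETURNABLE_KEYS` names, and both helper functions are assumptions introduced only to show the idea of injecting a small context and returning a restricted output.

```python
import uuid
from typing import Any

# Only these keys flow back to the parent (principle 3); names are assumed.
RETURNABLE_KEYS = ("task_result", "evidence", "verifier_summary")


def new_session_state(
    parent_conversation_id: str, shared_context: dict[str, Any]
) -> dict[str, Any]:
    """Create a fresh worker state that does not inherit message history."""
    return {
        "isolation_mode": "session",
        "isolation_id": uuid.uuid4().hex,
        "isolation_parent_conversation_id": parent_conversation_id,
        "isolation_workspace_path": None,  # not applicable in session mode
        "isolation_metadata": {},
        "shared_context": dict(shared_context),  # copy, never a shared reference
        "messages": [],  # starts empty by design (principle 2)
    }


def collect_worker_output(worker_state: dict[str, Any]) -> dict[str, Any]:
    """Keep only the returnable keys; scratch state stays behind (principle 4)."""
    return {key: worker_state[key] for key in RETURNABLE_KEYS if key in worker_state}
```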

---

## 6. Worktree isolation strategy

Minimal principles for worktree isolation:

1. Each execution-type worker gets its own worktree
2. Each worktree uses its own branch name
3. The main working directory is never written to directly
4. Results are collected via diff / changed files / task result
5. The cleanup policy must be controllable, to avoid leaving dirty directories behind

Suggested directory pattern:

- `.worktrees/jarvis/<conversation-or-run-id>/<worker-id>/`

Suggested branch pattern:

- `jarvis/<conversation-or-run-id>/<worker-id>`

Minimal artifacts to collect:

- `modified_files`
- `git_diff_summary`
- `task_result`
- `verification_evidence`
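
The directory and branch patterns above map onto standard `git worktree` commands. The sketch below only builds the argument lists and does not run git, which keeps the naming scheme easy to test. `WorktreePlan` and `plan_worktree` are hypothetical names, and the sketch assumes the IDs are already safe to use in paths and branch names (it does no sanitization).

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class WorktreePlan:
    """Path and branch for one worker, plus the git commands to manage it."""

    path: str
    branch: str

    def add_command(self) -> list[str]:
        # Creates the worktree on a new branch dedicated to this worker.
        return ["git", "worktree", "add", "-b", self.branch, self.path]

    def changed_files_command(self) -> list[str]:
        # Lists modified and untracked files inside the worktree.
        return ["git", "-C", self.path, "status", "--porcelain"]

    def diff_summary_command(self) -> list[str]:
        # Per-file change statistics for tracked files.
        return ["git", "-C", self.path, "diff", "--stat"]

    def remove_command(self) -> list[str]:
        # Removes the worktree even if it still contains changes.
        return ["git", "worktree", "remove", "--force", self.path]


def plan_worktree(run_id: str, worker_id: str) -> WorktreePlan:
    """Derive the worktree path and branch name from the suggested patterns."""
    path = PurePosixPath(".worktrees") / "jarvis" / run_id / worker_id
    return WorktreePlan(path=str(path), branch=f"jarvis/{run_id}/{worker_id}")
```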

---

## 7. Minimal Isolation Execution API definition

Day 4 defines the interface boundary first; a full implementation is not required right away.

### 7.1 Task execution request side

The future runtime interface should support at least:

- `isolation_mode`
- `workspace_strategy`
- `allow_merge_back`
- `cleanup_policy`

Illustration:

```json
{
  "task_id": "task-123",
  "goal": "refactor agent router",
  "isolation_mode": "worktree",
  "workspace_strategy": "ephemeral",
  "allow_merge_back": false,
  "cleanup_policy": "on_success"
}
```
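
A request like the one above could be validated on the receiving side roughly as follows. This is a sketch under assumptions: the class name, the default values, and the choice to validate only `isolation_mode` are illustrative. The document does not enumerate the allowed values of `workspace_strategy` or `cleanup_policy`, so the sketch leaves them as free-form strings.

```python
import json
from dataclasses import dataclass

# The four modes listed for `isolation_mode` in section 5.
ISOLATION_MODES = frozenset({"none", "session", "worktree", "sandbox"})


@dataclass(frozen=True)
class IsolatedTaskRequest:
    """Hypothetical request model; the defaults are assumptions."""

    task_id: str
    goal: str
    isolation_mode: str = "none"
    workspace_strategy: str = "ephemeral"
    allow_merge_back: bool = False
    cleanup_policy: str = "on_success"

    def __post_init__(self) -> None:
        # Reject unknown modes early instead of failing inside the runtime.
        if self.isolation_mode not in ISOLATION_MODES:
            raise ValueError(f"unknown isolation_mode: {self.isolation_mode!r}")


def parse_request(raw: str) -> IsolatedTaskRequest:
    """Parse a JSON request body into a validated request object."""
    return IsolatedTaskRequest(**json.loads(raw))
```

Defaulting `allow_merge_back` to `False` mirrors the example payload and makes merging back into the main workspace an explicit opt-in.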

### 7.2 Execution result side

Suggested unified response:

```json
{
  "task_id": "task-123",
  "status": "completed",
  "isolation": {
    "mode": "worktree",
    "workspace_path": ".worktrees/jarvis/run-1/worker-2",
    "branch": "jarvis/run-1/worker-2",
    "cleanup_status": "pending"
  },
  "evidence": [],
  "summary": "Refactor completed"
}
```
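
As one possible reading of how `cleanup_policy` and the result interact, the sketch below decides whether a worktree should be removed after a task finishes. The interpretation of `on_success` as "clean up only when `status` is `completed`" is an assumption, as is the helper name `should_cleanup`. Other policy values are not defined in this document, so the sketch treats them as "do not clean up".

```python
from typing import Any


def should_cleanup(cleanup_policy: str, result: dict[str, Any]) -> bool:
    """Return True when the worker's worktree can be removed.

    Only the "on_success" policy from the example request is handled;
    any other value conservatively keeps the workspace.
    """
    isolation = result.get("isolation", {})
    if isolation.get("mode") != "worktree":
        return False  # nothing on disk to remove
    if cleanup_policy != "on_success":
        return False
    return result.get("status") == "completed"
```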

---

## 8. Acceptance conclusions

The Day 4 minimal closed loop is considered complete against the following criteria:

- [x] Event trace can be queried by filter
- [x] Collaboration topology and task summary can be queried
- [x] Task execution evidence can be queried
- [x] Thread message history can be reconstructed
- [x] Verifier results can be queried
- [x] Day 4 backend API tests are in place
- [x] A minimal isolated execution design exists
- [ ] Real-time SSE / UI is not included
- [ ] A full worktree / sandbox runtime implementation is not included

---

## 9. Actual state after this phase

After Day 4, Jarvis currently has:

- Visibility query capability for the constrained multi-agent runtime
- Backend read-only APIs for debugging and acceptance
- A stable visibility data source based on the continuity snapshot
- An isolation strategy design for later implementation

It does **not yet have**:

- A real-time event push platform
- A frontend visual debugging panel
- Full worktree execution orchestration
- A container-level sandbox executor

This means:

> Day 4 has completed the minimal closed loop, and later work can still extend it into a full visualization UI and isolated execution runtime.