# H3 Durable Session Lifecycle

## 1. Goal

Upgrade `HermesSessionManager` from an in-process session cache into a durable lifecycle manager that supports recovery, re-creation, and observability.

## 2. Current Problems

`backend/app/services/agent_runtime/hermes_session_manager.py` currently provides:
- a basic conversation -> session mapping
- a per-conversation lock
- last_used / restart_count / metadata

It is still closer to a prototype:
- Session state lives only in the memory of the current process
- Recovery after a backend restart is insufficient
- warm / resumed / cold are not modeled as explicit states
- The recovery policy is not clearly defined

## 3. Lifecycle Target

```text
message arrives
-> lookup by conversation
-> warm session exists? reuse
-> else hydrate from agent_state
-> if hydrate success => resumed
-> else create fresh => cold
-> execute turn
-> update state/metrics
-> idle reclaim if needed
```

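The lookup part of this flow can be sketched in Python as follows. This is a minimal illustration, not the real `HermesSessionManager`: the names `LifecycleSketch`, `SessionOrigin`, `SessionHandle`, the `load_agent_state` hook, and the `session_id` key under `runtime_state.hermes` are all assumptions made for the example.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, Optional


class SessionOrigin(str, Enum):
    WARM = "warm"        # live session already held by this process
    RESUMED = "resumed"  # rebuilt from persisted agent_state
    COLD = "cold"        # nothing usable found, created fresh


@dataclass
class SessionHandle:
    conversation_id: str
    session_id: str
    origin: SessionOrigin


class LifecycleSketch:
    """Hypothetical stand-in for HermesSessionManager (illustration only)."""

    def __init__(
        self, load_agent_state: Callable[[str], Optional[Dict[str, Any]]]
    ) -> None:
        self._load_agent_state = load_agent_state  # assumed persistence hook
        self._warm: Dict[str, SessionHandle] = {}
        self._locks: Dict[str, asyncio.Lock] = {}
        self._fresh_count = 0

    async def acquire(self, conversation_id: str) -> SessionHandle:
        # One lock per conversation serializes turns for that conversation.
        lock = self._locks.setdefault(conversation_id, asyncio.Lock())
        async with lock:
            handle = self._warm.get(conversation_id)
            if handle is not None:
                handle.origin = SessionOrigin.WARM  # reuse the live session
                return handle

            # No warm session: try to hydrate from the persisted state.
            state = self._load_agent_state(conversation_id) or {}
            hermes = (state.get("runtime_state") or {}).get("hermes") or {}
            session_id = hermes.get("session_id")  # assumed key name

            if session_id:
                origin = SessionOrigin.RESUMED
            else:
                # Hydration failed: fall back to a fresh session.
                self._fresh_count += 1
                session_id = f"fresh-{self._fresh_count}"
                origin = SessionOrigin.COLD

            handle = SessionHandle(conversation_id, session_id, origin)
            self._warm[conversation_id] = handle
            return handle
```

A caller would `await manager.acquire(conversation_id)` before executing the turn and could log `handle.origin` as the warm / resumed / cold signal. Turn execution, metrics updates, and idle reclaim are left out of this sketch.
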
## 4. Required Capabilities

1. Distinguishing warm / resumed / cold states
2. Conversation-level locking
3. Runtime health checks
4. Restart / recreate policy
5. idle reclaim
6. safe rehydrate
7. Stale session detection
8. Error state recording

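As a rough illustration of items 4, 5, and 8, the sketch below tracks `last_used` and `restart_count` per conversation, reclaims idle sessions, and chooses between restart and recreate after a failure. The class name `ReclaimSketch`, the `idle_ttl` and `max_restarts` parameters, and the `"restart"` / `"recreate"` return values are assumptions for the example, not an existing API.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class SessionRecord:
    session_id: str
    last_used: float
    restart_count: int = 0
    last_error: Optional[str] = None


class ReclaimSketch:
    """Hypothetical bookkeeping for idle reclaim and restart policy."""

    def __init__(
        self,
        idle_ttl: float = 900.0,   # assumed: seconds before a session is idle
        max_restarts: int = 3,     # assumed: restart budget per session
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._idle_ttl = idle_ttl
        self._max_restarts = max_restarts
        self._clock = clock
        self._records: Dict[str, SessionRecord] = {}

    def touch(self, conversation_id: str, session_id: str) -> None:
        """Record that the session was just used."""
        now = self._clock()
        record = self._records.get(conversation_id)
        if record is None or record.session_id != session_id:
            self._records[conversation_id] = SessionRecord(session_id, now)
        else:
            record.last_used = now

    def reclaim_idle(self) -> List[str]:
        """Drop sessions idle for at least idle_ttl; return their conversation ids."""
        now = self._clock()
        expired = [
            cid
            for cid, record in self._records.items()
            if now - record.last_used >= self._idle_ttl
        ]
        for cid in expired:
            del self._records[cid]
        return expired

    def record_failure(self, conversation_id: str, error: str) -> str:
        """Store the error and decide: 'restart' within budget, else 'recreate'."""
        record = self._records.get(conversation_id)
        if record is None:
            return "recreate"
        record.last_error = error
        if record.restart_count < self._max_restarts:
            record.restart_count += 1
            return "restart"
        del self._records[conversation_id]  # budget exhausted, start over
        return "recreate"
```

Injecting the clock keeps the reclaim logic testable without sleeping; a real implementation would also need to close the underlying runtime when a record is dropped.
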
## 5. Relationship to the Envelope

Durable source:
- `Conversation.agent_state.runtime_state.hermes`

Runtime source:
- `HermesSessionManager`

Principles:
- Warm sessions improve performance
- Durable metadata guarantees recoverability
- No Hermes process can be required to live forever

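One way to keep the durable side safe to rehydrate is to validate whatever is read back and treat anything missing or malformed as "no durable state", so the caller falls back to a cold session. The sketch below assumes `agent_state` is a plain dict and that the `hermes` entry holds `session_id`, `restart_count`, `last_used`, and a `schema_version`; the field layout and function names are hypothetical.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional

SCHEMA_VERSION = 1  # hypothetical version tag for the persisted layout


@dataclass
class HermesDurableState:
    session_id: str
    restart_count: int = 0
    last_used: float = 0.0
    schema_version: int = SCHEMA_VERSION


def write_hermes_state(
    agent_state: Dict[str, Any], state: HermesDurableState
) -> Dict[str, Any]:
    """Store the durable metadata under agent_state['runtime_state']['hermes']."""
    runtime_state = agent_state.setdefault("runtime_state", {})
    runtime_state["hermes"] = asdict(state)
    return agent_state


def read_hermes_state(
    agent_state: Optional[Dict[str, Any]],
) -> Optional[HermesDurableState]:
    """Return the durable state, or None if it is missing or malformed."""
    runtime_state = (agent_state or {}).get("runtime_state")
    raw = runtime_state.get("hermes") if isinstance(runtime_state, dict) else None
    if not isinstance(raw, dict):
        return None
    if raw.get("schema_version") != SCHEMA_VERSION:
        return None  # unknown layout: do not guess, go cold instead
    session_id = raw.get("session_id")
    if not isinstance(session_id, str) or not session_id:
        return None
    try:
        return HermesDurableState(
            session_id=session_id,
            restart_count=int(raw.get("restart_count", 0)),
            last_used=float(raw.get("last_used", 0.0)),
        )
    except (TypeError, ValueError):
        return None
```

Returning `None` instead of raising keeps the resume path simple: a successful read maps to resumed, everything else maps to cold.
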
## 6. Recommended File Changes

- `backend/app/services/agent_runtime/hermes_session_manager.py`
- `backend/app/services/agent_runtime/hermes_runtime.py`
- `backend/app/services/agent_service.py`
- `backend/app/models/conversation.py`
- Add or extend tests: session resume / recreate / restart / idle reclaim

## 7. Completion Criteria

- [ ] A conversation can be restored to the correct Hermes session, or a new session is created for it
- [ ] warm / resumed / cold states are distinguishable
- [ ] Continuity does not break outright after a backend restart
- [ ] recovery/failure is clearly recorded