# H3 Durable Session Lifecycle
## 1. Goal
Upgrade `HermesSessionManager` from an in-process session cache to a durable lifecycle manager that supports recovery, rebuilding, and observability.
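## 2. Current Problems
`backend/app/services/agent_runtime/hermes_session_manager.py` currently provides:
- a basic conversation -> session mapping
- a per-conversation lock
- last_used / restart_count / metadata
It is still closer to a prototype, however:
- it depends on memory in the current process
- it recovers poorly after a backend restart
- warm / resumed / cold are not explicit states
- the recovery policy is not clearly defined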
## 3. Lifecycle Target
```text
message arrives
-> lookup by conversation
-> warm session exists? reuse
-> else hydrate from agent_state
-> if hydrate success => resumed
-> else create fresh => cold
-> execute turn
-> update state/metrics
-> idle reclaim if needed
```
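The flow above can be sketched as a lookup that records how each session was obtained. This is a minimal illustration, not the actual `HermesSessionManager` code: `SessionOrigin`, `Session`, `LifecycleSketch`, and `acquire` are hypothetical names, and the persisted payload is assumed to be a plain dict under `agent_state["runtime_state"]["hermes"]`.

```python
import time
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class SessionOrigin(str, Enum):
    WARM = "warm"        # reused from the in-process cache
    RESUMED = "resumed"  # rebuilt from persisted agent_state
    COLD = "cold"        # created fresh, nothing to hydrate from


@dataclass
class Session:
    conversation_id: str
    origin: SessionOrigin
    state: Dict[str, Any] = field(default_factory=dict)
    last_used: float = field(default_factory=time.monotonic)


class LifecycleSketch:
    """Hypothetical stand-in for the lookup step of the session manager."""

    def __init__(self) -> None:
        self._warm: Dict[str, Session] = {}

    def acquire(self, conversation_id: str, agent_state: Optional[dict]) -> Session:
        # 1. A warm session exists: reuse it.
        session = self._warm.get(conversation_id)
        if session is not None:
            session.origin = SessionOrigin.WARM
            session.last_used = time.monotonic()
            return session
        # 2. Otherwise try to hydrate from the persisted agent_state.
        persisted = self._hydrate(agent_state)
        if persisted is not None:
            session = Session(conversation_id, SessionOrigin.RESUMED, persisted)
        else:
            # 3. Hydration failed or nothing was stored: start cold.
            session = Session(conversation_id, SessionOrigin.COLD)
        self._warm[conversation_id] = session
        return session

    @staticmethod
    def _hydrate(agent_state: Optional[dict]) -> Optional[Dict[str, Any]]:
        # Assumed layout: agent_state["runtime_state"]["hermes"] is a dict.
        try:
            hermes = agent_state["runtime_state"]["hermes"]
        except (KeyError, TypeError):
            return None
        if isinstance(hermes, dict) and hermes:
            return dict(hermes)
        return None
```

Recording the origin on the session object is one way to make the three states observable in logs and metrics; executing the turn and updating state would follow the `acquire` call.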
## 4. Required Capabilities
1. distinguishing warm / resumed / cold states
2. conversation-level locking
3. runtime health checks
4. a restart / recreate policy
5. idle reclaim
6. safe rehydrate
7. stale session detection
8. error state recording
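Several of these capabilities interact: idle reclaim should not remove a session whose conversation lock is currently held. The sketch below illustrates that interaction under stated assumptions. `IdleReclaimSketch`, `idle_ttl_seconds`, `touch`, and `reclaim_idle` are hypothetical names, locks are assumed to be `asyncio.Lock` instances, and timestamps can be injected so the behavior is easy to test.

```python
import asyncio
import time
from typing import Dict, List, Optional


class IdleReclaimSketch:
    """Hypothetical per-conversation locks plus idle reclaim."""

    def __init__(self, idle_ttl_seconds: float) -> None:
        self._idle_ttl = idle_ttl_seconds
        self._last_used: Dict[str, float] = {}
        self._locks: Dict[str, asyncio.Lock] = {}

    def lock_for(self, conversation_id: str) -> asyncio.Lock:
        # One lock per conversation, created lazily.
        lock = self._locks.get(conversation_id)
        if lock is None:
            lock = self._locks[conversation_id] = asyncio.Lock()
        return lock

    def touch(self, conversation_id: str, now: Optional[float] = None) -> None:
        # Record activity; `now` is injectable for tests.
        self._last_used[conversation_id] = time.monotonic() if now is None else now

    def is_stale(self, conversation_id: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        last = self._last_used.get(conversation_id)
        return last is None or now - last > self._idle_ttl

    def reclaim_idle(self, now: Optional[float] = None) -> List[str]:
        now = time.monotonic() if now is None else now
        reclaimed: List[str] = []
        for cid in list(self._last_used):
            lock = self._locks.get(cid)
            if lock is not None and lock.locked():
                continue  # a turn is in flight; leave this session alone
            if self.is_stale(cid, now):
                del self._last_used[cid]
                self._locks.pop(cid, None)
                reclaimed.append(cid)
        return reclaimed


async def demo() -> List[str]:
    manager = IdleReclaimSketch(idle_ttl_seconds=10.0)
    manager.touch("a", now=0.0)
    manager.touch("b", now=0.0)
    manager.touch("c", now=95.0)
    async with manager.lock_for("b"):
        # "a" is idle and unlocked, "b" is idle but busy, "c" is recent.
        return manager.reclaim_idle(now=100.0)
```

Running `asyncio.run(demo())` reclaims only conversation `"a"`. A real implementation would also need to close the underlying Hermes runtime and record the reclaim, which this sketch leaves out.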
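## 5. Relationship to the Envelope
Persistent source:
- `Conversation.agent_state.runtime_state.hermes`
Runtime source:
- `HermesSessionManager`
Principles:
- warm sessions improve performance
- durable metadata guarantees recoverability
- no Hermes process can be required to live forever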
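To illustrate how the durable metadata might be written to and read back from the envelope, here is a small sketch. It assumes `agent_state` is a JSON-like dict with the `runtime_state.hermes` path named above. The helper names and the metadata fields in the usage example (`session_id`, `last_error`) are hypothetical; `restart_count` is one of the fields the session manager already tracks.

```python
import copy
from typing import Any, Dict, Optional


def write_hermes_metadata(
    agent_state: Optional[dict], metadata: Dict[str, Any]
) -> dict:
    """Return a copy of agent_state with runtime_state.hermes replaced."""
    updated = copy.deepcopy(agent_state) if agent_state else {}
    runtime_state = updated.get("runtime_state")
    if not isinstance(runtime_state, dict):
        # Missing or malformed runtime_state: start from an empty dict.
        runtime_state = {}
        updated["runtime_state"] = runtime_state
    runtime_state["hermes"] = dict(metadata)
    return updated


def read_hermes_metadata(agent_state: Optional[dict]) -> Optional[Dict[str, Any]]:
    """Return the persisted Hermes metadata, or None if it is unusable."""
    try:
        hermes = agent_state["runtime_state"]["hermes"]
    except (KeyError, TypeError):
        return None
    return dict(hermes) if isinstance(hermes, dict) else None


# Usage: persist after a turn, read back after a backend restart.
stored = write_hermes_metadata(
    {"runtime_state": {"other": 1}},
    {"session_id": "s-1", "restart_count": 2, "last_error": None},
)
restored = read_hermes_metadata(stored)
```

The writer returns a new dict instead of mutating its input, so the caller decides when to assign it back to the conversation record. Whether that matches how `Conversation.agent_state` is actually persisted is an assumption.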
## 6. Recommended File Changes
- `backend/app/services/agent_runtime/hermes_session_manager.py`
- `backend/app/services/agent_runtime/hermes_runtime.py`
- `backend/app/services/agent_service.py`
- `backend/app/models/conversation.py`
- add or extend tests: session resume / recreate / restart / idle reclaim
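## 7. Completion Criteria
- [ ] a conversation can recover its correct Hermes session or get a newly rebuilt one
- [ ] warm / resumed / cold states are distinguishable
- [ ] continuity does not break outright after a backend restart
- [ ] recovery and failure events are clearly recorded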