feat(frontend): add memory components, temple/war-room pages, and composables
- Add DailyDigestCard and ReminderToast memory components
- Add temple and war-room page routes
- Add memory API module with TypeScript definitions
- Add chat composables: useClientTime, useDailyDigest, useSidebarPlan
- Simplify chat/logs/settings pages (remove unused code)
- Add settingsPage.css
development-doc/plan/memory-update/phase-m-5-recall-injection.md
@@ -0,0 +1,228 @@
# Phase M.5: Memory Recall Injection

Date: 2026-04-05
Status: Planning
Dependencies: M.1 (importance scoring), M.4 (automatic extraction)
Effort: 2 days

---
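
## 1. Purpose of This Phase

Have Jarvis **automatically** inject relevant memories into the LLM's system prompt on every conversation turn, so the AI genuinely "remembers" the user.

Current problems:
- M.1-M.4 build and manage memories, but the LLM never sees them when it generates a reply
- `memory_service.recall_memories()` exists but is never called in the conversation route
- The memory store has content, yet conversations are not personalized: memory and conversation are two isolated systems

---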

## 2. Core Architecture

```
User sends a message
        │
        ▼
MemoryRecallInjector
  ├── retrieve_relevant()   # semantic search against the current message
  ├── rank_by_importance()  # sort by M.1 importance score
  ├── budget_tokens()       # cap the number of injected tokens (limit 800)
  └── format_context()      # format as a system prompt fragment
        │
        ▼
Append the memory context to the LLM system prompt
        │
        ▼
LLM generates the reply (with personalized context)
        │
        ▼
Trigger M.2 reinforcement (frequency +1 for recalled memories)
```

---

## 3. Core Implementation

### 3.1 MemoryRecallInjector

```python
class MemoryRecallInjector:
    def __init__(self, memory_service):
        # Assumption: the service is passed in; it must expose an async
        # recall_memories() method.
        self.memory_service = memory_service

    async def build_context(
        self,
        user_id: str,
        current_message: str,
        token_budget: int = 800,
    ) -> str:
        """
        Retrieve memories relevant to the current message and assemble
        them into a system prompt fragment.
        """
        # 1. Semantic retrieval of the most relevant memories
        candidates = await self.memory_service.recall_memories(
            user_id=user_id,
            query=current_message,
            top_k=20,
        )

        # 2. Filter out archived memories (M.2 decay < 0.2 is not injected)
        active = [m for m in candidates if not m.is_archived]

        # 3. Rank by importance score combined with relevance
        ranked = self._rank(active, current_message)

        # 4. Enforce the token budget so memories do not crowd out the conversation
        selected = self._budget_select(ranked, token_budget)

        # 5. Format
        return self._format(selected)

    def _format(self, memories: list[UserMemory]) -> str:
        if not memories:
            return ""
        lines = ["[Memories about you]"]
        for m in memories:
            lines.append(f"- {m.content}")
        return "\n".join(lines)
```
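
To make the shape of the injected fragment concrete, the sketch below mirrors the formatting step on stand-in data. `FakeMemory` and `format_memories` are hypothetical names used only for illustration; the real `UserMemory` model is not shown in this plan, and the sample memory texts are invented.

```python
from dataclasses import dataclass


@dataclass
class FakeMemory:
    """Hypothetical stand-in for UserMemory; only `content` is needed here."""
    content: str


def format_memories(memories: list[FakeMemory]) -> str:
    """Mirror of the `_format` step, written as a free function."""
    if not memories:
        # No memories selected: return an empty string so the caller
        # can skip appending anything to the system prompt.
        return ""
    lines = ["[Memories about you]"]
    lines.extend(f"- {m.content}" for m in memories)
    return "\n".join(lines)


fragment = format_memories([
    FakeMemory("Works as a backend engineer"),
    FakeMemory("Prefers short answers with code first"),
])
print(fragment)
```

An empty selection yields an empty string, which is what lets the route in 3.2 leave the base system prompt untouched.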

### 3.2 Injection Point: Conversation Route

```python
# routers/conversation.py

# `...` stands for parameters elided in this plan. `current_user` and
# `background_tasks` are assumed to come from them; `memory_injector`,
# `memory_reinforcement`, `llm`, `base_system_prompt`, and
# `conversation_messages` are assumed to be available in scope.
@router.post("/api/conversations/{conversation_id}/messages")
async def send_message(conversation_id: str, body: MessageRequest, ...):
    # 1. Recall and injection
    memory_context = await memory_injector.build_context(
        user_id=current_user.id,
        current_message=body.content,
    )

    # 2. Assemble the system prompt
    system_prompt = base_system_prompt
    if memory_context:
        system_prompt = f"{system_prompt}\n\n{memory_context}"

    # 3. Send to the LLM
    response = await llm.chat(
        messages=conversation_messages,
        system=system_prompt,
    )

    # 4. Trigger memory reinforcement (background task, non-blocking)
    background_tasks.add_task(
        memory_reinforcement.trigger_by_query,
        user_id=current_user.id,
        query=body.content,
    )

    return response
```
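
The reinforcement step is only named in this plan, not specified. As a rough illustration of what "frequency +1 for recalled memories" could mean, here is a minimal in-memory sketch. `StoredMemory`, `reinforce`, and the dict-based store are all hypothetical; the actual M.2 reinforcement service may work differently (the route above passes the query rather than a list of memory IDs).

```python
from dataclasses import dataclass


@dataclass
class StoredMemory:
    """Hypothetical record holding only the fields this sketch touches."""
    id: str
    frequency_count: int = 0


def reinforce(store: dict[str, StoredMemory], recalled_ids: list[str]) -> int:
    """Add 1 to frequency_count for each recalled memory.

    Returns the number of memories that were actually updated.
    """
    updated = 0
    for memory_id in set(recalled_ids):  # count each memory once per recall
        memory = store.get(memory_id)
        if memory is None:
            # The memory may have been deleted between recall and reinforcement.
            continue
        memory.frequency_count += 1
        updated += 1
    return updated


store = {"m1": StoredMemory("m1", 3), "m2": StoredMemory("m2")}
updated = reinforce(store, ["m1", "m1", "missing"])
print(updated, store["m1"].frequency_count, store["m2"].frequency_count)
```

Passing the IDs of the memories that were actually injected would avoid running retrieval a second time in the background task, at the cost of returning those IDs from `build_context`.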

### 3.3 Ranking Logic

```python
def _rank(
    self,
    memories: list[UserMemory],
    query: str,
) -> list[UserMemory]:
    """
    Combined ranking: weighted sum of semantic relevance and importance.
    - The importance score comes from the M.1 ImportanceScorer
    - The relevance score comes from vector distance (already computed by mem0)
    """
    def score(m: UserMemory) -> float:
        # Cosine similarity from recall; fall back to 0.5 only when it is
        # missing, so a genuine similarity of 0.0 is not overwritten.
        relevance = m.similarity_score if m.similarity_score is not None else 0.5
        importance = m.importance_score  # from M.1
        # Despite its name, this boost depends on the memory type, not on age.
        recency_boost = 1.0 if m.memory_type in ("goal", "pain_point") else 0.8
        return relevance * 0.6 + importance * 0.4 * recency_boost

    return sorted(memories, key=score, reverse=True)
```
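
A worked example may help when tuning the weights. The sketch below applies the same formula to three invented memories; `ScoredMemory` is a hypothetical stand-in for `UserMemory`, and every score value is made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoredMemory:
    """Hypothetical stand-in for UserMemory with only the ranking fields."""
    content: str
    memory_type: str
    importance_score: float
    similarity_score: Optional[float] = None


def score(m: ScoredMemory) -> float:
    # None means no similarity was returned by recall; use the 0.5 default.
    relevance = m.similarity_score if m.similarity_score is not None else 0.5
    type_boost = 1.0 if m.memory_type in ("goal", "pain_point") else 0.8
    return relevance * 0.6 + m.importance_score * 0.4 * type_boost


memories = [
    # 0.9 * 0.6 + 0.5 * 0.4 * 0.8 = 0.54 + 0.16  = 0.70
    ScoredMemory("Lives in Berlin", "fact", 0.5, 0.9),
    # 0.7 * 0.6 + 0.9 * 0.4 * 1.0 = 0.42 + 0.36  = 0.78
    ScoredMemory("Wants to ship v1 this month", "goal", 0.9, 0.7),
    # 0.5 * 0.6 + 0.3 * 0.4 * 0.8 = 0.30 + 0.096 = 0.396
    ScoredMemory("Had a dentist appointment", "event", 0.3, None),
]

ranked = sorted(memories, key=score, reverse=True)
for m in ranked:
    print(f"{score(m):.3f}  {m.memory_type:5}  {m.content}")
```

With these numbers the goal outranks the fact even though the fact is the closer semantic match, which is the effect the 0.4 importance weight and the type boost are meant to produce.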

---

## 4. Token Budget Control

Memory injection cannot grow without limit, or it will squeeze out the context available to the conversation itself.

```python
def _budget_select(
    self,
    memories: list[UserMemory],
    token_budget: int,
) -> list[UserMemory]:
    """
    Greedy selection: take memories in rank order until the token budget runs out.
    Rough estimate: 1 memory ≈ 30 tokens
    """
    selected = []
    used = 20  # fixed overhead of the "[Memories about you]\n" header
    for m in memories:
        cost = len(m.content) // 2 + 10  # rough estimate
        if used + cost > token_budget:
            break
        selected.append(m)
        used += cost
    return selected
```

**Default budget: 800 tokens** (about 26 memories at roughly 30 tokens each), adjustable in config.
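
The figure of about 26 memories follows from the estimate above: (800 - 20) / 30 = 26. The sketch below reproduces the greedy selection on plain strings to check that arithmetic. `estimate_cost` and `budget_select` are hypothetical free-function versions of the logic in `_budget_select`, and the 40-character contents are synthetic.

```python
def estimate_cost(content: str) -> int:
    """Rough token estimate used in this plan: half the length plus 10."""
    return len(content) // 2 + 10


def budget_select(
    contents: list[str],
    token_budget: int,
    header_cost: int = 20,
) -> tuple[list[str], int]:
    """Greedily take items in order until the next one would exceed the budget."""
    selected: list[str] = []
    used = header_cost
    for content in contents:
        cost = estimate_cost(content)
        if used + cost > token_budget:
            break  # stop at the first item that does not fit
        selected.append(content)
        used += cost
    return selected, used


# 30 synthetic memories, each estimated at 40 // 2 + 10 = 30 tokens.
contents = ["x" * 40] * 30
selected, used = budget_select(contents, token_budget=800)
print(len(selected), used)
```

Because the loop uses `break`, a long memory near the top of the ranking ends selection even if shorter, lower-ranked memories would still fit; switching to `continue` would fill the remaining budget instead. The character-based estimate is only a heuristic, and a real tokenizer count could be substituted for `estimate_cost`.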
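
---

## 5. Memory Type Priority

Different memory types have different injection priorities:

| Type | Priority | Notes |
|------|----------|-------|
| `pain_point` | Highest | Problems that repeatedly trouble the user; should be surfaced every time |
| `goal` | High | The user's goals; they steer the direction of the answer |
| `preference` | Medium | Affects answer style (brevity, code first, and so on) |
| `fact` | Medium | Basic facts (occupation, location, tech stack) |
| `event` | Low | Today's events; highly time-sensitive and down-weighted once stale |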
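
The ranking code in 3.3 only separates `goal`/`pain_point` from everything else. If the finer ordering in this table is wanted, one option is a per-type multiplier. The sketch below is an assumption, not part of the plan: `TYPE_PRIORITY`, `DEFAULT_PRIORITY`, and all numeric values are hypothetical and chosen only to preserve the table's ordering.

```python
# Hypothetical multipliers; only their relative order reflects the table.
TYPE_PRIORITY: dict[str, float] = {
    "pain_point": 1.0,   # highest
    "goal": 0.95,        # high
    "preference": 0.85,  # medium
    "fact": 0.85,        # medium
    "event": 0.7,        # low
}
DEFAULT_PRIORITY = 0.8  # fallback for memory types not listed above


def type_priority(memory_type: str) -> float:
    """Return the multiplier for a memory type, or the default if unknown."""
    return TYPE_PRIORITY.get(memory_type, DEFAULT_PRIORITY)


# Sorting is stable, so equal-priority types keep their listed order.
order = sorted(TYPE_PRIORITY, key=type_priority, reverse=True)
print(order)
```

Such a multiplier could replace the two-level boost in the scoring function; the down-weighting of stale events would still need a timestamp, which this sketch does not model.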

---

## 6. Core Files

### 6.1 New Files

| File | Responsibility |
|------|----------------|
| `services/memory/recall_injector.py` | Memory recall and injection |
| `tests/services/test_recall_injector.py` | Injection tests |

### 6.2 Modified Files

| File | Change |
|------|--------|
| `routers/conversation.py` | Inject the memory context before sending the message |
| `services/memory_service.py` | `recall_memories()` returns similarity scores |

---
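
## 7. Acceptance Criteria

| Criterion | Description |
|-----------|-------------|
| Injection takes effect | LLM replies reflect the user's personal information |
| Stays within the token budget | Injected content ≤ 800 tokens |
| High priority first | goal/pain_point are injected before fact |
| Archived memories are not injected | Memories with decay < 0.2 do not appear in the context |
| Does not block the response | Injection takes < 100 ms (in-memory / vector retrieval) |
| Reinforcement is triggered | Recalled memories get frequency_count +1 |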
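
The non-blocking criterion could be enforced at the call site rather than only measured. The sketch below is one possible approach and not something this plan specifies: it wraps the context build in `asyncio.wait_for` and falls back to an empty context on timeout or error. `build_context_or_empty` and `timeout_s` are hypothetical names, and `fast` and `slow` are stand-ins for the real injector call.

```python
import asyncio
from typing import Awaitable, Callable


async def build_context_or_empty(
    build_context: Callable[[], Awaitable[str]],
    timeout_s: float = 0.1,
) -> str:
    """Return the memory context, or "" if building it is slow or fails."""
    try:
        return await asyncio.wait_for(build_context(), timeout=timeout_s)
    except Exception:
        # Fail open: a reply without memories is preferable to a blocked
        # or failed request. Timeouts are covered by this clause as well.
        return ""


async def fast() -> str:
    return "[Memories about you]\n- Likes Python"


async def slow() -> str:
    await asyncio.sleep(1.0)  # simulates a slow vector search
    return "never used"


fast_result = asyncio.run(build_context_or_empty(fast))
slow_result = asyncio.run(build_context_or_empty(slow, timeout_s=0.05))
print(repr(fast_result), repr(slow_result))
```

In the route from 3.2, the call would become something like `build_context_or_empty(lambda: memory_injector.build_context(...))`, so the existing `if memory_context:` check handles the fallback without further changes.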

---

## 8. Effort Estimate

| Task | Effort |
|------|--------|
| MemoryRecallInjector implementation | 0.5 days |
| Conversation route integration | 0.5 days |
| Token budget + ranking tuning | 0.5 days |
| Tests | 0.5 days |
| **Total** | **2 days** |