# Notes: Jarvis Knowledge Brain Blueprint
## Current-State Findings
- Source domains already exist as separate systems: conversations, documents, todos, tasks, and forum posts.
- Long-term memory currently comes only from conversation extraction via `UserMemory`.
- The graph build path currently uses only indexed document chunks.
- Scheduler infrastructure already exists and can host daily brain-learning jobs.
- The frontend already exposes a `知识大脑` ("Knowledge Brain") navigation entry, but it currently points to the graph page.
## Synthesized Findings
### What can be reused
- `memory_service` as a seed for conversation extraction and recall.
- `scheduler_service` as the base for daily learning workflows.
- `tag_service` as an early foundation for brain tags.
- Existing business tables as authoritative raw source records.
### What is missing
- Unified event layer across all source systems.
- Candidate memory layer between raw events and durable brain memory.
- Timeline-aware memory model with reinforcement / archival states.
- Retrieval path that combines long-term memory with recent relevant events.
- Brain-specific APIs and a dedicated frontend dashboard module.
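As a rough illustration of two of the gaps above (a timeline-aware memory model with reinforcement / archival states, and retrieval that combines long-term memory with recent events), here is a minimal sketch. All names (`MemoryState`, `TimelineMemory`, `RecentEvent`, `reinforce`, `archive_stale`, `retrieve`), the thresholds, and the word-overlap matching are assumptions made for illustration, not part of the existing codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum


class MemoryState(Enum):
    ACTIVE = "active"
    REINFORCED = "reinforced"
    ARCHIVED = "archived"


@dataclass
class TimelineMemory:
    # Hypothetical durable memory record with timeline fields.
    text: str
    last_seen: datetime
    hits: int = 1
    state: MemoryState = MemoryState.ACTIVE


@dataclass
class RecentEvent:
    # Hypothetical raw event from any source system.
    text: str
    occurred_at: datetime


def reinforce(memory, now, threshold=3):
    """Record a new sighting; enough sightings mark the memory as reinforced.

    A sighting also revives an archived memory, because the state is reset here.
    """
    memory.hits += 1
    memory.last_seen = now
    if memory.hits >= threshold:
        memory.state = MemoryState.REINFORCED
    else:
        memory.state = MemoryState.ACTIVE


def archive_stale(memories, now, max_age=timedelta(days=90)):
    """Archive every memory that has not been seen within max_age."""
    for memory in memories:
        if now - memory.last_seen > max_age:
            memory.state = MemoryState.ARCHIVED


def retrieve(query, memories, events, now, window=timedelta(days=7)):
    """Combine non-archived memories with events inside the recency window.

    Word overlap stands in for whatever relevance scoring is eventually used.
    """
    terms = set(query.lower().split())

    def matches(text):
        return bool(terms & set(text.lower().split()))

    hits = [m.text for m in memories
            if m.state is not MemoryState.ARCHIVED and matches(m.text)]
    hits += [e.text for e in events
             if now - e.occurred_at <= window and matches(e.text)]
    return hits
```

In this sketch, archival hides a memory from retrieval without deleting it, so a later sighting can bring it back through `reinforce`.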
### Phase 1 objective
- Build the minimum architecture needed for a real event-driven brain:
  - `BrainEvent`
  - `BrainCandidate`
  - `BrainMemory`
  - `BrainTag` and link tables
  - ingestion services
  - daily learning job
  - retrieval integration
  - brain dashboard APIs
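One way the Phase 1 pieces could fit together is sketched below: raw events are grouped into candidates, and a daily job promotes well-supported candidates into durable memories. The model names come from the list above, but every field, the `ingest` and `daily_learning_job` functions, the `min_evidence` threshold, and the use of the source name as a tag are assumptions made for illustration. Real extraction would replace the simple text normalization used here.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class BrainEvent:
    source: str        # e.g. "conversation", "document", "todo", "task", "forum_post"
    source_id: str     # key of the authoritative business record
    summary: str
    occurred_at: datetime


@dataclass
class BrainCandidate:
    statement: str
    evidence: list = field(default_factory=list)  # supporting BrainEvent items


@dataclass
class BrainMemory:
    statement: str
    tags: set
    first_seen: datetime
    last_seen: datetime


def ingest(events, candidates):
    """Group events into candidates keyed by a normalized summary (placeholder logic)."""
    for event in events:
        key = event.summary.strip().lower()
        candidate = candidates.setdefault(
            key, BrainCandidate(statement=event.summary.strip())
        )
        candidate.evidence.append(event)


def daily_learning_job(candidates, memories, min_evidence=2):
    """Promote candidates with enough evidence; return the promoted keys."""
    promoted = []
    for key, candidate in list(candidates.items()):
        if len(candidate.evidence) < min_evidence:
            continue  # not enough support yet; keep it as a candidate
        times = [e.occurred_at for e in candidate.evidence]
        tags = {e.source for e in candidate.evidence}
        existing = memories.get(key)
        if existing is not None:
            # Reinforce an existing memory instead of duplicating it.
            existing.tags |= tags
            existing.last_seen = max(existing.last_seen, max(times))
        else:
            memories[key] = BrainMemory(
                candidate.statement, tags, min(times), max(times)
            )
        promoted.append(key)
        del candidates[key]
    return promoted
```

Keeping `source` and `source_id` on each event leaves the existing business tables as the authoritative records, with the brain layer holding only references and derived statements.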
## Additional Findings: Knowledge Parsing Normalization
- Document ingestion currently parses each format separately and builds chunks directly from `ParsedNode` items.
- Current chunks already carry structural metadata, but there is no explicit parent-child chunk graph.
- The agreed direction is to use MinerU for PDF only, keep existing parsers for DOCX/XLSX/CSV/MD/TXT, and converge all outputs into structured markdown.
- `normalized_content` should be persisted on documents so that preview, rebuild, and future chunking can reuse the same canonical text.
- Lightweight hierarchy should be represented in chunk metadata first, not in a new relational tree schema.
- The current DOCX upload failure is caused by `python-docx` not being installed in the active backend environment.
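To make the "hierarchy in chunk metadata" idea concrete, the sketch below splits structured markdown into chunks and records each chunk's parent and heading path in a plain metadata dict, with no relational tree. It assumes `normalized_content` is already structured markdown with ATX (`#`) headings; the `Chunk` class, the `chunk_markdown` function, and the metadata keys (`kind`, `level`, `parent_id`, `heading_path`) are hypothetical. For brevity it does not special-case `#` lines inside fenced code blocks.

```python
import re
from dataclasses import dataclass, field

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


@dataclass
class Chunk:
    chunk_id: int
    text: str
    metadata: dict = field(default_factory=dict)


def chunk_markdown(normalized_content):
    """Split markdown into heading and body chunks with parent links in metadata."""
    chunks = []
    stack = []  # (level, chunk_id, title) for each heading currently open
    body = []   # body lines collected since the last heading

    def flush_body():
        text = "\n".join(body).strip()
        body.clear()
        if text:
            chunks.append(Chunk(len(chunks), text, {
                "kind": "body",
                "parent_id": stack[-1][1] if stack else None,
                "heading_path": [title for _, _, title in stack],
            }))

    for line in normalized_content.splitlines():
        match = HEADING.match(line)
        if not match:
            body.append(line)
            continue
        flush_body()
        level, title = len(match.group(1)), match.group(2).strip()
        # Close headings at the same or a deeper level before opening this one.
        while stack and stack[-1][0] >= level:
            stack.pop()
        chunk_id = len(chunks)
        chunks.append(Chunk(chunk_id, title, {
            "kind": "heading",
            "level": level,
            "parent_id": stack[-1][1] if stack else None,
            "heading_path": [t for _, _, t in stack],
        }))
        stack.append((level, chunk_id, title))
    flush_body()
    return chunks
```

Because the links live in metadata, an explicit parent-child graph can still be derived later by following `parent_id`, without a schema change now.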