# Notes: Jarvis Knowledge Brain Blueprint

## Current-State Findings

- Source domains already exist as separate systems: conversations, documents, todos, tasks, forum posts.
- Long-term memory currently comes only from conversation extraction via `UserMemory`.
- The graph build path currently uses only indexed document chunks.
- Scheduler infrastructure already exists and can host daily brain-learning jobs.
- The frontend already exposes a `知识大脑` ("Knowledge Brain") navigation entry, but it currently points to the graph page.

## Synthesized Findings

### What can be reused

- `memory_service` as a seed for conversation extraction and recall.
- `scheduler_service` as the base for daily learning workflows.
- `tag_service` as an early foundation for brain tags.
- Existing business tables as authoritative raw source records.

### What is missing

- Unified event layer across all source systems.
- Candidate memory layer between raw events and durable brain memory.
- Timeline-aware memory model with reinforcement / archival states.
- Retrieval path that combines long-term memory with recent relevant events.
- Brain-specific APIs and a dedicated frontend dashboard module.

### Phase 1 objective

- Build the minimum architecture needed for a real event-driven brain:
  - `BrainEvent`
  - `BrainCandidate`
  - `BrainMemory`
  - `BrainTag` and link tables
  - ingestion services
  - daily learning job
  - retrieval integration
  - brain dashboard APIs

## Additional Findings: Knowledge Parsing Normalization

- Document ingestion currently parses each format separately and builds chunks directly from `ParsedNode` items.
- Chunks already carry structural metadata, but there is no explicit parent-child chunk graph.
- The agreed direction is to use MinerU for PDF only, keep the existing parsers for DOCX/XLSX/CSV/MD/TXT, and converge all outputs into structured markdown.
- `normalized_content` should be persisted on documents so preview, rebuild, and future chunking can reuse the same canonical text.
- Lightweight hierarchy should be represented in chunk metadata first, not in a new relational tree schema.
- The current DOCX upload failure in the running environment is caused by a missing `python-docx` installation in the active backend environment.
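
The metadata-first hierarchy described above could look roughly like the following. This is a minimal sketch, not the project's chunker: the `Chunk` class, the `chunk_markdown` function, and the metadata keys (`level`, `title`, `parent_id`, `heading_path`) are all hypothetical names chosen for illustration. It assumes the canonical text is structured markdown with ATX (`#`) headings, and it stores the parent-child relation purely in chunk metadata.

```python
from __future__ import annotations

import re
from dataclasses import dataclass, field

# ATX heading: one to six '#' characters, whitespace, then the title.
HEADING_RE = re.compile(r"^(#{1,6})\s+(.*\S)\s*$")


@dataclass
class Chunk:
    """Hypothetical chunk record; hierarchy lives only in `metadata`."""
    chunk_id: int
    text: str = ""
    metadata: dict = field(default_factory=dict)


def chunk_markdown(normalized_content: str) -> list[Chunk]:
    """Split structured markdown into one chunk per heading section."""
    chunks: list[Chunk] = []
    stack: list[Chunk] = []  # open ancestor sections, outermost first
    current: Chunk | None = None
    in_fence = False

    for line in normalized_content.splitlines():
        # Do not treat '#' lines inside fenced code blocks as headings.
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
        match = None if in_fence else HEADING_RE.match(line)

        if match:
            level, title = len(match.group(1)), match.group(2)
            # Close every open section at the same or a deeper level.
            while stack and stack[-1].metadata["level"] >= level:
                stack.pop()
            parent = stack[-1] if stack else None
            parent_path = parent.metadata["heading_path"] if parent else []
            current = Chunk(
                chunk_id=len(chunks),
                metadata={
                    "level": level,
                    "title": title,
                    "parent_id": parent.chunk_id if parent else None,
                    "heading_path": parent_path + [title],
                },
            )
            chunks.append(current)
            stack.append(current)
        elif current is not None:
            current.text += line + "\n"
        elif line.strip():
            # Text before the first heading becomes a parentless root chunk.
            current = Chunk(
                chunk_id=len(chunks),
                text=line + "\n",
                metadata={"level": 0, "title": None,
                          "parent_id": None, "heading_path": []},
            )
            chunks.append(current)
    return chunks


sample = "# Guide\nIntro\n## Install\nRun pip.\n## Usage\n### CLI\nFlags.\n"
for chunk in chunk_markdown(sample):
    path = " > ".join(chunk.metadata["heading_path"])
    print(chunk.chunk_id, chunk.metadata["parent_id"], path)
```

Because `parent_id` and `heading_path` are plain metadata fields, a retrieval step can walk up to a parent section or filter by path prefix without any new tables. A relational tree could still be introduced later if metadata lookups prove too limited.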