feat: enhance agent orchestration, knowledge flow and UI refinements
561
docs/superpowers/specs/2026-03-25-schedule-planner-design.md
Normal file
@@ -0,0 +1,561 @@
# Schedule Planner Agent Redesign

## Goal

Replace the current planner role with a schedule-focused planning system that analyzes conversation history, the task board, and forum signals to produce actionable scheduling recommendations for the user.

## Scope

This redesign covers both the main planner role and its subagents across backend orchestration, prompts, routing, scheduled execution, todo generation, frontend presentation, and related tests.

## User-Approved Direction

- Replace the current path-planning semantics with schedule-planning semantics.
- Redesign both the main planner role and its subagents.
- Inputs for planning:
  - conversation history
  - task board
  - forum information
- Output style:
  - conclusion first
  - executable schedule next
- Trigger modes:
  - when the user explicitly asks for scheduling advice
  - at a fixed daily time
- Daily scheduled analysis should write actionable suggestions into todo items.

## Architecture

### Main Role

The current `planner` role will be replaced at the system level by a new role id:

- `schedule_planner`

Its responsibility is no longer “find the shortest execution path for a goal.” Instead, it becomes the scheduling brain that:

1. understands current commitments and pressure signals
2. evaluates urgency, importance, dependency, and timing
3. recommends near-term scheduling actions
4. converts useful scheduled guidance into concrete todo items when triggered by the daily scheduler

### Subagents

The existing planner subagent structure will be redesigned into two schedule-specific subagents:

- `schedule_analysis`
  - analyzes conversation history, task board state, and forum signals
  - identifies priorities, pressure points, conflicts, dependencies, risks, and things that can be delayed

- `schedule_planning`
  - converts analysis into an execution-oriented schedule recommendation
  - outputs conclusion first, then a practical schedule proposal
  - when running from the daily scheduled workflow, produces todo-ready action items

### Trigger Paths

#### Interactive Trigger

When the user asks questions such as:

- what should I do today
- how should I arrange this week
- based on my recent work, what should I focus on next
- help me schedule upcoming work

The master agent should route to `schedule_planner`.

The expected response shape:

1. current conclusion
2. today / near-term schedule recommendation
3. next actions

#### Daily Scheduled Trigger

A daily scheduled job invokes the schedule planner flow automatically.

The daily run should:

1. collect relevant context from conversation history, tasks, and forum data
2. run `schedule_analysis`
3. run `schedule_planning`
4. convert only actionable, non-duplicate recommendations into todo items

The daily run should not dump raw analysis into todos. Only concise, action-worthy, user-meaningful recommendations become todos.

## Data Flow

### Inputs

The schedule planning system should read from three sources:

1. **Conversation history**
   - recent user intent
   - commitments implied in prior discussion
   - stated priorities, urgency, and unresolved threads

2. **Task board**
   - open items
   - current statuses
   - stalled work
   - high-priority or overdue work

3. **Forum information**
   - new items requiring attention
   - external pressure or discussion signals
   - updates that may change priority

### Internal Processing

The main flow should be:

- Master decides scheduling intent
- `schedule_planner` receives context
- `schedule_analysis` identifies priority structure
- `schedule_planning` produces human-usable output
- scheduled mode additionally writes selected suggestions into todos

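
The processing order above can be sketched as a small pipeline function. This is only an illustration under stated assumptions: `run_schedule_planner`, its parameters, and the returned dictionary shape are hypothetical names, and the two subagents and the todo writer are passed in as plain callables rather than real orchestration nodes.

```python
from typing import Callable


def run_schedule_planner(
    context: dict,
    mode: str,
    analyze: Callable[[dict], dict],
    plan: Callable[[dict, str], dict],
    write_todos: Callable[[list], int],
) -> dict:
    """Run analysis, then planning; write todos only in scheduled mode.

    All names here are hypothetical. `analyze` stands in for
    `schedule_analysis`, `plan` for `schedule_planning`.
    """
    analysis = analyze(context)
    recommendation = plan(analysis, mode)

    todos_created = 0
    if mode == "scheduled":
        # Interactive runs never touch the todo list.
        todos_created = write_todos(recommendation.get("todo_candidates", []))

    return {"recommendation": recommendation, "todos_created": todos_created}
```

Passing the stages in as callables keeps the ordering testable without any model or database behind it.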
### Outputs

#### Interactive Output

The default answer structure should be:

- conclusion first
- suggested schedule second
- next actions last

#### Scheduled Output

The scheduled run should create todo entries with:

- concise action phrasing
- enough context to be actionable
- source attribution where useful (conversation/task/forum)
- duplicate avoidance

## Migration Strategy

This redesign uses a two-phase migration to avoid breaking stored state and UI rendering.

### Phase 1: Compatibility Window

- accept legacy `planner` values from stored traces, mock payloads, and historical records
- normalize legacy `planner` to `schedule_planner` at read boundaries where practical
- accept legacy `planner_scope` and `planner_steps` as read-only legacy values and normalize them to `schedule_analysis` and `schedule_planning`
- write only the new ids going forward:
  - `schedule_planner`
  - `schedule_analysis`
  - `schedule_planning`

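
A read-boundary normalizer for the compatibility window might look like the sketch below. The three id pairs come from this spec; the names `LEGACY_ID_MAP` and `normalize_agent_id` are hypothetical and do not refer to existing code.

```python
# Legacy -> new id pairs as listed in this spec.
# The constant and function names are hypothetical.
LEGACY_ID_MAP = {
    "planner": "schedule_planner",
    "planner_scope": "schedule_analysis",
    "planner_steps": "schedule_planning",
}


def normalize_agent_id(agent_id: str) -> str:
    """Map a legacy id to its replacement; pass any other id through unchanged."""
    return LEGACY_ID_MAP.get(agent_id, agent_id)
```

Keeping the mapping in one place means Phase 2 can remove legacy acceptance by deleting a single table and its call sites.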
### Phase 2: Legacy Removal

After the migration is complete and all active UI payloads, mock data, and tests are updated:

- remove legacy id acceptance from orchestration and frontend display logic
- remove legacy mock fixtures
- keep migration code out of prompts and core scheduling behavior

### Migration Scope

The migration must cover:

- backend enums and routing
- frontend agent ids and telemetry labels
- stored trace rendering paths
- mock data used by agent dashboards and chat orchestration views
- tests that still refer to `planner`, `planner_scope`, or `planner_steps`

## Input Contracts

The schedule planning system reads from three sources with explicit limits.

### Conversation History Contract

- use recent conversation history from the current user context
- default retrieval window: last 7 days of relevant conversation turns, capped at the latest 50 turns
- prefer turns that include commitments, priorities, deadlines, blockers, or future-oriented intent
- if conversation history is unavailable, continue with degraded confidence

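
The retrieval window can be illustrated with a short filter. This is a sketch, not the actual retrieval code: it assumes each turn is a dict with a `created_at` datetime, the function name is hypothetical, and relevance ranking (commitments, deadlines, blockers) is deliberately left out.

```python
from datetime import datetime, timedelta

WINDOW_DAYS = 7   # default retrieval window from the contract
MAX_TURNS = 50    # cap on the number of turns


def select_recent_turns(turns: list[dict], now: datetime) -> list[dict]:
    """Keep turns from the last 7 days, then cap to the latest 50 (oldest first).

    Assumes each turn carries a `created_at` datetime; that field name is
    an assumption made for this example.
    """
    cutoff = now - timedelta(days=WINDOW_DAYS)
    recent = [turn for turn in turns if turn["created_at"] >= cutoff]
    recent.sort(key=lambda turn: turn["created_at"])
    return recent[-MAX_TURNS:]
```

An empty result is a valid outcome here; per the contract the caller would continue with degraded confidence rather than fail.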
### Task Board Contract

- include open, in-progress, blocked, overdue, and high-priority tasks
- exclude completed and archived items by default
- include enough task metadata to reason about urgency and dependency:
  - title
  - status
  - priority
  - due date if present
  - last updated time if present
- if task data is unavailable, continue with degraded confidence

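
As a rough illustration of this contract, the sketch below drops excluded tasks and projects only the listed metadata. The status strings and the field names `due_date` and `updated_at` are assumptions; the real task model may name them differently.

```python
# Status values are assumed for illustration.
EXCLUDED_STATUSES = {"completed", "archived"}


def select_tasks(tasks: list[dict]) -> list[dict]:
    """Drop completed and archived tasks and keep only the contract's metadata."""
    selected = []
    for task in tasks:
        if task["status"] in EXCLUDED_STATUSES:
            continue
        selected.append({
            "title": task["title"],
            "status": task["status"],
            "priority": task.get("priority"),
            "due_date": task.get("due_date"),      # optional in the contract
            "updated_at": task.get("updated_at"),  # optional in the contract
        })
    return selected
```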
### Forum Information Contract

- include recent forum items that may affect user priorities
- default retrieval window: last 7 days of relevant forum signals
- forum signals may include:
  - new posts requiring attention
  - replies or escalations
  - updates that change urgency or expected follow-up
- if forum data is unavailable, continue with degraded confidence

## Output Contracts

### `schedule_analysis` Output Schema

The analysis stage should produce a structured summary with these fields:

- `top_priorities`: list of current highest-priority focus areas
- `risks`: list of risk or pressure signals
- `conflicts`: list of timing or dependency conflicts
- `deferrable_items`: list of lower-priority items that can be delayed
- `evidence`: source references grouped by `conversation`, `task_board`, or `forum`
- `confidence`: one of `high`, `medium`, `low`

### `schedule_planning` Output Schema

The planning stage should produce a structured recommendation with these fields:

- `conclusion`: short decision-oriented summary
- `today_plan`: list of suggested actions for the current day or immediate next window
- `near_term_plan`: list of actions for the next few days or current week
- `next_actions`: short ordered action list
- `todo_candidates`: only present in scheduled mode; candidate todo items derived from the recommendation
- `confidence`: one of `high`, `medium`, `low`

### `todo_candidates` Schema

Each `todo_candidate` must use this structure:

- `title`: required short action text
- `description`: required short rationale grounded in source context
- `sources`: required list of provenance objects
- `priority`: optional normalized priority such as `high`, `medium`, `low`
- `target_window`: optional string such as `today` or `this_week`

Each provenance object in `sources` must contain:

- `type`: one of `conversation`, `task_board`, `forum`
- `id`: source object id when available, otherwise a stable synthetic reference
- `label`: short human-readable source label

### Evidence Structure

Each item in `schedule_analysis.evidence` must contain:

- `type`: one of `conversation`, `task_board`, `forum`
- `id`: source object id when available, otherwise a stable synthetic reference
- `label`: short human-readable identifier
- `reason`: brief explanation of why the signal matters to scheduling

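
One way to pin these schemas down in the backend is a set of dataclasses. The field names below are taken from the lists above; the class names, the use of plain strings for list items, and the flat `evidence` list (grouping by `type` can be derived from it) are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Literal, Optional

SourceType = Literal["conversation", "task_board", "forum"]
Confidence = Literal["high", "medium", "low"]


@dataclass
class Provenance:
    type: SourceType
    id: str      # source object id, or a stable synthetic reference
    label: str


@dataclass
class Evidence(Provenance):
    reason: str  # why the signal matters to scheduling


@dataclass
class TodoCandidate:
    title: str
    description: str
    sources: list[Provenance]
    priority: Optional[str] = None        # e.g. "high", "medium", "low"
    target_window: Optional[str] = None   # e.g. "today", "this_week"


@dataclass
class ScheduleAnalysis:
    top_priorities: list[str]
    risks: list[str]
    conflicts: list[str]
    deferrable_items: list[str]
    evidence: list[Evidence]
    confidence: Confidence


@dataclass
class SchedulePlan:
    conclusion: str
    today_plan: list[str]
    near_term_plan: list[str]
    next_actions: list[str]
    confidence: Confidence
    todo_candidates: Optional[list[TodoCandidate]] = None  # scheduled mode only
```

Note that `Literal` annotations document the allowed values but are not enforced at runtime; validation would need a separate step.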
### Interactive Response Contract

The user-facing answer should always follow this shape:

1. conclusion
2. suggested schedule
3. next actions

If confidence is low, the response must say that explicitly and avoid overconfident scheduling language.

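
The response shape can be sketched as a simple renderer over the `schedule_planning` fields. The function name, the section headings, and the wording of the low-confidence note are all assumptions; only the ordering and the explicit low-confidence statement come from the contract.

```python
def render_response(plan: dict) -> str:
    """Render conclusion, suggested schedule, then next actions, in that order."""
    lines = []
    if plan["confidence"] == "low":
        # The contract requires low confidence to be stated explicitly.
        lines.append("Note: confidence is low because some context was unavailable.")
    lines.append(f"Conclusion: {plan['conclusion']}")
    lines.append("Suggested schedule:")
    lines += [f"- {item}" for item in plan["today_plan"] + plan["near_term_plan"]]
    lines.append("Next actions:")
    lines += [f"{i}. {action}" for i, action in enumerate(plan["next_actions"], start=1)]
    return "\n".join(lines)
```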
## Daily Scheduler Contract

The daily scheduled trigger must follow explicit execution semantics.

### Execution Model

- run once per user per local date
- default execution time: 07:00 in the user's configured timezone
- if the user has no configured timezone, skip the run and log the skip reason
- do not automatically backfill missed runs
- enforce idempotency by `(user_id, local_date, job_type)` so the same daily analysis is not executed more than once successfully

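
A minimal sketch of the idempotency key, assuming the user's timezone is available as a `tzinfo` object (for example one built with the standard library's `zoneinfo.ZoneInfo`). The function name and the `JOB_TYPE` value are hypothetical.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional

JOB_TYPE = "daily_schedule_planning"  # hypothetical job type name


def idempotency_key(user_id: str, user_tz: Optional[tzinfo], now_utc: datetime) -> Optional[tuple]:
    """Return (user_id, local_date, job_type), or None when no timezone is configured.

    A None result means the caller should skip the run and log the reason.
    `now_utc` must be timezone-aware.
    """
    if user_tz is None:
        return None
    local_date = now_utc.astimezone(user_tz).date()
    return (user_id, local_date.isoformat(), JOB_TYPE)


# Example: 23:30 UTC on 24 March is already 25 March for a user at UTC+8.
example_key = idempotency_key(
    "u1",
    timezone(timedelta(hours=8)),
    datetime(2026, 3, 24, 23, 30, tzinfo=timezone.utc),
)
```

Deriving the date in the user's timezone rather than in UTC is what makes "once per local date" hold for users far from UTC. A unique constraint on the key in the job store would then reject a second successful run.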
### Scheduled Mode Behavior

A successful scheduled run should:

1. gather available context from the three input sources
2. execute `schedule_analysis`
3. execute `schedule_planning`
4. create todo items from selected `todo_candidates`
5. store run telemetry and outcome metadata

If one or more sources are missing, continue when there is still enough evidence to produce a useful recommendation and mark confidence as reduced.

Signal evaluation rules:

- a **strong source** is a source with enough current evidence to support prioritization on its own, such as multiple open high-priority tasks or a recent forum escalation
- a **meaningful signal** is a discrete scheduling-relevant item extracted from any source, such as an overdue task, a stated commitment in conversation history, or a forum escalation
- the planner may still run with one strong source
- scheduled mode may create todos only when at least two meaningful signals exist across all inputs

If fewer than two meaningful signals are available across all sources, the scheduler should not create todos and should log a low-context outcome.

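
The two-signal threshold reduces to a count across sources. In this sketch the function name and the dict-of-lists input shape are assumptions; how a signal is extracted and judged "meaningful" is out of scope here.

```python
MIN_SIGNALS_FOR_TODOS = 2  # threshold stated in the rules above


def may_create_todos(signals_by_source: dict[str, list]) -> bool:
    """True when at least two meaningful signals exist across all inputs.

    `signals_by_source` maps a source name to its extracted signals;
    the shape is assumed for this example.
    """
    total = sum(len(signals) for signals in signals_by_source.values())
    return total >= MIN_SIGNALS_FOR_TODOS
```

Two signals from a single source satisfy the rule as written, since the count is taken across all inputs rather than per source.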
Delayed execution rule:

- if the 07:00 run is delayed by temporary outage or worker unavailability, the system may still execute one delayed run later on the same user-local date
- if the entire local date passes without a successful run, do not backfill on the next day

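
The delayed-run rule can be expressed as a check a worker performs whenever it wakes up. This is a sketch with hypothetical names; it assumes the caller passes the current time already converted to the user's local timezone and the set of local dates that have a successful run.

```python
from datetime import datetime, time

RUN_AT = time(7, 0)  # default execution time in the user's local timezone


def should_run_now(now_local: datetime, succeeded_dates: set) -> bool:
    """True once 07:00 local has passed and today has no successful run yet.

    Only today's date is checked, so a date that passed without a
    successful run is never backfilled.
    """
    if now_local.time() < RUN_AT:
        return False
    return now_local.date() not in succeeded_dates
```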
## Todo Creation Rules

Todo creation is the main scheduled side effect and must be tightly constrained.

### Creation Rules

- create at most 3 todo items per daily run
- only create todos for actions that are concrete, near-term, and user-actionable
- do not create todos for vague advice, reflections, or duplicated reminders
- store source provenance when available:
  - `conversation`
  - `task_board`
  - `forum`

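
The per-run cap and duplicate skipping can be combined in one selection pass. The sketch below uses hypothetical names and takes the duplicate check as a callable; it assumes candidates arrive already ordered by usefulness, which this spec does not state.

```python
from typing import Callable

MAX_TODOS_PER_RUN = 3  # cap from the creation rules


def select_candidates(candidates: list, is_duplicate: Callable[[object], bool]) -> tuple[list, int]:
    """Return (candidates to create, count skipped as duplicates).

    Stops as soon as the per-run cap is reached; later candidates are
    neither created nor counted as duplicates.
    """
    selected, skipped = [], 0
    for candidate in candidates:
        if len(selected) == MAX_TODOS_PER_RUN:
            break
        if is_duplicate(candidate):
            skipped += 1
            continue
        selected.append(candidate)
    return selected, skipped
```

The skipped count is returned so it can feed scheduler telemetry.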
### Duplicate Detection

A candidate todo is considered a duplicate if there is already an open todo that matches all of the following:

- same normalized action text
- same source category or same source object when available
- created within the last 7 days

Normalization rules for action text:

- trim surrounding whitespace
- collapse repeated internal whitespace to a single space
- lowercase Latin characters
- remove trailing full stop / period punctuation only

Source comparison rules:

- if a provenance object includes a stable source `id`, compare by `(type, id)`
- if no stable source id exists, compare by `(type, normalized label)`
- if multiple sources support one recommendation, compare against the highest-priority provenance in this order: `task_board`, `forum`, `conversation`

When a duplicate is detected:

- do not create a new todo
- record the skip reason in scheduler telemetry

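
The normalization and comparison rules above can be sketched as follows. Function names and dict shapes are assumptions: candidates follow the `todo_candidates` schema, and open todos are assumed to carry `title`, `source_type`, `source_id`, `source_label`, and a `created_at` datetime. Two simplifications are worth noting: `str.lower()` lowercases every cased script, not only Latin, and only the ASCII `.` is stripped.

```python
import re
from datetime import datetime, timedelta

SOURCE_ORDER = ["task_board", "forum", "conversation"]  # highest priority first
DUPLICATE_WINDOW_DAYS = 7


def normalize_text(text: str) -> str:
    """Trim, collapse internal whitespace, lowercase, drop trailing periods."""
    text = re.sub(r"\s+", " ", text.strip())
    return text.lower().rstrip(".")


def source_key(sources: list[dict]) -> tuple:
    """Build the comparison key from the highest-priority provenance."""
    primary = min(sources, key=lambda s: SOURCE_ORDER.index(s["type"]))
    if primary.get("id"):
        return (primary["type"], primary["id"])
    return (primary["type"], normalize_text(primary["label"]))


def is_duplicate(candidate: dict, open_todos: list[dict], now: datetime) -> bool:
    """True if an open todo from the last 7 days matches text and source."""
    cutoff = now - timedelta(days=DUPLICATE_WINDOW_DAYS)
    cand_text = normalize_text(candidate["title"])
    cand_key = source_key(candidate["sources"])
    for todo in open_todos:
        if todo["created_at"] < cutoff:
            continue
        todo_key = (
            todo["source_type"],
            todo.get("source_id") or normalize_text(todo["source_label"]),
        )
        if normalize_text(todo["title"]) == cand_text and todo_key == cand_key:
            return True
    return False
```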
### Todo Fields

Scheduled-created todos should include at minimum these persisted fields:

- `title`: required
- `description`: required
- `source_type`: required primary provenance type
- `source_id`: optional stable source id
- `source_label`: required fallback human-readable provenance label
- `created_by`: required and set to `schedule_planner`
- `created_at`: required timestamp
- `priority`: optional normalized priority
- `target_window`: optional normalized scheduling window

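
Mapping a `todo_candidate` onto these persisted fields might look like the sketch below. The function name is hypothetical, and choosing the primary provenance with the same `task_board`, `forum`, `conversation` order used for duplicate detection is an assumption; the field list above does not say how the primary source is picked.

```python
from datetime import datetime

SOURCE_ORDER = ["task_board", "forum", "conversation"]  # assumed primary-source order


def to_todo_record(candidate: dict, now: datetime) -> dict:
    """Flatten a todo candidate into the persisted todo fields."""
    primary = min(candidate["sources"], key=lambda s: SOURCE_ORDER.index(s["type"]))
    return {
        "title": candidate["title"],
        "description": candidate["description"],
        "source_type": primary["type"],
        "source_id": primary.get("id"),       # optional
        "source_label": primary["label"],
        "created_by": "schedule_planner",
        "created_at": now,
        "priority": candidate.get("priority"),            # optional
        "target_window": candidate.get("target_window"),  # optional
    }
```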
## Routing Boundaries

The system must distinguish scheduling from adjacent planning behaviors.

### Route to `schedule_planner` when the user asks for:

- planning for today or this week
- what to focus on next
- priority ordering across ongoing work
- time-aware sequencing of current commitments

### Do not route to `schedule_planner` when the user asks for:

- deep implementation planning for a feature
- code execution or task fulfillment
- research-only retrieval
- pure analysis without scheduling intent

In ambiguous cases such as "what should I do next?", prefer `schedule_planner` when the available context includes multiple active tasks, recent commitments, or forum pressure signals.

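
The tie-break for ambiguous queries can be written as a small predicate. This is a sketch only: the function and parameter names are hypothetical, "multiple active tasks" is read as two or more, and one recent commitment or one forum pressure signal is assumed to be enough, which the rule above does not spell out.

```python
def prefer_schedule_planner(
    active_tasks: int,
    recent_commitments: int,
    forum_pressure_signals: int,
) -> bool:
    """Decide whether an ambiguous query should go to `schedule_planner`.

    Thresholds are an interpretation of the routing rule, not part of it.
    """
    return (
        active_tasks >= 2
        or recent_commitments >= 1
        or forum_pressure_signals >= 1
    )
```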
## Backend Changes

### Role and Graph Layer

Update the orchestration layer so the planner role is redefined as `schedule_planner` rather than `planner`.

Files likely involved:

- `backend/app/agents/state.py`
- `backend/app/agents/graph.py`
- `backend/app/agents/prompts.py`
- `backend/app/routers/agent.py`
- `backend/app/services/agent_service.py`

Required changes:

- rename role ids where appropriate
- update graph node registration
- update master routing rules
- replace planner subagent mappings
- update telemetry and sub-commander trace labels

### Prompt Layer

Replace the current planner prompt family with schedule-specific instructions.

Needed prompt families:

- `SCHEDULE_PLANNER_SYSTEM_PROMPT`
- `SCHEDULE_ANALYSIS_PROMPT`
- `SCHEDULE_PLANNING_PROMPT`

Prompt requirements:

- reason over conversation history, tasks, and forum state
- prioritize urgency, importance, and dependency
- avoid abstract productivity advice
- produce concrete, immediate scheduling output
- in scheduled mode, generate todo-worthy suggestions only

### Scheduled Execution Layer

Add or update the daily scheduled workflow so it can call the schedule planner flow automatically.

Likely touchpoints:

- scheduler service
- existing daily planning jobs
- todo creation services

Required behavior:

- fixed daily execution time
- fetch relevant context
- call schedule planner pipeline
- write selected recommendations into todos
- skip duplicate todo creation

## Frontend Changes

The frontend needs to reflect the new role system consistently.

Files likely involved:

- `frontend/src/data/agents.ts`
- `frontend/src/pages/agents/index.vue`
- `frontend/src/components/chat/OrchestrationPanel.vue`
- `frontend/src/pages/chat/composables/useChatView.ts`
- related frontend tests

Required updates:

- replace planner display labels with schedule planner labels
- rename planner subagents to schedule analysis / schedule planning
- update orchestration telemetry labels
- update example mock state and tests
- use these exact frontend ids:
  - `schedule_planner`
  - `schedule_analysis`
  - `schedule_planning`
- use these exact default Chinese labels:
  - `日程规划师`
  - `日程分析员`
  - `日程编排员`
- update active route visualization and commander skill labels to the new ids

## Naming

### Main Agent

- old: `planner`
- new: `schedule_planner`
- display role: `日程规划师`

### Subagents

- old: `planner_scope`
- new: `schedule_analysis`
- display role: `日程分析员`

- old: `planner_steps`
- new: `schedule_planning`
- display role: `日程编排员`

## Constraints

- do not keep dual role names for long-term compatibility unless a specific dependency forces it
- do not create todos for every suggestion
- do not turn the planner into a generic life coach
- keep scheduling grounded in current project signals
- preserve the existing agent architecture where possible, while fully changing planner semantics

## Observability

The redesign must emit enough telemetry to debug routing and scheduled execution.

Required telemetry fields:

- selected main route
- selected subagent
- available input sources
- missing input sources
- run mode: `interactive` or `scheduled`
- confidence level
- todos created count
- todos skipped as duplicates count
- scheduler run success / skipped / failed

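
The required telemetry fields could be carried in a single record, sketched here as a dataclass. The class and attribute names are hypothetical; only the set of fields and the allowed values for run mode, confidence, and scheduler status come from this spec.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class PlannerTelemetry:
    main_route: str
    subagent: Optional[str]
    available_sources: list[str]
    missing_sources: list[str]
    run_mode: str                 # "interactive" or "scheduled"
    confidence: str               # "high", "medium", or "low"
    todos_created: int = 0
    todos_skipped_duplicate: int = 0
    scheduler_status: Optional[str] = None  # "success", "skipped", or "failed"

    def to_log_payload(self) -> dict:
        """Return a plain dict suitable for structured logging."""
        return asdict(self)
```

Leaving `scheduler_status` as `None` for interactive runs keeps one record type usable for both run modes.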
## Acceptance Criteria

### Backend Acceptance Criteria

- a scheduling-intent user query routes to `schedule_planner`
- `schedule_analysis` and `schedule_planning` are both reachable through the orchestration layer
- legacy planner ids are normalized during the compatibility window
- daily scheduled runs do not execute more than once per user per local date
- low-context daily runs do not create todos
- duplicate todo candidates are skipped instead of recreated

### Frontend Acceptance Criteria

- the agents page displays `日程规划师` instead of the previous planner label
- the planner subagent chips display `日程分析员` and `日程编排员`
- orchestration mock data and route highlights use the new ids
- tests no longer depend on `planner_scope` or `planner_steps` after migration is complete

### Failure and Fallback Criteria

- if forum data is missing, the planner still runs with degraded confidence
- if task board data is missing, the planner still runs with degraded confidence when other strong context exists
- if fewer than two meaningful signals are available, scheduled mode creates no todos
- if the user has no timezone configured, the daily scheduled run is skipped and logged

## Testing Strategy

### Backend

Add or update tests for:

- master routing to `schedule_planner`
- schedule subagent selection behavior
- prompt invariants for schedule-focused output
- scheduled daily run creates todos from actionable suggestions
- duplicate todo protection

### Frontend

Add or update tests for:

- renamed main role and subagent labels
- orchestration panel route display
- active subagent telemetry
- mock agent graph data using `schedule_planner`, `schedule_analysis`, and `schedule_planning`

## Risks

1. **Broad rename surface**
   - `planner` is referenced across backend and frontend, so a full rename must be systematic

2. **Scheduled todo spam**
   - daily runs may create low-value or duplicate todos unless filtered carefully

3. **Prompt drift**
   - if prompts stay too abstract, the new agent will be renamed in label only and will not actually behave as a scheduling-oriented agent

## Recommendation

Implement this as a real role-system redesign, not as a display-only rename. The role id, subagent ids, prompt family, routing logic, and frontend telemetry should all align on the new scheduling semantics so the system remains internally coherent.