# Schedule Planner Agent Redesign
## Goal
Replace the current planner role with a schedule-focused planning system that analyzes conversation history, the task board, and forum signals to produce actionable scheduling recommendations for the user.
## Scope
This redesign covers both the main planner role and its subagents across backend orchestration, prompts, routing, scheduled execution, todo generation, frontend presentation, and related tests.
## User-Approved Direction
- Replace the current path-planning semantics with schedule-planning semantics.
- Redesign both the main planner role and its subagents.
- Inputs for planning:
  - conversation history
  - task board
  - forum information
- Output style:
  - conclusion first
  - executable schedule next
- Trigger modes:
  - when the user explicitly asks for scheduling advice
  - at a fixed daily time
- Daily scheduled analysis should write actionable suggestions into todo items.
## Architecture
### Main Role
The current `planner` role will be replaced at the system level by a new role id:
- `schedule_planner`
Its responsibility is no longer “find the shortest execution path for a goal.” Instead, it becomes the scheduling brain that:
1. understands current commitments and pressure signals
2. evaluates urgency, importance, dependency, and timing
3. recommends near-term scheduling actions
4. converts useful scheduled guidance into concrete todo items when triggered by the daily scheduler
### Subagents
The existing planner subagent structure will be redesigned into two schedule-specific subagents:
- `schedule_analysis`
  - analyzes conversation history, task board state, and forum signals
  - identifies priorities, pressure points, conflicts, dependencies, risks, and things that can be delayed
- `schedule_planning`
  - converts analysis into an execution-oriented schedule recommendation
  - outputs conclusion first, then a practical schedule proposal
  - when running from the daily scheduled workflow, produces todo-ready action items
### Trigger Paths
#### Interactive Trigger
When the user asks questions such as:
- what should I do today
- how should I arrange this week
- based on my recent work, what should I focus on next
- help me schedule upcoming work
The master agent should route to `schedule_planner`.
The expected response shape:
1. current conclusion
2. today / near-term schedule recommendation
3. next actions
#### Daily Scheduled Trigger
A daily scheduled job invokes the schedule planner flow automatically.
The daily run should:
1. collect relevant context from conversation history, tasks, and forum data
2. run `schedule_analysis`
3. run `schedule_planning`
4. convert only actionable, non-duplicate recommendations into todo items
The daily run should not dump raw analysis into todos. Only concise, action-worthy, user-meaningful recommendations become todos.
## Data Flow
### Inputs
The schedule planning system should read from three sources:
1. **Conversation history**
   - recent user intent
   - commitments implied in prior discussion
   - stated priorities, urgency, and unresolved threads
2. **Task board**
   - open items
   - current statuses
   - stalled work
   - high-priority or overdue work
3. **Forum information**
   - new items requiring attention
   - external pressure or discussion signals
   - updates that may change priority
### Internal Processing
The main flow should be:
- Master decides scheduling intent
- `schedule_planner` receives context
- `schedule_analysis` identifies priority structure
- `schedule_planning` produces human-usable output
- scheduled mode additionally writes selected suggestions into todos
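
The flow above can be sketched as a small pipeline function. This is a minimal illustration, not the project's orchestration code: `run_schedule_planner`, the `analyze` / `plan` / `write_todos` callables, and the dict shapes are all hypothetical stand-ins for the real graph nodes and services.

```python
from typing import Callable, Optional


def run_schedule_planner(
    context: dict,
    analyze: Callable[[dict], dict],
    plan: Callable[[dict, str], dict],
    mode: str,
    write_todos: Optional[Callable[[list], None]] = None,
) -> dict:
    """Run analysis, then planning; write todos only in scheduled mode."""
    if mode not in ("interactive", "scheduled"):
        raise ValueError(f"unknown run mode: {mode!r}")
    analysis = analyze(context)            # schedule_analysis stage
    recommendation = plan(analysis, mode)  # schedule_planning stage
    # Only the scheduled path has the todo-writing side effect.
    if mode == "scheduled" and write_todos is not None:
        write_todos(recommendation.get("todo_candidates", []))
    return recommendation
```

Passing the stages in as callables keeps the sketch independent of any particular agent framework; in the real system they would be the registered graph nodes.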
### Outputs
#### Interactive Output
The default answer structure should be:
- conclusion first
- suggested schedule second
- next actions last
#### Scheduled Output
The scheduled run should create todo entries with:
- concise action phrasing
- enough context to be actionable
- source attribution where useful (conversation/task/forum)
- duplicate avoidance
## Migration Strategy
This redesign uses a two-phase migration to avoid breaking stored state and UI rendering.
### Phase 1: Compatibility Window
- accept legacy `planner` values from stored traces, mock payloads, and historical records
- normalize legacy `planner` to `schedule_planner` at read boundaries where practical
- accept legacy `planner_scope` and `planner_steps` as read-only legacy values and normalize them to `schedule_analysis` and `schedule_planning`
- write only the new ids going forward:
  - `schedule_planner`
  - `schedule_analysis`
  - `schedule_planning`
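
Read-boundary normalization during the compatibility window could look like the sketch below. The id mapping comes from this spec; the function names and the assumption that a stored trace keeps its ids under `role` and `subagent` keys are hypothetical.

```python
# Legacy-to-new id mapping, as listed in the Naming section of this spec.
LEGACY_ROLE_IDS = {
    "planner": "schedule_planner",
    "planner_scope": "schedule_analysis",
    "planner_steps": "schedule_planning",
}


def normalize_role_id(role_id: str) -> str:
    """Map a legacy planner id to its new id; pass other ids through unchanged."""
    return LEGACY_ROLE_IDS.get(role_id, role_id)


def normalize_trace(trace: dict) -> dict:
    """Return a copy of a stored trace with its role ids normalized.

    Assumes (hypothetically) that ids live under the "role" and "subagent" keys.
    """
    normalized = dict(trace)
    for key in ("role", "subagent"):
        value = normalized.get(key)
        if isinstance(value, str):
            normalized[key] = normalize_role_id(value)
    return normalized
```

Keeping the mapping in one table makes Phase 2 removal a single deletion rather than a search across call sites.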
### Phase 2: Legacy Removal
After the migration is complete and all active UI payloads, mock data, and tests are updated:
- remove legacy id acceptance from orchestration and frontend display logic
- remove legacy mock fixtures
- keep migration code out of prompts and core scheduling behavior
### Migration Scope
The migration must cover:
- backend enums and routing
- frontend agent ids and telemetry labels
- stored trace rendering paths
- mock data used by agent dashboards and chat orchestration views
- tests that still refer to `planner`, `planner_scope`, or `planner_steps`
## Input Contracts
The schedule planning system reads from three sources with explicit limits.
### Conversation History Contract
- use recent conversation history from the current user context
- default retrieval window: last 7 days of relevant conversation turns, capped at the latest 50 turns
- prefer turns that include commitments, priorities, deadlines, blockers, or future-oriented intent
- if conversation history is unavailable, continue with degraded confidence
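
The retrieval window can be illustrated with a short filter. This sketch only applies the 7-day and 50-turn limits; relevance ranking is out of scope here, and the `Turn` shape and `select_recent_turns` name are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class Turn:
    # Hypothetical minimal turn shape; the real model may differ.
    text: str
    created_at: datetime  # timezone-aware timestamp


def select_recent_turns(turns, now, window_days=7, max_turns=50):
    """Keep turns from the last `window_days`, capped at the latest `max_turns`."""
    cutoff = now - timedelta(days=window_days)
    recent = [t for t in turns if t.created_at >= cutoff]
    recent.sort(key=lambda t: t.created_at)  # oldest first
    return recent[-max_turns:]               # keep only the latest turns
```

An empty result corresponds to the "conversation history is unavailable" case, where the planner continues with degraded confidence.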
### Task Board Contract
- include open, in-progress, blocked, overdue, and high-priority tasks
- exclude completed and archived items by default
- include enough task metadata to reason about urgency and dependency:
  - title
  - status
  - priority
  - due date if present
  - last updated time if present
- if task data is unavailable, continue with degraded confidence
### Forum Information Contract
- include recent forum items that may affect user priorities
- default retrieval window: last 7 days of relevant forum signals
- forum signals may include:
  - new posts requiring attention
  - replies or escalations
  - updates that change urgency or expected follow-up
- if forum data is unavailable, continue with degraded confidence
## Output Contracts
### `schedule_analysis` Output Schema
The analysis stage should produce a structured summary with these fields:
- `top_priorities`: list of current highest-priority focus areas
- `risks`: list of risk or pressure signals
- `conflicts`: list of timing or dependency conflicts
- `deferrable_items`: list of lower-priority items that can be delayed
- `evidence`: source references grouped by `conversation`, `task_board`, or `forum`
- `confidence`: one of `high`, `medium`, `low`
### `schedule_planning` Output Schema
The planning stage should produce a structured recommendation with these fields:
- `conclusion`: short decision-oriented summary
- `today_plan`: list of suggested actions for the current day or immediate next window
- `near_term_plan`: list of actions for the next few days or current week
- `next_actions`: short ordered action list
- `todo_candidates`: only present in scheduled mode; candidate todo items derived from the recommendation
- `confidence`: one of `high`, `medium`, `low`
### `todo_candidates` Schema
Each `todo_candidate` must use this structure:
- `title`: required short action text
- `description`: required short rationale grounded in source context
- `sources`: required list of provenance objects
- `priority`: optional normalized priority such as `high`, `medium`, `low`
- `target_window`: optional string such as `today` or `this_week`
Each provenance object in `sources` must contain:
- `type`: one of `conversation`, `task_board`, `forum`
- `id`: source object id when available, otherwise a stable synthetic reference
- `label`: short human-readable source label
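
One possible encoding of the `todo_candidates` schema is a pair of dataclasses. This is a sketch under assumptions: the class names are hypothetical, and treating "required list" as "non-empty list" is an interpretation, not something the spec states.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

SOURCE_TYPES = {"conversation", "task_board", "forum"}


@dataclass(frozen=True)
class Source:
    type: str   # one of SOURCE_TYPES
    id: str     # source object id, or a stable synthetic reference
    label: str  # short human-readable source label

    def __post_init__(self):
        if self.type not in SOURCE_TYPES:
            raise ValueError(f"unknown source type: {self.type!r}")


@dataclass(frozen=True)
class TodoCandidate:
    title: str
    description: str
    sources: Tuple[Source, ...]
    priority: Optional[str] = None       # e.g. "high", "medium", "low"
    target_window: Optional[str] = None  # e.g. "today", "this_week"

    def __post_init__(self):
        if not self.title.strip() or not self.description.strip():
            raise ValueError("title and description are required")
        # Assumption: a required list of sources means at least one entry.
        if not self.sources:
            raise ValueError("at least one source is required")
```

For example, `TodoCandidate("Reply to escalation", "Forum thread escalated yesterday", (Source("forum", "post-42", "Escalation thread"),))` builds a valid candidate, with `post-42` being an invented id.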
### Evidence Structure
Each item in `schedule_analysis.evidence` must contain:
- `type`: one of `conversation`, `task_board`, `forum`
- `id`: source object id when available, otherwise a stable synthetic reference
- `label`: short human-readable identifier
- `reason`: brief explanation of why the signal matters to scheduling
### Interactive Response Contract
The user-facing answer should always follow this shape:
1. conclusion
2. suggested schedule
3. next actions
If confidence is low, the response must say that explicitly and avoid overconfident scheduling language.
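
A renderer for this response shape might look like the following sketch. The function name, section labels, and wording of the low-confidence notice are illustrative assumptions; only the ordering and the explicit low-confidence statement come from the contract above.

```python
def render_interactive_response(conclusion, schedule, next_actions, confidence):
    """Render conclusion first, then the schedule, then ordered next actions."""
    parts = ["Conclusion: " + conclusion]
    if confidence == "low":
        # The contract requires low confidence to be stated explicitly.
        parts.append("Confidence is low; treat the schedule below as tentative.")
    parts.append("Suggested schedule:")
    parts.extend(f"- {item}" for item in schedule)
    parts.append("Next actions:")
    parts.extend(f"{i}. {action}" for i, action in enumerate(next_actions, 1))
    return "\n".join(parts)
```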
## Daily Scheduler Contract
The daily scheduled trigger must follow explicit execution semantics.
### Execution Model
- run once per user per local date
- default execution time: 07:00 in the user's configured timezone
- if the user has no configured timezone, skip the run and log the skip reason
- do not automatically backfill missed runs
- enforce idempotency by `(user_id, local_date, job_type)` so that the daily analysis completes successfully at most once per user per local date
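
The idempotency key depends on the user's local date, not the server date. A minimal sketch, assuming the timezone is passed as a `tzinfo` object (in practice likely a `zoneinfo.ZoneInfo` built from the user's configured zone name); `JOB_TYPE` and `idempotency_key` are hypothetical names.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional, Tuple

JOB_TYPE = "daily_schedule_planning"  # hypothetical job type name

# Fixed-offset stand-in for a configured zone such as ZoneInfo("Asia/Shanghai").
CST = timezone(timedelta(hours=8))


def idempotency_key(
    user_id: str, now_utc: datetime, tz: Optional[tzinfo]
) -> Optional[Tuple[str, str, str]]:
    """Return (user_id, local_date, job_type), or None when no timezone is set."""
    if tz is None:
        return None  # caller should skip the run and log the skip reason
    local_date = now_utc.astimezone(tz).date()
    return (user_id, local_date.isoformat(), JOB_TYPE)
```

Storing this tuple under a unique constraint is one way to make a second successful run for the same user and local date fail fast.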
### Scheduled Mode Behavior
A successful scheduled run should:
1. gather available context from the three input sources
2. execute `schedule_analysis`
3. execute `schedule_planning`
4. create todo items from selected `todo_candidates`
5. store run telemetry and outcome metadata
If one or more sources are missing, continue when there is still enough evidence to produce a useful recommendation and mark confidence as reduced.
Signal evaluation rules:
- a **strong source** is a source with enough current evidence to support prioritization on its own, such as multiple open high-priority tasks or a recent forum escalation
- a **meaningful signal** is a discrete scheduling-relevant item extracted from any source, such as an overdue task, a stated commitment in conversation history, or a forum escalation
- the planner may still run with one strong source
- scheduled mode may create todos only when at least two meaningful signals exist across all inputs
If fewer than two meaningful signals are available across all sources, the scheduler should not create todos and should log a low-context outcome.
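
The two-signal threshold can be expressed as a small gate. The function names and the dict-of-lists input shape are assumptions for illustration; deciding what counts as a meaningful signal is left to the analysis stage.

```python
MIN_SIGNALS_FOR_TODOS = 2  # threshold stated in the rules above


def count_signals(signals_by_source: dict) -> int:
    """Count meaningful signals across all sources; missing sources count as zero."""
    return sum(len(signals or []) for signals in signals_by_source.values())


def scheduled_todo_decision(signals_by_source: dict) -> str:
    """Return "create_todos" or "low_context" for a scheduled run."""
    if count_signals(signals_by_source) >= MIN_SIGNALS_FOR_TODOS:
        return "create_todos"
    return "low_context"
```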
Delayed execution rule:
- if the 07:00 run is delayed by temporary outage or worker unavailability, the system may still execute one delayed run later on the same user-local date
- if the entire local date passes without a successful run, do not backfill on the next day
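
The delayed-run and no-backfill rules reduce to one check against the current local date. In this sketch, `should_run_now` and the `last_success_date` argument are hypothetical; the function only ever considers today's local date, which is what prevents backfilling a missed day.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo
from typing import Optional

RUN_AT = time(7, 0)  # default execution time in the user's timezone

# Fixed-offset stand-in for a user's configured timezone.
UTC_PLUS_8 = timezone(timedelta(hours=8))


def should_run_now(
    now_utc: datetime, tz: Optional[tzinfo], last_success_date: Optional[date]
) -> bool:
    """Decide whether the daily run may execute at this moment."""
    if tz is None:
        return False  # no configured timezone: skip and log
    local_now = now_utc.astimezone(tz)
    if local_now.time() < RUN_AT:
        return False  # too early on this local date
    # Run (possibly delayed) only if today's local date has no successful run.
    return last_success_date != local_now.date()
```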
## Todo Creation Rules
Todo creation is the main scheduled side effect and must be tightly constrained.
### Creation Rules
- create at most 3 todo items per daily run
- only create todos for actions that are concrete, near-term, and user-actionable
- do not create todos for vague advice, reflections, or duplicated reminders
- store source provenance when available:
  - `conversation`
  - `task_board`
  - `forum`
### Duplicate Detection
A candidate todo is considered a duplicate if there is already an open todo that matches all of the following:
- same normalized action text
- same source under the source comparison rules below (same source object when a stable id exists, otherwise same source type and normalized label)
- created within the last 7 days
Normalization rules for action text:
- trim surrounding whitespace
- collapse repeated internal whitespace to a single space
- lowercase Latin characters
- remove trailing full stop / period punctuation only
Source comparison rules:
- if a provenance object includes a stable source `id`, compare by `(type, id)`
- if no stable source id exists, compare by `(type, normalized label)`
- if multiple sources support one recommendation, compare against the highest-priority provenance in this order: `task_board`, `forum`, `conversation`
When a duplicate is detected:
- do not create a new todo
- record the skip reason in scheduler telemetry
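
The duplicate rules, together with the cap of 3 todos per run from the Creation Rules, can be sketched as follows. All function names and dict shapes are assumptions. Two simplifications are made deliberately: `str.lower()` lowercases every cased script rather than Latin only, and only the ASCII period is stripped.

```python
import re
from datetime import datetime, timedelta, timezone

SOURCE_PRIORITY = ("task_board", "forum", "conversation")
MAX_TODOS_PER_RUN = 3


def normalize_text(text: str) -> str:
    """Trim, collapse whitespace, lowercase, and drop trailing periods."""
    text = re.sub(r"\s+", " ", text.strip()).lower()
    # Strip trailing periods, then any space left in front of them.
    return text.rstrip(".").rstrip()


def source_key(type_, id_, label):
    """Compare by (type, id) when a stable id exists, else by (type, label)."""
    if id_:
        return (type_, "id", id_)
    return (type_, "label", normalize_text(label))


def primary_source_key(sources):
    """Use the highest-priority provenance: task_board, forum, conversation."""
    best = min(sources, key=lambda s: SOURCE_PRIORITY.index(s["type"]))
    return source_key(best["type"], best.get("id"), best["label"])


def is_duplicate(candidate, open_todos, now, window_days=7):
    """True if an open todo matches on text, source, and 7-day recency."""
    cutoff = now - timedelta(days=window_days)
    text = normalize_text(candidate["title"])
    key = primary_source_key(candidate["sources"])
    return any(
        normalize_text(t["title"]) == text
        and source_key(t["source_type"], t.get("source_id"), t["source_label"]) == key
        and t["created_at"] >= cutoff
        for t in open_todos
    )


def select_todos(candidates, open_todos, now):
    """Split candidates into (to_create, skipped_duplicates), capped per run."""
    to_create, skipped = [], []
    for candidate in candidates:
        if len(to_create) >= MAX_TODOS_PER_RUN:
            break
        if is_duplicate(candidate, open_todos, now):
            skipped.append(candidate)  # caller records the skip in telemetry
        else:
            to_create.append(candidate)
    return to_create, skipped
```

The sketch does not deduplicate candidates against each other within a single run; the spec does not define that case.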
### Todo Fields
Scheduled-created todos should include at minimum these persisted fields:
- `title`: required
- `description`: required
- `source_type`: required primary provenance type
- `source_id`: optional stable source id
- `source_label`: required fallback human-readable provenance label
- `created_by`: required and set to `schedule_planner`
- `created_at`: required timestamp
- `priority`: optional normalized priority
- `target_window`: optional normalized scheduling window
## Routing Boundaries
The system must distinguish scheduling from adjacent planning behaviors.
### Route to `schedule_planner` when the user asks for:
- today or this week planning
- what to focus on next
- priority ordering across ongoing work
- time-aware sequencing of current commitments
### Do not route to `schedule_planner` when the user asks for:
- deep implementation planning for a feature
- code execution or task fulfillment
- research-only retrieval
- pure analysis without scheduling intent
In ambiguous cases such as "what should I do next?", prefer `schedule_planner` when the available context includes multiple active tasks, recent commitments, or forum pressure signals.
## Backend Changes
### Role and Graph Layer
Update the orchestration layer so the planner role is redefined as `schedule_planner` rather than `planner`.
Files likely involved:
- `backend/app/agents/state.py`
- `backend/app/agents/graph.py`
- `backend/app/agents/prompts.py`
- `backend/app/routers/agent.py`
- `backend/app/services/agent_service.py`
Required changes:
- rename role ids where appropriate
- update graph node registration
- update master routing rules
- replace planner subagent mappings
- update telemetry and sub-commander trace labels
### Prompt Layer
Replace the current planner prompt family with schedule-specific instructions.
Needed prompt families:
- `SCHEDULE_PLANNER_SYSTEM_PROMPT`
- `SCHEDULE_ANALYSIS_PROMPT`
- `SCHEDULE_PLANNING_PROMPT`
Prompt requirements:
- reason over conversation history, tasks, and forum state
- prioritize urgency, importance, and dependency
- avoid abstract productivity advice
- produce concrete, immediate scheduling output
- in scheduled mode, generate todo-worthy suggestions only
### Scheduled Execution Layer
Add or update the daily scheduled workflow so it can call the schedule planner flow automatically.
Likely touchpoints:
- scheduler service
- existing daily planning jobs
- todo creation services
Required behavior:
- fixed daily execution time
- fetch relevant context
- call schedule planner pipeline
- write selected recommendations into todos
- skip duplicate todo creation
## Frontend Changes
The frontend needs to reflect the new role system consistently.
Files likely involved:
- `frontend/src/data/agents.ts`
- `frontend/src/pages/agents/index.vue`
- `frontend/src/components/chat/OrchestrationPanel.vue`
- `frontend/src/pages/chat/composables/useChatView.ts`
- related frontend tests
Required updates:
- replace planner display labels with schedule planner labels
- rename planner subagents to schedule analysis / schedule planning
- update orchestration telemetry labels
- update example mock state and tests
- use these exact frontend ids:
  - `schedule_planner`
  - `schedule_analysis`
  - `schedule_planning`
- use these exact default Chinese labels:
  - `日程规划师`
  - `日程分析员`
  - `日程编排员`
- update active route visualization and commander skill labels to the new ids
## Naming
### Main Agent
- old: `planner`
- new: `schedule_planner`
- display role: `日程规划师`
### Subagents
- old: `planner_scope`
- new: `schedule_analysis`
- display role: `日程分析员`
- old: `planner_steps`
- new: `schedule_planning`
- display role: `日程编排员`
## Constraints
- do not keep dual role names for long-term compatibility unless a specific dependency forces it
- do not create todos for every suggestion
- do not turn the planner into a generic life coach
- keep scheduling grounded in current project signals
- preserve the existing agent architecture where possible, while fully changing planner semantics
## Observability
The redesign must emit enough telemetry to debug routing and scheduled execution.
Required telemetry fields:
- selected main route
- selected subagent
- available input sources
- missing input sources
- run mode: `interactive` or `scheduled`
- confidence level
- todos created count
- todos skipped as duplicates count
- scheduler run success / skipped / failed
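
The required fields could be grouped into a single record per run. This is a sketch only: the class name, field names, and the choice to leave `scheduler_status` as `None` for interactive runs are assumptions, not part of the spec.

```python
from dataclasses import asdict, dataclass, field
from typing import List, Optional

RUN_MODES = {"interactive", "scheduled"}
SCHEDULER_STATUSES = {"success", "skipped", "failed"}


@dataclass
class PlannerRunTelemetry:
    main_route: str
    subagent: str
    run_mode: str    # "interactive" or "scheduled"
    confidence: str  # "high", "medium", or "low"
    available_sources: List[str] = field(default_factory=list)
    missing_sources: List[str] = field(default_factory=list)
    todos_created: int = 0
    todos_skipped_duplicate: int = 0
    scheduler_status: Optional[str] = None  # assumed None for interactive runs

    def __post_init__(self):
        if self.run_mode not in RUN_MODES:
            raise ValueError(f"unknown run mode: {self.run_mode!r}")
        if self.scheduler_status is not None and self.scheduler_status not in SCHEDULER_STATUSES:
            raise ValueError(f"unknown scheduler status: {self.scheduler_status!r}")
```

`asdict(record)` yields a plain dict that can be handed to whatever logging or tracing sink the backend already uses.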
## Acceptance Criteria
### Backend Acceptance Criteria
- a scheduling-intent user query routes to `schedule_planner`
- `schedule_analysis` and `schedule_planning` are both reachable through the orchestration layer
- legacy planner ids are normalized during the compatibility window
- daily scheduled runs do not execute more than once per user per local date
- low-context daily runs do not create todos
- duplicate todo candidates are skipped instead of recreated
### Frontend Acceptance Criteria
- the agents page displays `日程规划师` instead of the previous planner label
- the planner subagent chips display `日程分析员` and `日程编排员`
- orchestration mock data and route highlights use the new ids
- tests no longer depend on `planner_scope` or `planner_steps` after migration is complete
### Failure and Fallback Criteria
- if forum data is missing, the planner still runs with degraded confidence
- if task board data is missing, the planner still runs with degraded confidence when other strong context exists
- if fewer than two meaningful signals are available, scheduled mode creates no todos
- if the user has no timezone configured, the daily scheduled run is skipped and logged
## Testing Strategy
### Backend
Add or update tests for:
- master routing to `schedule_planner`
- schedule subagent selection behavior
- prompt invariants for schedule-focused output
- scheduled daily run creates todos from actionable suggestions
- duplicate todo protection
### Frontend
Add or update tests for:
- renamed main role and subagent labels
- orchestration panel route display
- active subagent telemetry
- mock agent graph data using `schedule_planner`, `schedule_analysis`, and `schedule_planning`
## Risks
1. **Broad rename surface**
   - `planner` is referenced across backend and frontend, so a full rename must be systematic
2. **Scheduled todo spam**
   - daily runs may create low-value or duplicate todos unless filtered carefully
3. **Prompt drift**
   - if prompts stay too abstract, the new agent will be renamed in appearance only and will not actually behave as a scheduler
## Recommendation
Implement this as a real role-system redesign, not as a display-only rename. The role id, subagent ids, prompt family, routing logic, and frontend telemetry should all align on the new scheduling semantics so the system remains internally coherent.