Timeboxing Agent
Stage-gated timeboxing workflow that builds daily schedules via conversational refinement and syncs to Google Calendar.
Status
| Subsystem | Status | Tests | Confirmed |
| --- | --- | --- | --- |
| Domain models (tb_models, tb_ops) | Implemented, Tested | 62 unit | 2025-07-22 |
| Sync engine (sync_engine, submitter, calendar_reconciliation) | Implemented, Tested | 37 + 10 unit | 2026-02-14 (duplicate-prevention reconciliation) |
| Patching (schema-in-prompt) | Implemented, Tested | 14 unit | 2025-07-22 (live LLM) |
| GraphFlow orchestration | Implemented, Documented | graphflow state machine tests | - |
| Skeleton pre-generation (AC1) | Implemented, Tested | test_timeboxing_skeleton_pre_generation.py | - |
| Calendar sync + undo controls | Implemented, Tested (submit-time baseline refresh + deterministic reconciliation summary in Stage 5) | test_timeboxing_submit_flow.py, test_slack_timebox_buttons.py | 2026-03-07 |
| Stage 1 constraint-template coverage UX | Implemented, Tested (collect-stage calendar anchor merge + anchor summary line) | test_timeboxing_stage_message_template_coverage.py, test_timeboxing_calendar_prefetch_feedback.py, test_timeboxing_durable_constraints.py | 2026-03-07 |
| Durable profile/date-span constraint auto-upsert + Stage 1 prefetch wait | Implemented, Tested | test_timeboxing_durable_constraints.py, test_timeboxing_constraint_memory_client_tool_name.py | - |
| Constraint-memory MCP payload decoding hardening | Implemented, Tested | test_timeboxing_constraint_memory_client_tool_name.py | - |
| Stage 1 lookup-first defaults + session override suppression | Implemented, Tested | test_timeboxing_durable_constraints.py, test_timeboxing_stage_gate_json_context.py | - |
| Stage 3 markdown-first skeleton overview | Implemented, Tested | test_timeboxing_skeleton_draft_contract.py | - |
| Stage 4 advisory quality facts (0-4) | Implemented, Tested | test_phase4_rewiring.py | - |
| Deterministic stage action buttons | Implemented, Tested | test_timeboxing_stage_actions.py, test_slack_timebox_stage_buttons.py | - |
| Structured-output strict tool contract | Implemented, Tested | test_timeboxing_constraint_search_tool_strict.py, test_timeboxing_flow.py | - |
File Index
Orchestration
| File | Responsibility |
| --- | --- |
| agent.py | PlanningCoordinator: owns Session, routes Slack messages, runs background tasks, manages stage transitions. Entry points: on_start(), on_commit_date(), on_user_reply(). |
| flow_graph.py | build_timeboxing_graphflow(): constructs the AutoGen GraphFlow DAG. Single source of truth for stage transitions and edge conditions. |
| stage_gating.py | TimeboxingStage enum, StageGateOutput model, LLM prompt templates for each stage gate. |
| contracts.py | Typed stage-context contracts (SkeletonContext, ConstraintContext, etc.): what each stage receives as input. |
| constants.py | Orchestration timeouts, limits, and fallback values. No magic numbers. |
Domain Models (LLM-Facing)
| File | Responsibility |
| --- | --- |
| tb_models.py | ET (event type enum), TBEvent, TBPlan, Timing union (AfterPrev, BeforeNext, FixedStart, FixedWindow), _ET_COLOR_MAP. Calendar-native, sync-friendly. |
| tb_ops.py | TBPatch, TBOp union (AddEvents, RemoveEvent, UpdateEvent, MoveEvent, ReplaceAll), apply_tb_ops(). Pure-function ops engine: deterministic plan mutation. |
| timebox.py | Legacy Timebox schema + schedule_and_validate(). Conversion: timebox_to_tb_plan(), tb_plan_to_timebox(). Kept for backward compatibility with Stage 3 drafting and Slack display. |
Calendar Sync
| File | Responsibility |
| --- | --- |
| sync_engine.py | plan_sync(), execute_sync(), undo_sync(), gcal_response_to_tb_plan(). Deterministic, incremental, reversible diff-and-apply via MCP. Uses reconciliation-first matching and DeepDiff only for matched update decisions. |
| calendar_reconciliation.py | Deterministic desired-vs-remote matching (id -> canonical -> fuzzy) and op-bucket planning (create/update/delete/noop/skip). |
| submitter.py | CalendarSubmitter: high-level submit_plan(), undo_last(), and undo_transaction() over the sync engine. |
| mcp_clients.py | McpCalendarClient (list/create/update/delete events via MCP), McpConstraintMemoryClient (Notion constraint MCP). Internal to coordinator. |
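The matching order described for calendar_reconciliation.py (id, then canonical, then fuzzy) can be illustrated with a small standalone sketch. This is not the repository's implementation: `Ev`, `canonical`, `similar`, `reconcile`, the similarity threshold, and the "same start plus similar title" fuzzy rule are all assumptions made for illustration. Only the bucket names and the `fftb` ownership prefix come from this README.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional

OWNED_PREFIX = "fftb"  # agent-owned events, per the Event Identity section


@dataclass(frozen=True)
class Ev:  # hypothetical stand-in for a calendar event
    id: Optional[str]
    title: str
    start: str
    end: str


def canonical(ev: Ev) -> tuple:
    # Assumed canonical key: normalized title plus exact times.
    return (ev.title.strip().lower(), ev.start, ev.end)


def similar(a: str, b: str) -> float:
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reconcile(desired: List[Ev], remote: List[Ev], threshold: float = 0.85) -> Dict[str, list]:
    buckets: Dict[str, list] = {"create": [], "update": [], "delete": [], "noop": [], "skip": []}
    unmatched = list(remote)

    def take(pred: Callable[[Ev], bool]) -> Optional[Ev]:
        # Claim the first unmatched remote event that satisfies pred.
        for candidate in unmatched:
            if pred(candidate):
                unmatched.remove(candidate)
                return candidate
        return None

    for want in desired:
        # Pass 1: exact id. Pass 2: canonical key. Pass 3: fuzzy title at the same start.
        match = take(lambda r: want.id is not None and r.id == want.id)
        if match is None:
            match = take(lambda r: canonical(r) == canonical(want))
        if match is None:
            match = take(lambda r: r.start == want.start
                         and similar(r.title, want.title) >= threshold)

        if match is None:
            buckets["create"].append(want)
        elif canonical(match) == canonical(want):
            buckets["noop"].append(want)
        elif (match.id or "").startswith(OWNED_PREFIX):
            buckets["update"].append((match, want))
        else:
            buckets["skip"].append(want)  # never rewrite foreign events

    # Leftover remote events: delete only what the agent owns.
    for leftover in unmatched:
        owned = (leftover.id or "").startswith(OWNED_PREFIX)
        buckets["delete" if owned else "skip"].append(leftover)
    return buckets
```

Because each remote event can be claimed only once and the passes run in a fixed order, the same inputs always produce the same buckets, which is the property that makes duplicate prevention testable.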
LLM Patching
| File | Responsibility |
| --- | --- |
| patching.py | TimeboxPatcher: sends TBPlan + user feedback to Gemini via AssistantAgent. Injects TBPatch JSON schema into system prompt (not output_content_type, which breaks on oneOf). _extract_patch() strips markdown fences. |
Prompt Engineering
| File | Responsibility |
| --- | --- |
| skeleton_draft_system_prompt.j2 | Jinja2 template for skeleton drafting (consumes TOON tables). |
| prompt_rendering.py | render_skeleton_draft_system_prompt(): Jinja renderer. |
| planning_policy.py | Shared Stage 3/4 planning policy text + quality rubric constants. |
| toon_views.py | Timeboxing-specific TOON table views (minimal columns for events, constraints, tasks). |
| prompts.py | Legacy prompt strings (being migrated to stage_gating.py). |
NLU and Constraints
| File | Responsibility |
| --- | --- |
| nlu.py | PlannedDateResult, ConstraintInterpretation: structured LLM outputs for multilingual date/scope inference. No regex/keyword matching. |
| preferences.py | ConstraintStore: SQLite-backed session constraint persistence. |
| constraint_retriever.py | ConstraintRetriever: gap-driven durable constraint fetch from Notion MCP. |
| constraint_search_tool.py | Stage-gating Notion search tool (search_constraints) with strict FunctionTool schema for structured-output compatibility. |
| notion_constraint_extractor.py | TODO(deprecate): dead code. The Notion-MCP extraction path is never reached; the live write path is _upsert_constraints_to_durable_store. Do not import from new code. |
Utilities
| File | Responsibility |
| --- | --- |
| pydantic_parsing.py | Tolerant parsing helpers for LLM outputs and mixed payloads. |
| messages.py | StartTimeboxing, TimeboxingUserReply, TimeboxingCommitDate: typed Slack-to-agent routing messages. |
| actions.py | Slack action/button payload models and helpers (planning cards). |
| state.py | Session persistence helpers. |
| flow.py | Legacy flow logic (being replaced by GraphFlow). |
Subfolders
| Folder | Responsibility |
| --- | --- |
| nodes/ | GraphFlow node agents (TurnInit, Decision, Transition, Stage nodes, Presenter). See nodes/README.md. |
Architecture
Coordinator + Stage Agents
- Coordinator (agent.py): owns Session state, merges facts/constraints, runs background tool work (calendar, Notion), and decides which stage runs next.
- Stage agents (in nodes/): pure functions over typed JSON input returning typed JSON output. No direct tool IO.
- GraphFlow (flow_graph.py): runs the stage machine as a directed graph; transitions are testable and explicit.
Stage Pipeline
Stage 0: Date Confirmation (Slack buttons)
background: calendar prefetch + Notion constraint retrieval (with short await before first Stage 1 render)
Stage 1: CollectConstraints -> StageGateOutput (frame_facts)
Stage 2: CaptureInputs -> StageGateOutput (input_facts)
Stage 3: Skeleton -> pre-generated draft if available, else synchronous draft -> markdown overview (presentation-first) + carry-forward seed `TBPlan` (no Stage 3 patch loop)
Stage 4: Refine -> prompt-guided tool orchestration (`timebox_patch_and_sync` primary, `memory_extract_and_upsert` optional background) -> advisory quality facts (0-4) -> sync to Google Calendar with explicit changed/unchanged reporting
Stage 5: ReviewCommit -> final summary; user corrections route back to Stage 4 Refine in the same turn
Undo action: undo latest sync transaction via session-backed state -> return to Refine
Each stage response also includes deterministic Slack actions (Back/Redo/Cancel, plus Proceed when ready). After a local Refine update is applied, the control row swaps Proceed for "Undo last update" so users can immediately revert without an extra stage-advance click.
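The button rule in the paragraph above can be sketched as a pure function. The names `Stage` and `stage_actions` and the exact button labels are illustrative assumptions; the real logic lives in the coordinator and the Slack action helpers, and may differ.

```python
from enum import Enum
from typing import List


class Stage(Enum):  # hypothetical mirror of the TimeboxingStage enum
    COLLECT_CONSTRAINTS = "CollectConstraints"
    CAPTURE_INPUTS = "CaptureInputs"
    SKELETON = "Skeleton"
    REFINE = "Refine"
    REVIEW_COMMIT = "ReviewCommit"


def stage_actions(stage: Stage, ready: bool, refine_update_applied: bool = False) -> List[str]:
    """Return the control row for a stage response, as described above."""
    actions = ["Back", "Redo", "Cancel"]  # always present
    if stage is Stage.REFINE and refine_update_applied:
        # Assumption: after a local Refine update, Undo takes Proceed's slot.
        actions.append("Undo last update")
    elif ready:
        actions.append("Proceed")
    return actions
```

Keeping this a function of explicit inputs (stage, readiness, and whether a local update was applied) is what lets the button row be asserted in unit tests without a Slack client.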
Session State
Session dataclass lives in agent.py. Core fields:
| Field | Purpose |
| --- | --- |
| thread_ts, channel_id, user_id | Slack anchors |
| frame_facts, input_facts | Accumulated LLM outputs per stage |
| timebox | Legacy Timebox (Stage 3+) |
| tb_plan | Current TBPlan, sync-engine model (prepared by Stage 2 draft and/or Stage 4 preflight) |
| base_snapshot | Remote-baseline snapshot for diff-based sync (prepared in Stage 4 preflight) |
| event_id_map | Dict mapping event key to GCal event ID |
| prefetched_remote_snapshots_by_date | Rich remote baseline snapshots (from list-events) keyed by date |
| remote_event_ids_by_index | Ordered remote event IDs aligned with base_snapshot.resolve_times() |
| pre_generated_skeleton | Background Stage 2 draft consumed by Stage 3 when fresh |
| skeleton_overview_markdown | Stage 3 markdown summary rendered to Slack |
| last_quality_level, last_quality_label, last_quality_next_step | Latest Refine quality snapshot carried into next patch context |
| last_sync_transaction | Session-backed transaction used for deterministic undo |
| last_refine_undo_tb_plan, last_refine_undo_timebox | Session-backed local draft snapshot used by "Undo last update" in Refine |
| active_constraints | Merged constraint state |
| stage | Current TimeboxingStage enum |
| graphflow | Per-session GraphFlow instance |
Model Hierarchy
LLM-facing: TBEvent -> TBPlan -> TBPatch -> apply_tb_ops()
Sync engine: TBPlan -> plan_sync() -> SyncOp[] -> execute_sync()
Calendar MCP: SyncOp -> create-event / update-event / delete-event
Persistence: CalendarEvent (SQLModel) for DB + Slack display
Conversion: timebox_to_tb_plan() / tb_plan_to_timebox()
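To make the "pure-function ops engine" idea concrete, here is a minimal sketch of applying a patch to a plan without mutating the input. `Event`, `Plan`, the three op classes, and `apply_ops` are simplified stand-ins invented for this example; the real TBEvent, TBPlan, TBOp, and apply_tb_ops() carry more fields and more op types.

```python
from dataclasses import dataclass, replace
from typing import Tuple, Union


@dataclass(frozen=True)
class Event:  # hypothetical stand-in for TBEvent
    name: str
    minutes: int


@dataclass(frozen=True)
class Plan:  # hypothetical stand-in for TBPlan
    date: str
    events: Tuple[Event, ...] = ()


@dataclass(frozen=True)
class AddEvent:
    event: Event
    index: int


@dataclass(frozen=True)
class RemoveEvent:
    index: int


@dataclass(frozen=True)
class UpdateEvent:
    index: int
    minutes: int


Op = Union[AddEvent, RemoveEvent, UpdateEvent]


def apply_ops(plan: Plan, ops: Tuple[Op, ...]) -> Plan:
    """Return a new Plan; the input plan is never modified."""
    events = list(plan.events)  # work on a copy
    for op in ops:
        if isinstance(op, AddEvent):
            events.insert(op.index, op.event)
        elif isinstance(op, RemoveEvent):
            del events[op.index]
        elif isinstance(op, UpdateEvent):
            events[op.index] = replace(events[op.index], minutes=op.minutes)
        else:
            raise TypeError(f"unsupported op: {op!r}")
    return replace(plan, events=tuple(events))
```

Because the function has no side effects, the same plan and ops always yield the same result, and the previous plan stays available as an undo snapshot.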
Event Identity
- Agent-created events get deterministic base32hex IDs: fftb + SHA1(date|name|start|index). fftb* prefix = owned, eligible for update/delete.
- No prefix = foreign (user calendar), read-only FixedWindow constraints.
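A minimal sketch of the ID scheme above, assuming the four fields are joined with `|`, encoded as UTF-8, and the full SHA-1 digest is base32hex-encoded in lowercase. The function names and these encoding details are assumptions; the repository may truncate or format the digest differently.

```python
import base64
import hashlib

OWNED_PREFIX = "fftb"

# Map the standard base32 alphabet onto lowercase base32hex (0-9, a-v).
_STD = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"
_HEX = "0123456789abcdefghijklmnopqrstuv"
_TO_B32HEX = str.maketrans(_STD, _HEX)


def make_event_id(date: str, name: str, start: str, index: int) -> str:
    """Hypothetical helper: deterministic ID from date|name|start|index."""
    seed = f"{date}|{name}|{start}|{index}".encode("utf-8")
    digest = hashlib.sha1(seed).digest()  # 20 bytes -> 32 base32 chars, no padding
    encoded = base64.b32encode(digest).decode("ascii").translate(_TO_B32HEX)
    return OWNED_PREFIX + encoded


def is_owned(event_id: str) -> bool:
    """Only prefixed events are eligible for update/delete."""
    return event_id.startswith(OWNED_PREFIX)
```

The same inputs always produce the same ID, so re-submitting an unchanged plan can be matched against existing remote events instead of creating duplicates.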
Patching (Schema-in-Prompt)
output_content_type=TBPatch is intentionally NOT used because OpenAI response_format rejects oneOf from Pydantic discriminated unions and OpenRouter/Gemini hangs with structured output on complex schemas. Instead: inject TBPatch.model_json_schema() into the system prompt and parse the raw JSON text response.
Runtime guard: TimeboxPatcher.apply_patch(...) is Refine-only (stage='Refine') and rejects any non-Refine invocation.
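The schema-in-prompt approach boils down to two steps: put the JSON Schema in the system prompt, then parse the model's raw text reply. The sketch below shows both with the standard library only. `build_system_prompt` and `extract_json_payload` are hypothetical names, and the schema dict stands in for what TBPatch.model_json_schema() would return; the real _extract_patch() may handle more cases.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # markdown code-fence marker


def build_system_prompt(base_prompt: str, schema: Dict[str, Any]) -> str:
    """Append the JSON Schema to the prompt instead of using structured output."""
    return (
        f"{base_prompt}\n\n"
        "Respond with a single JSON object that validates against this JSON Schema:\n"
        f"{json.dumps(schema, indent=2)}"
    )


def extract_json_payload(raw: str) -> Dict[str, Any]:
    """Strip an optional markdown fence, then parse the remaining JSON text."""
    text = raw.strip()
    if text.startswith(FENCE):
        lines = text.splitlines()[1:]  # drop the opening fence (and language tag)
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]  # drop the closing fence
        text = "\n".join(lines)
    return json.loads(text)
```

In the real flow the parsed dict would then be validated against the TBPatch model, so schema violations surface as validation errors rather than as a hung structured-output call.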
TOON Prompt Injection
List-shaped data (constraints, tasks, immovables, events) uses TOON tabular format, not JSON arrays. Encoder: src/fateforger/llm/toon.py.
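For readers unfamiliar with the tabular layout, the sketch below encodes a list of uniform dicts as one header line plus one comma-separated row per item, loosely following the public TOON tabular convention. It is not the encoder in src/fateforger/llm/toon.py: `to_toon_table`, `_cell`, and the simplified quoting rule are assumptions, and the repository's output may differ in detail.

```python
import json
from typing import Any, Dict, List, Sequence


def _cell(value: Any) -> str:
    """Render one cell; quote strings that would be ambiguous in a row."""
    if value is None:
        return "null"
    if isinstance(value, bool):
        return "true" if value else "false"
    text = str(value)
    needs_quotes = (
        text == ""
        or text != text.strip()
        or any(ch in text for ch in ',:"\n')
    )
    return json.dumps(text) if needs_quotes else text


def to_toon_table(name: str, rows: List[Dict[str, Any]], columns: Sequence[str]) -> str:
    """Encode rows as `name[N]{col1,col2}:` followed by indented CSV-like rows."""
    fields = ",".join(columns)
    header = f"{name}[{len(rows)}]{{{fields}}}:"
    body = ["  " + ",".join(_cell(row.get(col)) for col in columns) for row in rows]
    return "\n".join([header, *body])
```

Compared with a JSON array, the column names appear once in the header rather than once per item, which is the main reason a tabular form is attractive for prompt injection of list-shaped data.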
| File | Role |
| --- | --- |
| src/fateforger/slack_bot/handlers.py | Routes Slack events to coordinator |
| src/fateforger/slack_bot/timeboxing_commit.py | Stage 0 Slack UI (day picker + confirm button) |
| src/fateforger/llm/toon.py | TOON tabular encoder |
| TICKET_SYNC_ENGINE.md | Implementation ticket (repo root) |
| notebooks/phase5_integration_test.ipynb | Live MCP + LLM integration tests |
How to Run Tests
# Sync engine suite (115 tests)
poetry run pytest tests/unit/test_tb_models.py tests/unit/test_tb_ops.py \
tests/unit/test_sync_engine.py tests/unit/test_phase4_rewiring.py \
tests/unit/test_patching.py -v
# GraphFlow state machine
poetry run pytest tests/unit/test_timeboxing_graphflow_state_machine.py -v
# All timeboxing-related tests
poetry run pytest tests/unit/ -k timeboxing -v