Skip to content

Timeboxing Agent

Stage-gated timeboxing workflow that builds daily schedules via conversational refinement and syncs to Google Calendar.

Status

Subsystem Status Tests Confirmed
Domain models (tb_models, tb_ops) Implemented, Tested 62 unit 2025-07-22
Sync engine (sync_engine, submitter, calendar_reconciliation) Implemented, Tested 37 + 10 unit 2026-02-14 (duplicate-prevention reconciliation)
Patching (schema-in-prompt) Implemented, Tested 14 unit 2025-07-22 (live LLM)
GraphFlow orchestration Implemented, Documented graphflow state machine tests
Skeleton pre-generation (AC1) Implemented, Tested test_timeboxing_skeleton_pre_generation.py
Calendar sync + undo controls Implemented, Tested (submit-time baseline refresh + deterministic reconciliation summary in Stage 5) test_timeboxing_submit_flow.py, test_slack_timebox_buttons.py 2026-03-07
Stage 1 constraint-template coverage UX Implemented, Tested (collect-stage calendar anchor merge + anchor summary line) test_timeboxing_stage_message_template_coverage.py, test_timeboxing_calendar_prefetch_feedback.py, test_timeboxing_durable_constraints.py 2026-03-07
Durable profile/date-span constraint auto-upsert + Stage 1 prefetch wait Implemented, Tested test_timeboxing_durable_constraints.py, test_timeboxing_constraint_memory_client_tool_name.py
Constraint-memory MCP payload decoding hardening Implemented, Tested test_timeboxing_constraint_memory_client_tool_name.py
Stage 1 lookup-first defaults + session override suppression Implemented, Tested test_timeboxing_durable_constraints.py, test_timeboxing_stage_gate_json_context.py
Stage 3 markdown-first skeleton overview Implemented, Tested test_timeboxing_skeleton_draft_contract.py
Stage 4 advisory quality facts (0-4) Implemented, Tested test_phase4_rewiring.py
Deterministic stage action buttons Implemented, Tested test_timeboxing_stage_actions.py, test_slack_timebox_stage_buttons.py
Structured-output strict tool contract Implemented, Tested test_timeboxing_constraint_search_tool_strict.py, test_timeboxing_flow.py

File Index

Orchestration

File Responsibility
agent.py PlanningCoordinator: owns Session, routes Slack messages, runs background tasks, manages stage transitions. Entry points: on_start(), on_commit_date(), on_user_reply().
flow_graph.py build_timeboxing_graphflow(): constructs the AutoGen GraphFlow DAG. Single source of truth for stage transitions and edge conditions.
stage_gating.py TimeboxingStage enum, StageGateOutput model, LLM prompt templates for each stage gate.
contracts.py Typed stage-context contracts (SkeletonContext, ConstraintContext, etc.): what each stage receives as input.
constants.py Orchestration timeouts, limits, and fallback values. No magic numbers.

Domain Models (LLM-Facing)

File Responsibility
tb_models.py ET (event type enum), TBEvent, TBPlan, Timing union (AfterPrev, BeforeNext, FixedStart, FixedWindow), _ET_COLOR_MAP. Calendar-native, sync-friendly.
tb_ops.py TBPatch, TBOp union (AddEvents, RemoveEvent, UpdateEvent, MoveEvent, ReplaceAll), apply_tb_ops(). Pure-function ops engine: deterministic plan mutation.
timebox.py Legacy Timebox schema + schedule_and_validate(). Conversion: timebox_to_tb_plan(), tb_plan_to_timebox(). Kept for backward compat with Stage 3 drafting and Slack display.

Calendar Sync

File Responsibility
sync_engine.py plan_sync(), execute_sync(), undo_sync(), gcal_response_to_tb_plan(). Deterministic, incremental, reversible diff-and-apply via MCP. Uses reconciliation-first matching and DeepDiff only for matched update decisions.
calendar_reconciliation.py Deterministic desired-vs-remote matching (id -> canonical -> fuzzy) and op-bucket planning (create/update/delete/noop/skip).
submitter.py CalendarSubmitter: high-level submit_plan(), undo_last(), and undo_transaction() over the sync engine.
mcp_clients.py McpCalendarClient (list/create/update/delete events via MCP), McpConstraintMemoryClient (Notion constraint MCP). Internal to coordinator.

LLM Patching

File Responsibility
patching.py TimeboxPatcher: sends TBPlan + user feedback to Gemini via AssistantAgent. Injects TBPatch JSON schema into system prompt (not output_content_type, which breaks on oneOf). _extract_patch() strips markdown fences.

Prompt Engineering

File Responsibility
skeleton_draft_system_prompt.j2 Jinja2 template for skeleton drafting (consumes TOON tables).
prompt_rendering.py render_skeleton_draft_system_prompt(): Jinja renderer.
planning_policy.py Shared Stage 3/4 planning policy text + quality rubric constants.
toon_views.py Timeboxing-specific TOON table views (minimal columns for events, constraints, tasks).
prompts.py Legacy prompt strings (being migrated to stage_gating.py).

NLU and Constraints

File Responsibility
nlu.py PlannedDateResult, ConstraintInterpretation: structured LLM outputs for multilingual date/scope inference. No regex/keyword matching.
preferences.py ConstraintStore: SQLite-backed session constraint persistence.
constraint_retriever.py ConstraintRetriever: gap-driven durable constraint fetch from Notion MCP.
constraint_search_tool.py Stage-gating Notion search tool (search_constraints) with strict FunctionTool schema for structured-output compatibility.
notion_constraint_extractor.py TODO(deprecate) — dead code. The Notion-MCP extraction path is never reached; the live write path is _upsert_constraints_to_durable_store. Do not import from new code.

Utilities

File Responsibility
pydantic_parsing.py Tolerant parsing helpers for LLM outputs and mixed payloads.
messages.py StartTimeboxing, TimeboxingUserReply, TimeboxingCommitDate: typed Slack-to-agent routing messages.
actions.py Slack action/button payload models and helpers (planning cards).
state.py Session persistence helpers.
flow.py Legacy flow logic (being replaced by GraphFlow).

Subfolders

Folder Responsibility
nodes/ GraphFlow node agents (TurnInit, Decision, Transition, Stage nodes, Presenter). See nodes/README.md.

Architecture

Coordinator + Stage Agents

  • Coordinator (agent.py): owns Session state, merges facts/constraints, runs background tool work (calendar, Notion), and decides which stage runs next.
  • Stage agents (in nodes/): pure functions over typed JSON input returning typed JSON output. No direct tool IO.
  • GraphFlow (flow_graph.py): runs the stage machine as a directed graph; transitions are testable and explicit.

Stage Pipeline

Stage 0: Date Confirmation (Slack buttons)
    background: calendar prefetch + Notion constraint retrieval (with short await before first Stage 1 render)
Stage 1: CollectConstraints -> StageGateOutput (frame_facts)
Stage 2: CaptureInputs -> StageGateOutput (input_facts)
Stage 3: Skeleton -> pre-generated draft if available, else synchronous draft -> markdown overview (presentation-first) + carry-forward seed `TBPlan` (no Stage 3 patch loop)
Stage 4: Refine -> prompt-guided tool orchestration (`timebox_patch_and_sync` primary, `memory_extract_and_upsert` optional background) -> advisory quality facts (0-4) -> sync to Google Calendar with explicit changed/unchanged reporting
Stage 5: ReviewCommit -> final summary; user corrections route back to Stage 4 Refine in the same turn
Undo action: undo latest sync transaction via session-backed state -> return to Refine
Each stage response also includes deterministic Slack actions (Back/Redo/Cancel, plus Proceed when ready). After a local Refine update is applied, the control row swaps Proceed for "Undo last update" so users can immediately revert without an extra stage-advance click.

Session State

Session dataclass lives in agent.py. Core fields:

Field Purpose
thread_ts, channel_id, user_id Slack anchors
frame_facts, input_facts Accumulated LLM outputs per stage
timebox Legacy Timebox (Stage 3+)
tb_plan Current TBPlan, sync-engine model (prepared by Stage 2 draft and/or Stage 4 preflight)
base_snapshot Remote-baseline snapshot for diff-based sync (prepared in Stage 4 preflight)
event_id_map Dict mapping event key to GCal event ID
prefetched_remote_snapshots_by_date Rich remote baseline snapshots (from list-events) keyed by date
remote_event_ids_by_index Ordered remote event IDs aligned with base_snapshot.resolve_times()
pre_generated_skeleton Background Stage 2 draft consumed by Stage 3 when fresh
skeleton_overview_markdown Stage 3 markdown summary rendered to Slack
last_quality_level, last_quality_label, last_quality_next_step Latest Refine quality snapshot carried into next patch context
last_sync_transaction Session-backed transaction used for deterministic undo
last_refine_undo_tb_plan, last_refine_undo_timebox Session-backed local draft snapshot used by "Undo last update" in Refine
active_constraints Merged constraint state
stage Current TimeboxingStage enum
graphflow Per-session GraphFlow instance

Model Hierarchy

LLM-facing:      TBEvent -> TBPlan -> TBPatch -> apply_tb_ops()
Sync engine:     TBPlan -> plan_sync() -> SyncOp[] -> execute_sync()
Calendar MCP:    SyncOp -> create-event / update-event / delete-event
Persistence:     CalendarEvent (SQLModel) for DB + Slack display
Conversion:      timebox_to_tb_plan() / tb_plan_to_timebox()

Event Identity

  • Agent-created events get deterministic base32hex IDs: fftb + SHA1(date|name|start|index).
  • fftb* prefix = owned, eligible for update/delete.
  • No prefix = foreign (user calendar), read-only FixedWindow constraints.

Patching (Schema-in-Prompt)

output_content_type=TBPatch is intentionally NOT used because OpenAI response_format rejects oneOf from Pydantic discriminated unions and OpenRouter/Gemini hangs with structured output on complex schemas. Instead: inject TBPatch.model_json_schema() into the system prompt and parse the raw JSON text response.

Runtime guard: TimeboxPatcher.apply_patch(...) is Refine-only (stage='Refine') and rejects any non-Refine invocation.

TOON Prompt Injection

List-shaped data (constraints, tasks, immovables, events) uses TOON tabular format, not JSON arrays. Encoder: src/fateforger/llm/toon.py.

File Role
src/fateforger/slack_bot/handlers.py Routes Slack events to coordinator
src/fateforger/slack_bot/timeboxing_commit.py Stage 0 Slack UI (day picker + confirm button)
src/fateforger/llm/toon.py TOON tabular encoder
TICKET_SYNC_ENGINE.md Implementation ticket (repo root)
notebooks/phase5_integration_test.ipynb Live MCP + LLM integration tests

How to Run Tests

# Sync engine suite (115 tests)
poetry run pytest tests/unit/test_tb_models.py tests/unit/test_tb_ops.py \
  tests/unit/test_sync_engine.py tests/unit/test_phase4_rewiring.py \
  tests/unit/test_patching.py -v

# GraphFlow state machine
poetry run pytest tests/unit/test_timeboxing_graphflow_state_machine.py -v

# All timeboxing-related tests
poetry run pytest tests/unit/ -k timeboxing -v