
Silent Observer UX Spec — Speak Message Design

Status: Draft v1.0
Date: 2026-04-30
Author: Daedalus (Frontend/UX Architecture)
Owner: Wadsworth (Chief of Staff)
Scope: Phase 8.3 — Redline Rules UX

Executive Summary

This document defines the UX constraints for Silent Observer "Speak" messages. The Director has mandated radical brevity: every message must be readable in a single glance on a mobile lock screen.

Design Philosophy:
- Less is more. Every word earns its place.
- The message should be scannable in <2 seconds.
- Emoji signals urgency, not decoration.

1. Brevity Constraint System

Hard Limits

| Constraint | Value | Rationale |
|------------|-------|-----------|
| Maximum words | 12 | Lock screen preview limit |
| Maximum clauses | 2 (ideally 1) | Cognitive load |
| Emoji usage | Urgent only | Reserve ⚠️ 🚨 for conflicts requiring action |
| No prefix labels | No "Clarification:", "Question:" | Wastes words |
| No trailing periods | Fragments acceptable | Saves characters |

Template Library

Rule 1: Temporal Conflict

[Who]'s [Activity] is [Time] — you said [OtherTime]. Clarify?

Example:

Leo's soccer is 4:30pm — you said 3pm. New schedule?

Variants:
| Variant | Message |
|---------|---------|
| Different day | Leo's soccer Thursday — you said Friday. Which day? |
| Different time | Harper's ballet 4pm — you said 3pm. Clarify? |
| Different activity | Leo has soccer — you said chess. Which activity? |

Rule 2: Resource Conflict

[Who] and [Who] both claim [Task]. Who's covering?

Example:

John and Sarah both claim Tuesday pickup. Who's covering?

Variants:
| Variant | Message |
|---------|---------|
| Single parent claimed twice | John said he's covering Tuesday twice. Confirm? |
| Unclear who covers | Tuesday pickup claimed by both. Who's getting kids? |
| Task swap | Sarah asked John to swap — John agreed. Confirmed? |

Rule 3: Missing Critical Variable

[Who] [Activity] missing [Variable]. Fill in?

Example:

Leo soccer Thursday missing pickup. Who's covering?

Variants:
| Variant | Message |
|---------|---------|
| Missing time | Leo soccer Thursday missing time. What time? |
| Missing location | Harper pickup Thursday missing where. Location? |
| Missing child | Thursday 3pm pickup — which kid? |
| Missing date | Leo soccer pickup — which day? |

Formatting Rules

Telegram Markdown:
- Bold conflicting information (times, dates, names that don't match)
- Plain text for context
- No italics (renders inconsistently on mobile)

Example:

Leo's soccer is **4:30pm** — you said **3pm**. New schedule?

2. Notification Design Spec

Mobile Lock Screen Preview

| Platform | Behavior |
|----------|----------|
| iOS | Full message shown in preview (≤12 words fits) |
| Android | No truncation at 12 words |
| Telegram | Inline display, bold rendering supported |

Design Goal: Recipient reads entire message without unlocking phone.

Quiet Hours

SPEAK_QUIET_HOURS = {
    "start": "22:00",    # Quiet hours begin: no messages after 10 PM
    "end": "07:00",      # Quiet hours end: no messages before 7 AM
    "timezone": "America/Chicago"  # Matt/Aundrea timezone
}

Behavior:
- Conflict detected during quiet hours → Queue until 7 AM
- Urgent flag can override (manual parent trigger only)
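The queueing behavior above could be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `next_send_time` is hypothetical, `now` is assumed to be a timezone-aware datetime already expressed in the family's local zone, and 10 PM sharp is assumed to count as quiet.

```python
from datetime import datetime, time, timedelta, timezone

ACTIVE_FROM = time(7, 0)    # sending allowed from 7 AM local
ACTIVE_UNTIL = time(22, 0)  # assumption: 10 PM sharp already counts as quiet


def next_send_time(now: datetime, urgent: bool = False) -> datetime:
    """Return when a Speak message detected at `now` may be sent.

    `now` must be timezone-aware and expressed in the family's local zone.
    """
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    if urgent:
        return now  # manual parent override bypasses quiet hours
    seven_am = now.replace(hour=ACTIVE_FROM.hour, minute=ACTIVE_FROM.minute,
                           second=0, microsecond=0)
    if now.time() < ACTIVE_FROM:
        return seven_am                      # early morning: wait for 7 AM today
    if now.time() >= ACTIVE_UNTIL:
        return seven_am + timedelta(days=1)  # late evening: wait for 7 AM tomorrow
    return now                               # inside the allowed window


# A fixed UTC-5 offset keeps the example deterministic.
local = timezone(timedelta(hours=-5))
late = datetime(2026, 4, 30, 23, 15, tzinfo=local)
print(next_send_time(late))
```

In production the caller would more likely pass `datetime.now(ZoneInfo("America/Chicago"))` from the standard `zoneinfo` module, so daylight saving changes are handled by the zone database rather than a fixed offset.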

Frequency Cap

| Limit | Value | Behavior |
|-------|-------|----------|
| Soft cap | 2 Speak messages/day | Silently drop 3rd+ unless urgent |
| Reset | Midnight local time | Counter resets at 00:00 America/Chicago |
| Override | Manual parent flag | "Speak anyway" for true emergencies |

Rationale: Prevents notification fatigue. Better to miss a low-priority conflict than train users to ignore the bot.
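One way the cap could be tracked is a small per-day counter. The sketch below is illustrative only: `SpeakBudget` and `DAILY_LIMIT` are hypothetical names, `now` is assumed to be local time, and urgent overrides are assumed not to consume the daily allowance (the spec does not say either way).

```python
from dataclasses import dataclass
from datetime import date, datetime
from typing import Optional

DAILY_LIMIT = 2  # soft cap from the table above


@dataclass
class SpeakBudget:
    """Tracks how many Speak messages were sent on the current local day."""
    day: Optional[date] = None
    sent: int = 0

    def allow(self, now: datetime, urgent: bool = False) -> bool:
        """Return True if a message may be sent at local time `now`."""
        if now.date() != self.day:   # first call after local midnight
            self.day, self.sent = now.date(), 0
        if urgent:
            return True              # assumption: overrides do not use up the cap
        if self.sent >= DAILY_LIMIT:
            return False             # silently drop the 3rd+ message
        self.sent += 1
        return True


budget = SpeakBudget()
noon = datetime(2026, 4, 30, 12, 0)
results = [budget.allow(noon) for _ in range(3)]
print(results)  # third request of the day is dropped
```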

3. UX Acceptance Gate

Success Metric

"Aundrea goes 7 consecutive days without asking 'why did the bot just say that?'"

This is the only metric that matters. All technical metrics (precision, recall, latency) serve this single goal.

Testing Protocol

Phase 1: Silent Observation (Week 1)

Mode: SHADOW
- Tripwire runs normally
- Tier 2 extraction runs
- Redline rules execute
- DECISIONS LOGGED BUT NOT SENT
- Store "would have sent" messages in `shadow_speak` table

Phase 2: Daily Review

Daily 8 PM Central (America/Chicago): Wadsworth reviews `shadow_speak`
- How many messages would have been sent?
- Are they all justified?
- Would any have confused Aundrea?

Phase 3: Live Gate

After 7 days of clean shadow logs:
- Enable live Speak mode
- Continue logging all decisions
- Daily review continues for 2 weeks

Failure Mode

| Step | Action |
|------|--------|
| Aundrea questions a Speak message | Immediately pause live mode |
| Return to shadow mode for 3 days | Debug and adjust templates |
| Re-attempt live mode | After 3 failures, escalate to Matt |
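The gate and its failure handling can be read as a small state machine. The sketch below is one possible reading, not the project's code: `SpeakGate` and its fields are hypothetical, a day that is not clean is assumed to reset the streak, and the 3 shadow days after a failure are assumed to need clean reviews as well.

```python
from dataclasses import dataclass

CLEAN_DAYS_FOR_LIVE = 7     # shadow days required before the first go-live
SHADOW_DAYS_AFTER_FAIL = 3  # shadow days required after a questioned message
MAX_FAILURES = 3            # failures before escalating to Matt


@dataclass
class SpeakGate:
    mode: str = "SHADOW"    # "SHADOW", "LIVE", or "ESCALATED"
    clean_days: int = 0
    days_required: int = CLEAN_DAYS_FOR_LIVE
    failures: int = 0

    def end_of_day(self, clean: bool) -> None:
        """Record the outcome of one daily shadow review."""
        if self.mode != "SHADOW":
            return
        self.clean_days = self.clean_days + 1 if clean else 0
        if self.clean_days >= self.days_required:
            self.mode = "LIVE"

    def message_questioned(self) -> None:
        """Aundrea questioned a live Speak message."""
        if self.mode != "LIVE":
            return
        self.failures += 1
        self.clean_days = 0
        if self.failures >= MAX_FAILURES:
            self.mode = "ESCALATED"  # hand off to Matt
        else:
            self.mode = "SHADOW"     # pause live mode, debug templates
            self.days_required = SHADOW_DAYS_AFTER_FAIL


gate = SpeakGate()
for _ in range(7):
    gate.end_of_day(clean=True)
print(gate.mode)  # live after 7 clean shadow days
```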

Log Schema

CREATE TABLE shadow_speak (
    id INTEGER PRIMARY KEY,
    message_id TEXT NOT NULL,
    would_send TEXT NOT NULL,        -- The Speak message that would be sent
    redline_rule TEXT NOT NULL,      -- Which rule triggered
    confidence REAL NOT NULL,
    created_at TIMESTAMP,
    reviewed_at TIMESTAMP,           -- When Wadsworth reviewed
    approved BOOLEAN,                -- Would this have been okay?
    notes TEXT                       -- Human notes on quality
);
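As an illustration of how shadow mode might write to and review this table, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes SQLite as the store (the spec does not name a database). The helpers `log_shadow` and `review`, the message ID `msg-001`, and the confidence value are all made up for the example.

```python
import sqlite3

SCHEMA = """
CREATE TABLE shadow_speak (
    id INTEGER PRIMARY KEY,
    message_id TEXT NOT NULL,
    would_send TEXT NOT NULL,
    redline_rule TEXT NOT NULL,
    confidence REAL NOT NULL,
    created_at TIMESTAMP,
    reviewed_at TIMESTAMP,
    approved BOOLEAN,
    notes TEXT
);
"""


def log_shadow(conn, message_id, would_send, rule, confidence):
    """Record a message that would have been sent; return the new row id."""
    cur = conn.execute(
        "INSERT INTO shadow_speak "
        "(message_id, would_send, redline_rule, confidence, created_at) "
        "VALUES (?, ?, ?, ?, CURRENT_TIMESTAMP)",
        (message_id, would_send, rule, confidence),
    )
    conn.commit()
    return cur.lastrowid


def review(conn, row_id, approved, notes=""):
    """Store the daily review verdict for one logged message."""
    conn.execute(
        "UPDATE shadow_speak SET reviewed_at = CURRENT_TIMESTAMP, "
        "approved = ?, notes = ? WHERE id = ?",
        (int(approved), notes, row_id),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript(SCHEMA)
row_id = log_shadow(
    conn, "msg-001",  # sample values, not real data
    "Leo's soccer is **4:30pm** \u2014 you said **3pm**. New schedule?",
    "temporal_conflict", 0.92,
)
review(conn, row_id, approved=True, notes="Clear and justified")
pending = conn.execute(
    "SELECT COUNT(*) FROM shadow_speak WHERE reviewed_at IS NULL"
).fetchone()[0]
print(pending)  # rows still waiting for the 8 PM review
```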

4. Test Message Library

9 Speak Message Examples

Time Conflict Variants

TC-1: Wrong time (standard)

Trigger: "I'll pick up Leo at 3pm for soccer"
Brain: Leo's soccer is 4:30pm
Speak: Leo's soccer is **4:30pm** — you said **3pm**. New schedule?
Word count: 9

TC-2: Wrong day

Trigger: "Leo has soccer Friday"
Brain: Leo's soccer is Tuesday/Thursday
Speak: Leo's soccer is **Tuesday/Thursday** — you said **Friday**. Which day?
Word count: 10

Missing Variable Variants

MV-1: Missing pickup person

Trigger: "Leo needs pickup Thursday after early dismissal"
Brain: No assigned_to found, activity=unspecified
Speak: Thursday pickup missing who's covering. You or Aundrea?
Word count: 8

MV-2: Missing activity

Trigger: "I'll get Leo at 3pm tomorrow"
Brain: 3pm has no scheduled activity
Speak: Leo pickup **3pm tomorrow** — what's the activity?
Word count: 8

Resource Conflict Variants

RC-1: Double-booked parent (different messages)

Trigger: John: "I'll cover Tuesday" + Sarah: "I'll cover Tuesday"
Brain: Both assigned to same task
Speak: John and Sarah both claim **Tuesday pickup**. Who's covering?
Word count: 9

RC-2: Double-booked parent (same message thread)

Trigger: John: "I'll get kids" / Sarah: "No I'll get them"
Brain: Conflicting assignments in thread
Speak: John and Sarah both claim **kids pickup**. Who's covering?
Word count: 9

False Positive (Should Stay Silent)

FP-1: General chat about past event

Trigger: "Remember when Leo's soccer game got rained out?"
Tripwire: Catches "Leo", "soccer"
Redline: Past tense detected ("remember", "rained out")
Decision: SILENT
Reason: Referencing past event, not coordination

Ambiguous (Needs Human Clarification)

AM-1: Unclear who "we" refers to

Trigger: "We'll pick up Leo after practice"
Brain: "we" could be John+Sarah, or John+friend, or Sarah+parent
Redline: Missing critical variable (who is "we")
Speak: "**We**" picking up Leo — who does that include?
Word count: 9

Multi-Message Context (Reply to Thread)

MM-1: Clarification requested, reply comes in

Original: Speak message sent: "Leo's soccer **4:30pm** — you said **3pm**. New schedule?"
Reply: "Oops, dentist at 3pm then soccer"
Trigger: New information resolves conflict
Speak: Got it — **dentist 3pm**, then **soccer 4:30pm**. Locked in?
Word count: 9
Note: Reply to original Speak thread, not new message

5. Implementation Notes

Code Constants

SPEAK_CONSTRAINTS = {
    "max_words": 12,
    "max_clauses": 2,
    "emoji_urgent_only": True,
    "allowed_emoji": ["⚠️", "🚨"],  # Conflict/urgent only
    "quiet_hours_start": "22:00",  # No messages after 10 PM
    "quiet_hours_end": "07:00",    # No messages before 7 AM
    "daily_limit": 2,
    "timezone": "America/Chicago"
}

SPEAK_TEMPLATES = {
    "temporal_conflict": "{who}'s {activity} is **{correct}** — you said **{incorrect}**. {action}?",
    "resource_conflict": "{who1} and {who2} both claim **{task}**. Who's covering?",
    "missing_variable": "{who} {activity} missing **{variable}**. {prompt}?",
}
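Filling a template is a plain `str.format` call. The sketch below repeats `SPEAK_TEMPLATES` so it runs on its own; the helper name `render_speak` is hypothetical, and `\u2014` is the escaped form of the em-dash used in the templates.

```python
SPEAK_TEMPLATES = {
    "temporal_conflict": "{who}'s {activity} is **{correct}** \u2014 you said **{incorrect}**. {action}?",
    "resource_conflict": "{who1} and {who2} both claim **{task}**. Who's covering?",
    "missing_variable": "{who} {activity} missing **{variable}**. {prompt}?",
}


def render_speak(rule: str, **fields: str) -> str:
    """Fill the template for `rule`.

    Raises KeyError if the rule is unknown or a placeholder is missing.
    """
    return SPEAK_TEMPLATES[rule].format(**fields)


msg = render_speak(
    "temporal_conflict",
    who="Leo", activity="soccer",
    correct="4:30pm", incorrect="3pm", action="New schedule",
)
print(msg)
```

With these inputs the result matches TC-1 in the test message library, which makes the test library a convenient set of fixtures for the templates.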

Word Count Validation

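import re

MAX_WORDS = 12
MAX_CLAUSES = 2


def validate_speak_message(message: str) -> tuple[bool, str]:
    """Check a Speak message against the word and clause limits."""
    words = message.split()  # whitespace-separated tokens
    if len(words) > MAX_WORDS:
        return False, f"Message too long: {len(words)} words (max {MAX_WORDS})"

    # Rough clause count: number of commas, em-dashes, and periods.
    # The closing "?" or "!" is not counted, so the trailing prompt
    # ("New schedule?") does not use up a clause.
    clauses = len(re.split(r'[,—.]', message)) - 1
    if clauses > MAX_CLAUSES:
        return False, f"Too many clauses: {clauses} (max {MAX_CLAUSES})"

    return True, "OK"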
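The validator above covers only the word and clause limits. The remaining hard limits (emoji, prefix labels, trailing periods) could be checked along these lines. This is a rough sketch: `check_style` is a hypothetical helper, the emoji pattern covers only common pictograph ranges rather than every emoji, and the prefix-label pattern simply looks for a leading word followed by a colon.

```python
import re

# Base code points of the two allowed emoji (U+26A0 warning sign, U+1F6A8 siren).
ALLOWED_EMOJI = {"\u26a0", "\U0001f6a8"}
# Rough detector: common symbol and pictograph ranges only.
EMOJI_RE = re.compile("[\u2600-\u27bf\U0001f300-\U0001faff]")
# A leading label such as "Clarification: " or "Question: ".
PREFIX_LABEL_RE = re.compile(r"^\s*[A-Za-z]+:\s")


def check_style(message: str, urgent: bool = False) -> list[str]:
    """Return the style problems found in `message` (empty list if none)."""
    problems = []
    if PREFIX_LABEL_RE.match(message):
        problems.append("prefix label")
    if message.rstrip().endswith("."):
        problems.append("trailing period")
    found = EMOJI_RE.findall(message)
    if found and not urgent:
        problems.append("emoji in non-urgent message")
    if any(ch not in ALLOWED_EMOJI for ch in found):
        problems.append("emoji outside allowed set")
    return problems


print(check_style("Clarification: Leo's soccer moved."))
```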

6. File References

| File | Purpose |
|------|---------|
| `shared/project-docs/silent-observer-integration-spec.md` | Full Phase 8 architecture |
| `shared/project-docs/icarus-phase-8-silent-observer.md` | Phase 8 high-level vision |
| `shared/project-docs/TEST_FAMILY_CONTEXT.md` | Miller family test data |

The best Speak message is the one that didn't need to be read twice.
