INCIDENT_SYSTEM_DESIGN.md (Apr 23, 2026)

Incident Capture System — Design Document

Problem Statement

The V2 content pipeline produces fiction when fed hallucinated briefs. Real incident data produces authentic, grounded content. We need a system to capture real struggles as they happen for later content generation.

Architecture

Incident Happens → Log to incidents/ → Convert to brief → Feed to pipeline
       ↓                                                        ↓
   CLI tool                                              Generate content
       ↓                                                        ↓
   Matt types `log-incident`                         Human reviews, publishes

File Structure

~/.openclaw/workspace-socrates/incidents/
├── 2026-04-23-cloudflare-error-1033.md      # Real incident
├── 2026-04-22-magic-wand-auth-failure.md    # Another real incident
└── 2026-04-20-radicale-migration.md          # Migration struggle

Format: YAML frontmatter + Markdown body
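
The filenames in the tree above follow a date-plus-slug pattern. A minimal sketch of how such a name could be derived is shown below; `incident_filename` is a hypothetical helper, not part of the documented API, and the slug rules are inferred from the three example filenames only.

```python
import re
from datetime import date


def incident_filename(day: date, title: str) -> str:
    """Build a 'YYYY-MM-DD-slug.md' filename (hypothetical helper)."""
    # Lowercase the title, then collapse each run of non-alphanumeric
    # characters into a single hyphen and trim hyphens from both ends.
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{day.isoformat()}-{slug}.md"


print(incident_filename(date(2026, 4, 23), "Cloudflare Error 1033"))
# prints: 2026-04-23-cloudflare-error-1033.md
```

Keeping the ISO date first means a plain alphabetical directory listing is also chronological.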

Incident File Format

---
title: "Cloudflare Error 1033 on notes.hoffdesk.com"
date: "2026-04-23T01:55:33Z"
systems:
  - cloudflared
  - uvicorn
  - systemd
  - Cloudflare Tunnel
error: "Error 1033: Cloudflare can't reach origin"
resolved: true
tags:
  - cloudflare
  - systemd
  - tunnel
  - infrastructure
cost:
  time_hours: 2.5
  sleep_lost: true
related:
  - https://notes.hoffdesk.com/admin
---

# Cloudflare Error 1033 on notes.hoffdesk.com

**Systems:** cloudflared, uvicorn, systemd, Cloudflare Tunnel
**Error:** `Error 1033: Cloudflare can't reach origin`

## Attempts

### Attempt 1: Checked uvicorn health with curl

Result: Running but bound to 127.0.0.1 — Cloudflared on same machine could not reach it

### Attempt 2: Added systemd drop-in to bind uvicorn to 0.0.0.0:8000

Result: Local health passed, still Error 1033 — cloudflared in managed mode

### Attempt 3: Checked cloudflared service file for --config path

Result: Found correct command BUT override.conf with old syntax was winning

## The Fix

Removed stale override.conf, reloaded systemd, restarted cloudflared with local config.
notes.hoffdesk.com came back. Caveat: still hybrid local+managed, needs full migration.

## Reflection

Should have run systemctl cat first. I knew about drop-ins but forgot under pressure.
Error 1033 is generic — you peel layers like an onion.

## Cost

- Time: 2.5 hours
- Sleep: Lost
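
A file in the format above can be separated into its metadata and its narrative using only the standard library. The sketch below is an assumption about how a reader might work, not the library's actual loader; `split_frontmatter` is a hypothetical name. It returns the raw YAML text, which could then be handed to a YAML parser such as PyYAML's `yaml.safe_load`.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Return (frontmatter, body) for text that starts with a '---' block."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return "", text  # no frontmatter present, everything is body
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            frontmatter = "\n".join(lines[1:i])
            # Drop the blank lines that separate the block from the body.
            body = "\n".join(lines[i + 1:]).lstrip("\n")
            return frontmatter, body
    raise ValueError("frontmatter opened with '---' but never closed")


sample = """---
title: "Cloudflare Error 1033 on notes.hoffdesk.com"
resolved: true
---

# Cloudflare Error 1033 on notes.hoffdesk.com
"""
meta, body = split_frontmatter(sample)
print(meta)
print(body)
```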

API

Python

from incidents import incident, list_incidents, to_brief

# Capture incident
filepath = incident(
    title="Cloudflare Error 1033",
    systems=["cloudflared", "uvicorn"],
    error="Error 1033",
    attempts=[
        {"what": "Checked uvicorn", "result": "Bound to 127.0.0.1"},
        {"what": "Added drop-in", "result": "Still Error 1033"},
    ],
    fix="Removed override.conf",
    reflection="Should have run systemctl cat first",
    cost={"time_hours": 2.5, "sleep_lost": True}
)

# List incidents
for inc in list_incidents():
    print(f"{inc['date']}: {inc['title']}")

# Convert to brief for pipeline
brief = to_brief("2026-04-23-cloudflare-error-1033")
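
The `attempts` argument lines up with the "## Attempts" section of the incident file format. As a rough sketch of that mapping (the real rendering code is not shown in this document, and `render_attempts` is a hypothetical name), each dict could become one numbered subsection:

```python
def render_attempts(attempts: list[dict[str, str]]) -> str:
    """Render attempt dicts as a Markdown '## Attempts' section (sketch)."""
    parts = ["## Attempts", ""]
    for number, attempt in enumerate(attempts, start=1):
        parts += [
            f"### Attempt {number}: {attempt['what']}",
            "",
            f"Result: {attempt['result']}",
            "",
        ]
    # Join with newlines and end with exactly one trailing newline.
    return "\n".join(parts).rstrip() + "\n"


section = render_attempts([
    {"what": "Checked uvicorn", "result": "Bound to 127.0.0.1"},
    {"what": "Added drop-in", "result": "Still Error 1033"},
])
print(section)
```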

CLI

# List all incidents
python incidents.py list

# Get specific incident
python incidents.py get 2026-04-23-cloudflare-error-1033

# Convert to brief
python incidents.py to-brief 2026-04-23-cloudflare-error-1033
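
The three subcommands above could be wired up with `argparse` along these lines. This is a sketch under assumptions: the actual argument handling in `incidents.py` is not shown in this document, and `build_parser` and `incident_id` are names chosen for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Parser for the list / get / to-brief subcommands (illustrative)."""
    parser = argparse.ArgumentParser(prog="incidents.py")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("list", help="List all incidents")
    # 'get' and 'to-brief' both take one incident identifier.
    for name, help_text in (
        ("get", "Show one incident"),
        ("to-brief", "Convert an incident to a brief"),
    ):
        cmd = sub.add_parser(name, help=help_text)
        cmd.add_argument("incident_id", help="Filename without the .md suffix")
    return parser


args = build_parser().parse_args(
    ["to-brief", "2026-04-23-cloudflare-error-1033"]
)
print(args.command, args.incident_id)
```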

Integration with Pipeline

import asyncio

from incidents import to_brief
from blog.generation.prompts import build_struggle_first_prompt
from content.pipeline import generate_ollama
from content.compliance_filter import ComplianceFilter


async def main():
    # Load real incident
    brief = to_brief("2026-04-23-cloudflare-error-1033")

    # Build prompt
    prompt = build_struggle_first_prompt(brief)

    # Generate (generate_ollama is awaited, so this must run inside a coroutine)
    draft = await generate_ollama(model="phi4:14b", prompt=prompt)

    # Compliance check
    report = ComplianceFilter(strict_mode=True).process(draft)
    if not report.is_compliant:
        print("Issues:", report.names_found, report.dates_found)


asyncio.run(main())

Compliance Filter (Strict Mode)

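from content.compliance_filter import ComplianceFilter

# `draft` is the generated text returned by the pipeline.

# STRICT: blocks on names AND dates
strict_filter = ComplianceFilter(strict_mode=True)
report = strict_filter.process(draft)

# LENIENT: blocks only on names, warns on dates
lenient_filter = ComplianceFilter(strict_mode=False)
report = lenient_filter.process(draft)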
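
To make the strict/lenient distinction concrete, here is a small stand-in that mimics the described behaviour. It is not the code in `content/compliance_filter.py`: the `Report` class, the `check` function, the `blocked_names` parameter, and the ISO-date regex are all assumptions for illustration. Only the field names `names_found`, `dates_found`, and `is_compliant` come from the examples above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Report:
    names_found: list[str] = field(default_factory=list)
    dates_found: list[str] = field(default_factory=list)
    is_compliant: bool = True


def check(draft: str, blocked_names: set[str], strict_mode: bool) -> Report:
    """Flag names and ISO dates; dates only block in strict mode (sketch)."""
    names = [
        name for name in sorted(blocked_names)
        if re.search(rf"\b{re.escape(name)}\b", draft, re.IGNORECASE)
    ]
    # Assumption: only YYYY-MM-DD dates are detected in this sketch.
    dates = re.findall(r"\b\d{4}-\d{2}-\d{2}\b", draft)
    blocking = bool(names) or (strict_mode and bool(dates))
    return Report(names, dates, not blocking)


text = "The tunnel came back on 2026-04-23."
print(check(text, {"Matt"}, strict_mode=True).is_compliant)   # False
print(check(text, {"Matt"}, strict_mode=False).is_compliant)  # True
```

In lenient mode the date still appears in `dates_found`, so a caller can surface it as a warning without blocking publication.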

Future Enhancements

  1. Auto-capture: Hook into error handlers to auto-log incidents
  2. Telegram integration: /logincident bot command
  3. Tag suggestions: Auto-suggest tags based on systems involved
  4. Related incidents: Link related struggles for series posts
  5. Cost tracking: Aggregate time lost by system/tag
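
As a sketch of what the cost-tracking enhancement might look like, the function below sums `cost.time_hours` per tag over already-parsed frontmatter dicts. `hours_by_tag` is a hypothetical name, and the second record in the usage example is invented purely to show aggregation; it is not a real incident.

```python
from collections import defaultdict


def hours_by_tag(incidents: list[dict]) -> dict[str, float]:
    """Sum cost.time_hours per tag across parsed incident metadata."""
    totals: defaultdict[str, float] = defaultdict(float)
    for inc in incidents:
        hours = inc.get("cost", {}).get("time_hours", 0.0)
        # An incident's full duration counts toward each of its tags.
        for tag in inc.get("tags", []):
            totals[tag] += hours
    return dict(totals)


records = [
    {"tags": ["cloudflare", "systemd"], "cost": {"time_hours": 2.5}},
    {"tags": ["systemd"], "cost": {"time_hours": 1.0}},  # made-up example
]
print(hours_by_tag(records))
# prints: {'cloudflare': 2.5, 'systemd': 3.5}
```

Because multi-tag incidents are counted once per tag, the per-tag totals can add up to more than the overall time lost.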

Files

  • incidents/__init__.py — Core library
  • incidents/ — Incident storage directory
  • content/compliance_filter.py — Compliance checking
  • test_pipeline_v2_real.py — Integration test