
Blog Backend Architecture — Socrates 🧠

Author: Socrates
Date: 2026-04-20
Status: Draft — Pending Daedalus review & Director approval


Design Goals

  1. Fast & SEO-friendly — Blog posts must load fast and be indexable by search engines
  2. Fits the existing stack — FastAPI backend, Cloudflare Pages/Tunnel, HTMX + Tailwind frontend
  3. Simple content workflow — Markdown files in git, no heavy CMS
  4. Zero-trust admin — Write operations behind Cloudflare Access
  5. Progressive enhancement — Pages work without JS; HTMX enhances them when JS is available

Architecture Decision: Hybrid Static + Dynamic

Decision: Use a hybrid approach — static HTML for published blog posts (SEO-critical), dynamic API for listing/admin/drafts.

How it works:

┌──────────────────────────────────────────────────────────┐
│                       hoffdesk.com                       │
│                                                          │
│  /blog/                → Static index (Cloudflare Pages) │
│  /blog/{slug}/         → Static post page (CF Pages)     │
│  /blog/category/{cat}  → Static category page            │
│                                                          │
│  /admin/blog/          → Dynamic HTMX admin (CF Tunnel)  │
│  /api/blog/*           → REST API (CF Tunnel)            │
└──────────────────────────────────────────────────────────┘

Rationale:

| Approach | SEO | Speed | Complexity | Verdict |
|---|---|---|---|---|
| Pure SSR (FastAPI renders HTML) | Medium | Medium | Low | ❌ No static caching at edge |
| Pure SPA (JS client) | Poor | Medium | High | ❌ Anti-pattern for our stack |
| SSG (static site gen) | Excellent | Excellent | Medium | ⚠️ Needs build step, no drafts |
| Hybrid (SSG + Dynamic API) | Excellent | Excellent | Medium | Winner |

Published posts get pre-rendered to static HTML and deployed to Cloudflare Pages. The admin interface (create/edit/publish) is a dynamic HTMX app served through the Cloudflare Tunnel, gated by CF Access.

Drafts are only accessible through the admin interface (dynamic, authenticated). Published posts are publicly accessible static pages (edge-cached, fast, SEO-perfect).


Content Storage

Markdown Files + SQLite Metadata

Posts live as Markdown files in git — this is the source of truth for content.

data/blog/
  posts/
    {slug}/
      index.md        ← Markdown content (frontmatter + body)
      images/          ← Post-specific images (optional)
  blog.db             ← SQLite: metadata, search, relationships

Why Markdown + Git?

  • Version control — every edit is tracked. Rollback is git revert.
  • No CMS lock-in — Markdown is portable. If we ever move to Astro/Hugo/whatever, content migrates trivially.
  • Offline authoring — write in any editor, push when ready.
  • Agent-friendly — Wadsworth/Socrates/Daedalus can author posts directly in Markdown.
  • Human-readable — Matt can review diffs in git, no database inspection needed.

Why SQLite for Metadata?

  • Fast queries — listing, filtering, pagination, full-text search on metadata.
  • No external dependency — file-based, already on Beelink.
  • Single source for computed fields — read counts, last-modified timestamps, published state.
  • Rebuildable from Markdown — SQLite is a cache/derived index. If it is corrupted, blog rebuild regenerates it from the Markdown files. Anything that cannot be derived from Markdown, such as read counts, would be lost in a rebuild.

Markdown Frontmatter Schema

---
title: "Building a Multi-Agent Home Dashboard"
slug: "building-multi-agent-home-dashboard"
category: "engineering"
tags: ["openclaw", "fastapi", "home-automation"]
author: "Socrates"
status: "draft"          # draft | published | archived
published_at: null        # ISO 8601, null for drafts
created_at: "2026-04-20T12:00:00-05:00"
updated_at: "2026-04-20T14:30:00-05:00"
excerpt: "How we built a family dashboard with AI agents..."
cover_image: null         # path or URL
featured: false
---

Blog post content goes here...

## Section Title

Regular Markdown...
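
The frontmatter rules above (three allowed statuses, published_at null for drafts, ISO 8601 timestamps) can be checked before a post is indexed. The following is a minimal sketch, assuming the frontmatter has already been parsed into a dict by a YAML library. The helper name `validate_frontmatter`, the set of required fields, and the rule that a published post must carry a timestamp are illustrative assumptions, not part of the design.

```python
from datetime import datetime

ALLOWED_STATUSES = {"draft", "published", "archived"}
REQUIRED_FIELDS = ("title", "slug", "status")  # assumed minimum set


def validate_frontmatter(meta: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field in REQUIRED_FIELDS:
        if not meta.get(field):
            problems.append(f"missing required field: {field}")

    status = meta.get("status")
    if status is not None and status not in ALLOWED_STATUSES:
        problems.append(f"unknown status: {status!r}")

    published_at = meta.get("published_at")
    if status == "draft" and published_at is not None:
        problems.append("drafts must have published_at: null")
    if status == "published" and published_at is None:
        # Assumption: a published post always records when it went live.
        problems.append("published posts need a published_at timestamp")

    # Timestamps are ISO 8601 strings; fromisoformat rejects malformed ones.
    for field in ("published_at", "created_at", "updated_at"):
        value = meta.get(field)
        if value is not None:
            try:
                datetime.fromisoformat(value)
            except (TypeError, ValueError):
                problems.append(f"{field} is not ISO 8601: {value!r}")
    return problems
```

Returning a list of problems rather than raising on the first one lets a full rebuild report every broken file in one pass.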

SQLite Schema

CREATE TABLE posts (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    slug          TEXT NOT NULL UNIQUE,
    title         TEXT NOT NULL,
    category      TEXT NOT NULL DEFAULT 'general',
    author        TEXT NOT NULL DEFAULT 'HoffDesk Team',
    status        TEXT NOT NULL DEFAULT 'draft' CHECK(status IN ('draft','published','archived')),
    excerpt       TEXT,
    cover_image   TEXT,
    featured      INTEGER NOT NULL DEFAULT 0,  -- boolean
    published_at  TEXT,                          -- ISO 8601, NULL for drafts
    created_at    TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at    TEXT NOT NULL DEFAULT (datetime('now')),
    content_path  TEXT NOT NULL,                  -- relative path to index.md
    word_count    INTEGER,
    reading_time  INTEGER                         -- estimated minutes
);

CREATE TABLE tags (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,
    name  TEXT NOT NULL UNIQUE
);

CREATE TABLE post_tags (
    post_id  INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    tag_id   INTEGER NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    PRIMARY KEY (post_id, tag_id)
);

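-- Full-text search on title + excerpt + content.
-- Standalone FTS5 table rather than external-content (content=posts): the
-- Markdown body is not a column of posts, so FTS5 could not read it from there.
-- The builder inserts one row per post with rowid = posts.id.
CREATE VIRTUAL TABLE posts_fts USING fts5(
    slug, title, excerpt, content
);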

CREATE INDEX idx_posts_status    ON posts(status);
CREATE INDEX idx_posts_category  ON posts(category);
CREATE INDEX idx_posts_published ON posts(published_at DESC);
CREATE INDEX idx_posts_featured  ON posts(featured) WHERE featured = 1;
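
A listing query against this schema might look like the sketch below. It uses only the standard-library sqlite3 module and ? placeholders. The function name `list_published`, the selected columns, and the page/per_page parameters are assumptions made for illustration.

```python
import sqlite3
from typing import Optional

COLUMNS = ("slug", "title", "excerpt", "published_at")


def list_published(conn: sqlite3.Connection, category: Optional[str] = None,
                   page: int = 1, per_page: int = 10) -> list[dict]:
    """Return one page of published posts, newest first."""
    sql = f"SELECT {', '.join(COLUMNS)} FROM posts WHERE status = ?"
    params: list = ["published"]
    if category is not None:
        sql += " AND category = ?"
        params.append(category)
    # Values are always bound as parameters, never interpolated into the SQL.
    sql += " ORDER BY published_at DESC LIMIT ? OFFSET ?"
    params += [per_page, (page - 1) * per_page]
    rows = conn.execute(sql, params).fetchall()
    return [dict(zip(COLUMNS, row)) for row in rows]
```

The only text interpolated into the statement is the fixed column tuple defined in the module, so user input never reaches the SQL string.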

API Endpoints

Public API (unauthenticated, read-only)

These endpoints serve the blog frontend and are cached at the Cloudflare edge.

| Method | Path | Description |
|---|---|---|
| GET | /api/blog/posts | List published posts (paginated, filterable) |
| GET | /api/blog/posts/{slug} | Get a single published post (full content) |
| GET | /api/blog/categories | List all categories with post counts |
| GET | /api/blog/tags | List all tags with post counts |
| GET | /api/blog/posts/{slug}/related | Get related posts |
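
The "tags with post counts" endpoint reduces to a join over the tags, post_tags, and posts tables. This is a sketch only: `tag_counts` is a hypothetical helper, and counting published posts only is an assumption that follows from the endpoint being public.

```python
import sqlite3


def tag_counts(conn: sqlite3.Connection) -> list[tuple[str, int]]:
    """Return (tag name, number of published posts) pairs, most used first."""
    sql = """
        SELECT t.name, COUNT(*) AS post_count
        FROM tags AS t
        JOIN post_tags AS pt ON pt.tag_id = t.id
        JOIN posts AS p ON p.id = pt.post_id
        WHERE p.status = ?
        GROUP BY t.id
        ORDER BY post_count DESC, t.name
    """
    return conn.execute(sql, ("published",)).fetchall()
```

Tags that are attached only to drafts or archived posts drop out of the result because of the inner joins.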

Admin API (authenticated via Cloudflare Access)

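These endpoints are behind CF Access (Email OTP). Access policies match on hostname and path, not on HTTP method, so the write endpoints live under /api/blog/admin/ to keep them apart from the public read endpoints. The admin HTMX interface uses these directly.

| Method | Path | Description |
|---|---|---|
| POST | /api/blog/admin/posts | Create a new post (draft or published) |
| PATCH | /api/blog/admin/posts/{slug} | Update a post (partial update) |
| DELETE | /api/blog/admin/posts/{slug} | Delete a post (soft-delete: status → archived) |
| POST | /api/blog/admin/posts/{slug}/publish | Publish a draft |
| POST | /api/blog/admin/posts/{slug}/unpublish | Unpublish (revert to draft) |
| POST | /api/blog/admin/posts/{slug}/regenerate | Regenerate static HTML for a post |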
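
The publish, unpublish, and delete operations amount to a small state machine over the status field. The sketch below shows one way to express it. Which statuses each action may start from is an assumption (for example, that an archived post cannot be published directly), and `apply_action` is a hypothetical helper.

```python
from datetime import datetime, timezone
from typing import Optional

# action -> (statuses it may start from, resulting status); assumed rules
TRANSITIONS = {
    "publish": ({"draft"}, "published"),
    "unpublish": ({"published"}, "draft"),
    "delete": ({"draft", "published"}, "archived"),  # soft delete
}


def apply_action(post: dict, action: str, now: Optional[datetime] = None) -> dict:
    """Return a copy of post with status and timestamps updated for action."""
    if action not in TRANSITIONS:
        raise ValueError(f"unknown action: {action}")
    allowed_from, new_status = TRANSITIONS[action]
    if post["status"] not in allowed_from:
        raise ValueError(f"cannot {action} a post in status {post['status']!r}")

    stamp = (now or datetime.now(timezone.utc)).isoformat()
    updated = {**post, "status": new_status, "updated_at": stamp}
    if action == "publish":
        updated["published_at"] = stamp
    elif action == "unpublish":
        updated["published_at"] = None  # frontmatter: null for drafts
    return updated
```

Keeping the rules in one table makes it easy to reject invalid requests with a 4xx response before any file or database write happens.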

Build/Admin Commands (CLI, not API)

These are server-side operations triggered by the admin or git hooks.

| Command | Description |
|---|---|
| blog rebuild | Regenerate SQLite from Markdown + regenerate all static HTML |
| blog generate {slug} | Generate static HTML for a single post |
| blog generate --all | Regenerate all published post HTML |
| blog deploy | Push generated static files to Cloudflare Pages |

Static Generation Pipeline

Markdown files ──→ blog rebuild ──→ Static HTML + JSON
      │                             ├→ Cloudflare Pages (public)
      │                             └→ Blog index JSON (for API)
      └──→ SQLite metadata ──→ API queries

Build Process

  1. Parse all Markdown files in data/blog/posts/
  2. Extract frontmatter → upsert into SQLite
  3. Render published posts to HTML using Jinja2 templates
  4. Generate index pages (all posts, by category, by tag)
  5. Generate JSON manifests (post list, sitemap, RSS feed)
  6. Write static output to data/blog/dist/
  7. Deploy to Cloudflare Pages via git push or Wrangler
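
Step 2 of the build can be sketched with SQLite's upsert syntax. This is a minimal sketch under several assumptions: `upsert_post` is a hypothetical helper, `meta` is frontmatter already parsed into a dict, the reading speed of 200 words per minute is a guess, and content_path is assumed to be relative to data/blog/. Only a subset of the columns is shown.

```python
import math
import sqlite3

WORDS_PER_MINUTE = 200  # assumed reading speed; not specified in the design


def upsert_post(conn: sqlite3.Connection, meta: dict, body: str) -> None:
    """Insert or refresh the metadata row for one Markdown file."""
    word_count = len(body.split())
    reading_time = max(1, math.ceil(word_count / WORDS_PER_MINUTE))
    conn.execute(
        """
        INSERT INTO posts (slug, title, status, published_at, content_path,
                           word_count, reading_time)
        VALUES (:slug, :title, :status, :published_at, :content_path,
                :word_count, :reading_time)
        ON CONFLICT(slug) DO UPDATE SET
            title = excluded.title,
            status = excluded.status,
            published_at = excluded.published_at,
            content_path = excluded.content_path,
            word_count = excluded.word_count,
            reading_time = excluded.reading_time
        """,
        {
            "slug": meta["slug"],
            "title": meta["title"],
            "status": meta.get("status", "draft"),
            "published_at": meta.get("published_at"),
            "content_path": f"posts/{meta['slug']}/index.md",  # assumed layout
            "word_count": word_count,
            "reading_time": reading_time,
        },
    )
```

Because the conflict target is the unique slug column, running the build twice over the same file updates the row in place and keeps its id stable.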

When does rebuild happen?

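| Trigger | Action |
|---|---|
| POST /api/blog/admin/posts (create) | If created as published: generate post HTML + update index. Drafts generate nothing |
| PATCH /api/blog/admin/posts/{slug} (update published post) | Regenerate that post + index |
| POST /api/blog/admin/posts/{slug}/publish | Generate post HTML + update index |
| POST /api/blog/admin/posts/{slug}/unpublish | Remove post HTML + update index |
| DELETE /api/blog/admin/posts/{slug} | Remove post HTML + update index |
| git push to blog content branch | Full rebuild via CI hook |
| Manual blog rebuild CLI command | Full rebuild |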

Static HTML Template Structure

dist/
  blog/
    index.html                    ← Blog home (all posts)
    {slug}/
      index.html                   ← Individual post
    category/
      {category}/index.html       ← Category archive
    tag/
      {tag}/index.html             ← Tag archive
    feed.xml                      ← RSS feed
    sitemap.xml                   ← SEO sitemap
    api/
      posts.json                  ← Post listing JSON (for HTMX)
      posts/{slug}.json           ← Individual post JSON
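
Given the layout above, the set of HTML and feed files a full build writes can be derived from post metadata alone. The sketch below is illustrative: `plan_outputs` is a hypothetical helper, posts are assumed to be dicts with slug, status, category, and tags keys, and the JSON manifests under api/ are left out for brevity.

```python
from pathlib import PurePosixPath


def plan_outputs(posts: list[dict], dist: str = "data/blog/dist") -> list[str]:
    """List the files one full build would write for the given posts."""
    root = PurePosixPath(dist) / "blog"
    # Only published posts produce public pages; drafts stay in the admin UI.
    published = [p for p in posts if p["status"] == "published"]

    paths = {root / "index.html", root / "feed.xml", root / "sitemap.xml"}
    for post in published:
        paths.add(root / post["slug"] / "index.html")
        paths.add(root / "category" / post["category"] / "index.html")
        for tag in post.get("tags", []):
            paths.add(root / "tag" / tag / "index.html")
    return sorted(str(p) for p in paths)
```

Computing the plan before rendering also gives a cheap way to find stale files: anything in dist/ that is not in the plan belongs to a post that has since been unpublished or archived.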

Integration with Existing Stack

FastAPI Integration

Blog routes live in an APIRouter that the existing FastAPI app includes:

# In main FastAPI app
from blog.router import blog_router

app.include_router(
    blog_router,
    prefix="/api/blog",
    tags=["blog"]
)

The blog module is self-contained:

  • blog/router.py — FastAPI routes
  • blog/models.py — Pydantic models
  • blog/service.py — Business logic (CRUD, publish, generate)
  • blog/storage.py — SQLite + Markdown I/O
  • blog/builder.py — Static HTML generation
  • blog/templates/ — Jinja2 templates for HTML generation

Authentication Boundary

Public (no auth):
  hoffdesk.com/blog/*          → Cloudflare Pages (static)
  api.hoffdesk.com/api/blog/posts/* → FastAPI read endpoints

Authenticated (CF Access Email OTP):
  api.hoffdesk.com/admin/blog/* → HTMX admin interface
  api.hoffdesk.com/api/blog/admin/* → Write endpoints

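CF Access policies enforce authentication at the Cloudflare edge before requests reach Beelink. The FastAPI app doesn't need to handle auth; it trusts that requests to /admin/* and /api/blog/admin/* have been authenticated by Cloudflare. That trust holds only while the origin is reachable exclusively through the Cloudflare Tunnel, so the app must not be exposed on any other interface.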

Webhook Integration

When a post is published or updated:

  1. FastAPI generates static files locally
  2. Commits to the blog's git repo (or copies to the Pages project)
  3. Optionally triggers a Cloudflare Pages deploy via webhook at hook.hoffdesk.com
  4. Cache purge request to Cloudflare API (if needed)
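
Step 4 can be sketched with the standard library alone. The example builds a purge-by-URL request for Cloudflare's zone purge_cache endpoint without sending it. The helper names, the zone ID and token placeholders, and the choice of which URLs to purge for a post are assumptions.

```python
import json
import urllib.request


def urls_for_post(slug: str, base: str = "https://hoffdesk.com") -> list[str]:
    """URLs assumed to go stale when one post changes: the index and the post."""
    return [f"{base}/blog/", f"{base}/blog/{slug}/"]


def build_purge_request(zone_id: str, api_token: str,
                        urls: list[str]) -> urllib.request.Request:
    """Build (but do not send) a Cloudflare purge-by-URL request."""
    return urllib.request.Request(
        url=f"https://api.cloudflare.com/client/v4/zones/{zone_id}/purge_cache",
        data=json.dumps({"files": urls}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Sending it is a single `urllib.request.urlopen(req)` call. Separating construction from sending keeps the publish path easy to test without network access.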

Caching Strategy

| Layer | TTL | Purge Trigger |
|---|---|---|
| Cloudflare Edge (static HTML) | 1 year | On deploy (cache purge) |
| Cloudflare Edge (API JSON) | 5 min | On publish/update |
| FastAPI in-memory (SQLite queries) | 60s | On write operations |
| Browser (HTMX polling) | N/A | Polling every 60s for admin |
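
The in-memory layer (60s TTL, purged on write operations) could be as small as the class below. This is a sketch, not the planned implementation: `TTLCache` and its method names are hypothetical, and the injectable clock exists only to make the expiry logic testable.

```python
import time
from typing import Any, Callable


class TTLCache:
    """Tiny in-process cache: entries expire after ttl seconds or on clear()."""

    def __init__(self, ttl: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[], Any]) -> Any:
        """Return the cached value for key, calling loader if missing or stale."""
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        value = loader()
        self._entries[key] = (now, value)
        return value

    def clear(self) -> None:
        """Call after any write operation so readers see fresh data."""
        self._entries.clear()
```

A read endpoint would wrap its SQLite query in `get_or_load`, and every admin write handler would call `clear()` after committing.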

File Structure (in Repo)

hoffdesk-api/
  blog/
    __init__.py
    router.py          ← FastAPI routes
    models.py           ← Pydantic request/response models
    service.py          ← Business logic
    storage.py           ← SQLite + Markdown I/O
    builder.py           ← Static site generation
    templates/
      post.html.j2       ← Jinja2 template for individual posts
      index.html.j2      ← Blog index page
      category.html.j2   ← Category archive
      tag.html.j2         ← Tag archive
      feed.xml.j2         ← RSS feed template
    static/               ← Static assets (CSS, JS) — symlinked from design tokens
data/
  blog/
    posts/                ← Markdown source files (git-tracked)
    blog.db               ← SQLite metadata (gitignored, rebuilt from MD)
    dist/                 ← Generated static HTML (gitignored, deployed to CF Pages)

Security Considerations

  1. Admin routes behind CF Access — No auth logic in FastAPI. Cloudflare handles it.
  2. No raw HTML in Markdown — Disable raw HTML in the Markdown renderer (markdown-it with HTML off) or sanitize the rendered output (bleach; note that bleach is deprecated upstream).
  3. SQL injection — Use parameterized queries exclusively (SQLAlchemy Core or raw with ? placeholders).
  4. Rate limiting — Cloudflare rate limiting on public API endpoints.
  5. No file uploads via API — Images go into data/blog/posts/{slug}/images/ via git push or direct server placement. No upload endpoint.
  6. Slug validation — Alphanumeric + hyphens only, max 255 chars; reject reserved slugs (api, admin, category, tag, feed).
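
The slug rule in item 6 maps onto a short check. In this sketch the restriction to lowercase and the ban on leading, trailing, or doubled hyphens are assumptions that go beyond "alphanumeric + hyphens"; `is_valid_slug` is a hypothetical helper.

```python
import re

RESERVED_SLUGS = {"api", "admin", "category", "tag", "feed"}
MAX_SLUG_LENGTH = 255
# Assumed stricter form: lowercase words separated by single hyphens.
SLUG_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_slug(slug: str) -> bool:
    """True if slug is safe to use as a URL segment and directory name."""
    return (
        len(slug) <= MAX_SLUG_LENGTH
        and slug not in RESERVED_SLUGS
        and SLUG_PATTERN.fullmatch(slug) is not None
    )
```

Using fullmatch rather than match matters here: it rejects path-traversal attempts such as ../etc/passwd, which is relevant because the slug becomes a directory name under data/blog/posts/.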

Migration Path (Future)

This architecture is deliberately simple. If HoffDesk outgrows it:

| From | To | Effort |
|---|---|---|
| Markdown files | Database-backed CMS | Medium — migrate content, update storage layer |
| SQLite | PostgreSQL | Low — SQLAlchemy abstracts this, except FTS5 search, which is SQLite-specific and would need porting to PostgreSQL full-text search |
| Static generation | Server-side rendering | Medium — swap builder for Jinja2 templates served dynamically |
| Cloudflare Pages | Vercel/Netlify | Low — static files deploy anywhere |

The key insight: content is Markdown, metadata is SQLite, output is static HTML. Each layer can be swapped independently.


Resolved Design Decisions

  1. Post page layout: Full HTML pages (SEO-primary, static HTML at edge)
  2. Category/tag archive: Card grid (2-col desktop, 1-col mobile, matches post index)
  3. Admin UI: Functional-minimal first (Socrates builds plumbing, Daedalus skins in Sprint 2)
  4. Image handling: Simple <img> for MVP (<picture>/srcset deferred to Sprint 2)
  5. RSS feed: Full content, RSS 2.0 (wider support, matches content strategy)

This document lives at shared/project-docs/blog/blog-backend-architecture.md
Socrates 🧠