📄 blog-backend-architecture.md 14,177 bytes Apr 20, 2026 📋 Raw

Blog Backend Architecture — Socrates 🧠

Author: Socrates
Date: 2026-04-20
Status: Draft — Pending Daedalus review & Director approval

Design Goals

Fast & SEO-friendly — Blog posts must load fast and be indexable by search engines
Fits the existing stack — FastAPI backend, Cloudflare Pages/Tunnel, HTMX + Tailwind frontend
Simple content workflow — Markdown files in git, no heavy CMS
Zero-trust admin — Write operations behind Cloudflare Access
Progressive enhancement — Works without JS (HTMX), flies with it

Architecture Decision: Hybrid Static + Dynamic

Decision: Use a hybrid approach — static HTML for published blog posts (SEO-critical), dynamic API for listing/admin/drafts.

How it works:

┌─────────────────────────────────────────────────────┐
│                   hoffdesk.com                       │
│                                                      │
│  /blog/              → Static index (Cloudflare Pages)│
│  /blog/{slug}/       → Static post page (CF Pages)   │
│  /blog/category/{cat} → Static category page          │
│                                                      │
│  /admin/blog/        → Dynamic HTMX admin (CF Tunnel)│
│  /api/blog/*         → REST API (CF Tunnel)           │
└─────────────────────────────────────────────────────┘

Rationale:

Approach	SEO	Speed	Complexity	Verdict
Pure SSR (FastAPI renders HTML)	Medium	Medium	Low	❌ No static caching at edge
Pure SPA (JS client)	Poor	Medium	High	❌ Anti-pattern for our stack
SSG (static site gen)	Excellent	Excellent	Medium	⚠️ Needs build step, no drafts
Hybrid (SSG + Dynamic API)	Excellent	Excellent	Medium	✅ Winner

Published posts get pre-rendered to static HTML and deployed to Cloudflare Pages. The admin interface (create/edit/publish) is a dynamic HTMX app served through the Cloudflare Tunnel, gated by CF Access.

Drafts are only accessible through the admin interface (dynamic, authenticated). Published posts are publicly accessible static pages (edge-cached, fast, SEO-perfect).

Content Storage

Markdown Files + SQLite Metadata

Posts live as Markdown files in git — this is the source of truth for content.

data/blog/
  posts/
    {slug}/
      index.md        ← Markdown content (frontmatter + body)
      images/          ← Post-specific images (optional)
  blog.db             ← SQLite: metadata, search, relationships

Why Markdown + Git?

Version control — every edit is tracked. Rollback is git revert.
No CMS lock-in — Markdown is portable. If we ever move to Astro/Hugo/whatever, content migrates trivially.
Offline authoring — write in any editor, push when ready.
Agent-friendly — Wadsworth/Socrates/Daedalus can author posts directly in Markdown.
Human-readable — Matt can review diffs in git, no database inspection needed.

Why SQLite for Metadata?

Fast queries — listing, filtering, pagination, full-text search on metadata.
No external dependency — file-based, already on Beelink.
Single source for computed fields — read counts, last-modified timestamps, published state.
Rebuildable from Markdown — SQLite is a cache/derived index. If it corrupts, blog rebuild regenerates from Markdown files.

Markdown Frontmatter Schema

---
title: "Building a Multi-Agent Home Dashboard"
slug: "building-multi-agent-home-dashboard"
category: "engineering"
tags: ["openclaw", "fastapi", "home-automation"]
author: "Socrates"
status: "draft"          # draft | published | archived
published_at: null        # ISO 8601, null for drafts
created_at: "2026-04-20T12:00:00-05:00"
updated_at: "2026-04-20T14:30:00-05:00"
excerpt: "How we built a family dashboard with AI agents..."
cover_image: null         # path or URL
featured: false
---

Blog post content goes here...

## Section Title

Regular Markdown...

SQLite Schema

CREATE TABLE posts (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    slug          TEXT NOT NULL UNIQUE,
    title         TEXT NOT NULL,
    category      TEXT NOT NULL DEFAULT 'general',
    author        TEXT NOT NULL DEFAULT 'HoffDesk Team',
    status        TEXT NOT NULL DEFAULT 'draft' CHECK(status IN ('draft','published','archived')),
    excerpt       TEXT,
    cover_image   TEXT,
    featured      INTEGER NOT NULL DEFAULT 0,  -- boolean
    published_at  TEXT,                          -- ISO 8601, NULL for drafts
    created_at    TEXT NOT NULL DEFAULT (datetime('now')),
    updated_at    TEXT NOT NULL DEFAULT (datetime('now')),
    content_path  TEXT NOT NULL,                  -- relative path to index.md
    word_count    INTEGER,
    reading_time  INTEGER                         -- estimated minutes
);

CREATE TABLE tags (
    id    INTEGER PRIMARY KEY AUTOINCREMENT,
    name  TEXT NOT NULL UNIQUE
);

CREATE TABLE post_tags (
    post_id  INTEGER NOT NULL REFERENCES posts(id) ON DELETE CASCADE,
    tag_id   INTEGER NOT NULL REFERENCES tags(id) ON DELETE CASCADE,
    PRIMARY KEY (post_id, tag_id)
);

-- Full-text search on title + excerpt + content
CREATE VIRTUAL TABLE posts_fts USING fts5(
    slug, title, excerpt, content,
    content=posts,
    content_rowid=id
);

CREATE INDEX idx_posts_status    ON posts(status);
CREATE INDEX idx_posts_category  ON posts(category);
CREATE INDEX idx_posts_published ON posts(published_at DESC);
CREATE INDEX idx_posts_featured  ON posts(featured) WHERE featured = 1;

API Endpoints

Public API (unauthenticated, read-only)

These endpoints serve the blog frontend and are cached at the Cloudflare edge.

Method	Path	Description
`GET`	`/api/blog/posts`	List published posts (paginated, filterable)
`GET`	`/api/blog/posts/{slug}`	Get a single published post (full content)
`GET`	`/api/blog/categories`	List all categories with post counts
`GET`	`/api/blog/tags`	List all tags with post counts
`GET`	`/api/blog/posts/{slug}/related`	Get related posts

Admin API (authenticated via Cloudflare Access)

These endpoints are behind CF Access (Email OTP). The admin HTMX interface uses these directly.

Method	Path	Description
`POST`	`/api/blog/posts`	Create a new post (draft or published)
`PATCH`	`/api/blog/posts/{slug}`	Update a post (partial update)
`DELETE`	`/api/blog/posts/{slug}`	Delete a post (soft-delete: status → archived)
`POST`	`/api/blog/posts/{slug}/publish`	Publish a draft
`POST`	`/api/blog/posts/{slug}/unpublish`	Unpublish (revert to draft)
`POST`	`/api/blog/posts/{slug}/regenerate`	Regenerate static HTML for a post

Build/Admin Commands (CLI, not API)

These are server-side operations triggered by the admin or git hooks.

Command	Description
`blog rebuild`	Regenerate SQLite from Markdown + regenerate all static HTML
`blog generate {slug}`	Generate static HTML for a single post
`blog generate --all`	Regenerate all published post HTML
`blog deploy`	Push generated static files to Cloudflare Pages

Static Generation Pipeline

Markdown files  ──→  blog build  ──→  Static HTML + JSON
      │                                    │
      │                                    ├→ Cloudflare Pages (public)
      │                                    └→ Blog index JSON (for API)
      │
      └──→  SQLite metadata  ──→  API queries

Build Process

Parse all Markdown files in data/blog/posts/
Extract frontmatter → upsert into SQLite
Render published posts to HTML using Jinja2 templates
Generate index pages (all posts, by category, by tag)
Generate JSON manifests (post list, sitemap, RSS feed)
Write static output to data/blog/dist/
Deploy to Cloudflare Pages via git push or Wrangler

When does rebuild happen?

Trigger	Action
`POST /api/blog/posts` (create)	Generate new post + update index
`PATCH /api/blog/posts/{slug}` (update published post)	Regenerate that post + index
`POST /api/blog/posts/{slug}/publish`	Generate post HTML + update index
`DELETE /api/blog/posts/{slug}`	Remove post HTML + update index
`git push` to blog content branch	Full rebuild via CI hook
Manual `blog rebuild` CLI command	Full rebuild

Static HTML Template Structure

dist/
  blog/
    index.html                    ← Blog home (all posts)
    {slug}/
      index.html                   ← Individual post
    category/
      {category}/index.html       ← Category archive
    tag/
      {tag}/index.html             ← Tag archive
    feed.xml                      ← RSS feed
    sitemap.xml                   ← SEO sitemap
    api/
      posts.json                  ← Post listing JSON (for HTMX)
      posts/{slug}.json           ← Individual post JSON

Integration with Existing Stack

FastAPI Integration

Blog routes are a sub-application mounted on the existing FastAPI app:

# In main FastAPI app
from blog.router import blog_router

app.include_router(
    blog_router,
    prefix="/api/blog",
    tags=["blog"]
)

The blog module is self-contained:
- blog/router.py — FastAPI routes
- blog/models.py — Pydantic models
- blog/service.py — Business logic (CRUD, publish, generate)
- blog/storage.py — SQLite + Markdown I/O
- blog/builder.py — Static HTML generation
- blog/templates/ — Jinja2 templates for HTML generation

Authentication Boundary

Public (no auth):
  hoffdesk.com/blog/*          → Cloudflare Pages (static)
  api.hoffdesk.com/api/blog/posts/* → FastAPI read endpoints

Authenticated (CF Access Email OTP):
  api.hoffdesk.com/admin/blog/* → HTMX admin interface
  api.hoffdesk.com/api/blog/admin/* → Write endpoints

CF Access policies enforce authentication at the Cloudflare edge before requests reach Beelink. The FastAPI app doesn't need to handle auth — it trusts that requests to /admin/* and /api/blog/admin/* have been authenticated by Cloudflare.

Webhook Integration

When a post is published or updated:

FastAPI generates static files locally
Commits to the blog's git repo (or copies to the Pages project)
Optionally triggers a Cloudflare Pages deploy via webhook at hook.hoffdesk.com
Cache purge request to Cloudflare API (if needed)

Caching Strategy

Layer	TTL	Purge Trigger
Cloudflare Edge (static HTML)	1 year	On deploy (cache purge)
Cloudflare Edge (API JSON)	5 min	On publish/update
FastAPI in-memory (SQLite queries)	60s	On write operations
Browser (HTMX polling)	N/A	Polling every 60s for admin

File Structure (in Repo)

hoffdesk-api/
  blog/
    __init__.py
    router.py          ← FastAPI routes
    models.py           ← Pydantic request/response models
    service.py          ← Business logic
    storage.py           ← SQLite + Markdown I/O
    builder.py           ← Static site generation
    templates/
      post.html.j2       ← Jinja2 template for individual posts
      index.html.j2      ← Blog index page
      category.html.j2   ← Category archive
      tag.html.j2         ← Tag archive
      feed.xml.j2         ← RSS feed template
    static/               ← Static assets (CSS, JS) — symlinked from design tokens
data/
  blog/
    posts/                ← Markdown source files (git-tracked)
    blog.db               ← SQLite metadata (gitignored, rebuilt from MD)
    dist/                 ← Generated static HTML (gitignored, deployed to CF Pages)

Security Considerations

Admin routes behind CF Access — No auth logic in FastAPI. Cloudflare handles it.
No raw HTML in Markdown — Sanitize rendered Markdown (bleach/markdown-it with HTML disabled).
SQL injection — Use parameterized queries exclusively (SQLAlchemy Core or raw with ? placeholders).
Rate limiting — Cloudflare rate limiting on public API endpoints.
No file uploads via API — Images go into data/blog/posts/{slug}/images/ via git push or direct server placement. No upload endpoint.
Slug validation — Alphanumeric + hyphens only, max 255 chars, reserved slugs (api, admin, category, tag, feed).

Migration Path (Future)

This architecture is deliberately simple. If HoffDesk outgrows it:

From	To	Effort
Markdown files	Database-backed CMS	Medium — migrate content, update storage layer
SQLite	PostgreSQL	Low — SQLAlchemy abstracts this
Static generation	Server-side rendering	Medium — swap builder for Jinja2 templates served dynamically
Cloudflare Pages	Vercel/Netlify	Low — static files deploy anywhere

The key insight: content is Markdown, metadata is SQLite, output is static HTML. Each layer can be swapped independently.

Resolved Design Decisions

Post page layout → Full HTML pages (SEO-primary, static HTML at edge)
Category/tag archive → Card grid (2-col desktop, 1-col mobile, matches post index)
Admin UI → Functional-minimal first (Socrates builds plumbing, Daedalus skins in Sprint 2)
Image handling → Simple <img> for MVP (<picture>/srcset deferred to Sprint 2)
RSS feed → Full content, RSS 2.0 (wider support, matches content strategy)

This document lives at shared/project-docs/blog/blog-backend-architecture.md
Socrates 🧠

← Back