
Image Generation Pipeline Research

Status: Investigating
Date: 2026-04-21
Goal: Local image generation for HoffDesk blog articles

Executive Summary

Exploring options for automated hero images, thumbnails, and article visuals using the Gaming PC's 3080 Ti (12GB VRAM). Target: sovereign-first pipeline that doesn't depend on external APIs for standard content.

Hardware Constraints

Resource	Available
GPU	RTX 3080 Ti (12GB VRAM)
CPU	Ryzen 9 5900X
RAM	64GB DDR4
Storage	2TB NVMe (local)

Target: All inference runs locally. Cloud only for overflow/fallback.

Model Evaluation

Tier 1: Production Candidates

FLUX.1 [dev]

Attribute	Assessment
Quality	Excellent — photorealistic, coherent typography
Speed	~30-60s per 1024×1024 image
VRAM	~10GB at 1024×1024
License	Open weights under the FLUX.1 [dev] Non-Commercial License (commercial use needs review)
Use Case	Hero images, featured posts, high-visibility content
Pros	Best-in-class quality, text rendering, composition
Cons	Slower, memory-intensive, occasional NSFW false positives

Installation:

# ComfyUI + FLUX workflow
# Requires fp8 or quantized variant for 12GB
# Recommended: flux1-dev-bnb-nf4 (4-bit quantized)
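
The need for a quantized variant follows from weight size alone. A rough sketch of the arithmetic, assuming roughly 12 billion parameters for the FLUX.1 [dev] transformer and ignoring the text encoders, VAE, and activations (all of which add more on top):

```python
# Back-of-envelope weight memory for the FLUX.1 [dev] transformer.
# Assumption: ~12 billion parameters; text encoders, VAE and activations are NOT counted.
PARAMS = 12e9
BYTES_PER_PARAM = {"fp16": 2.0, "fp8": 1.0, "nf4": 0.5}


def weight_gib(params: float, bytes_per_param: float) -> float:
    """Return the approximate size of the weights in GiB."""
    return params * bytes_per_param / 2**30


for name, width in BYTES_PER_PARAM.items():
    print(f"{name}: ~{weight_gib(PARAMS, width):.1f} GiB")
```

Under these assumptions fp16 weights alone exceed the 12GB card, fp8 leaves very little headroom, and the 4-bit variant fits with room for activations.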

SDXL Turbo

Attribute	Assessment
Quality	Good — fast, decent composition
Speed	~2-5s per 512×512 image
VRAM	~6GB at 512×512
License	Stability AI license, not a standard permissive one (verify commercial terms before use)
Use Case	Thumbnails, rapid iteration, batch generation
Pros	Fast, low VRAM, good for experimentation
Cons	Less detailed, struggles with complex prompts

Installation:

# Automatic1111 or ComfyUI
# Model: stabilityai/sdxl-turbo (sd_xl_turbo_1.0_fp16.safetensors)
# Steps: 1-4 (yes, really)
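
For a quick speed test outside Automatic1111 or ComfyUI, the same model can be driven from Hugging Face diffusers. This is a swapped-in tool, not something the pipeline has adopted. The sketch assumes `torch` and `diffusers` are installed and a CUDA GPU is available; `turbo_kwargs` and `generate_thumbnail` are hypothetical helper names.

```python
def turbo_kwargs(prompt: str, steps: int = 1, size: int = 512) -> dict:
    """Build pipeline arguments for SDXL Turbo (1-4 steps, no classifier-free guidance)."""
    if not 1 <= steps <= 4:
        raise ValueError("SDXL Turbo is meant to run with 1-4 steps")
    return {
        "prompt": prompt,
        "num_inference_steps": steps,
        "guidance_scale": 0.0,  # Turbo is trained to run without guidance
        "width": size,
        "height": size,
    }


def generate_thumbnail(prompt: str, steps: int = 1):
    """Load the model and return one PIL image. A real service would cache the pipeline."""
    # Imported lazily so turbo_kwargs can be used without the GPU stack installed.
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    )
    pipe.to("cuda")
    return pipe(**turbo_kwargs(prompt, steps)).images[0]
```

Calling `generate_thumbnail("isometric server rack, flat colors").save("thumb.png")` would write a single 512×512 thumbnail.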

Tier 2: Alternatives

Stable Diffusion 3

Status: Wait for optimized local release
Pros: Better prompt adherence than SDXL
Cons: Licensing unclear; local VRAM needs unverified and depend on the variant (SD3 Medium has fewer parameters than FLUX.1 [dev])

Stable Diffusion XL Base + Refiner

Status: Superseded by Turbo and FLUX for this pipeline
Note: Keep as fallback if Turbo fails

Tier 3: Cloud Fallback

DALL-E 3 (OpenAI)

Use: Complex compositions where local models fail
Cost: ~$0.04-0.08/image
Constraint: Only for overflow, not default
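
One possible way to enforce the overflow-only constraint is to retry locally before spending on the cloud. A minimal sketch, assuming hypothetical `local_generate` and `cloud_generate` callables that return image bytes and raise on failure:

```python
from typing import Callable


def generate_with_fallback(
    prompt: str,
    local_generate: Callable[[str], bytes],
    cloud_generate: Callable[[str], bytes],
    local_attempts: int = 2,
) -> tuple[bytes, str]:
    """Try the local model first; use the cloud API only after repeated local failures."""
    for _ in range(local_attempts):
        try:
            return local_generate(prompt), "local"
        except Exception:
            continue  # e.g. out-of-memory or a rejected output; retry locally
    return cloud_generate(prompt), "cloud"
```

Returning the source label alongside the image makes it easy to log how often the cloud path is actually used.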

Midjourney

Use: High-artistry needs
Cost: $10-30/month
Constraint: Discord workflow not automatable

Pipeline Architecture

Option A: ComfyUI API Server (Recommended)

Content Pipeline → ComfyUI API (Gaming PC) → Generated Images
                       ↓
              Store in /shared/assets/images/
                       ↓
              Git LFS or R2 for production

Pros:
- Node-based, highly configurable
- Queue system for batch jobs
- REST API for integration
- FLUX + SDXL in one workflow

Cons:
- More complex setup
- Higher memory footprint
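
As an illustration of the REST integration: ComfyUI's built-in server accepts a workflow (exported in API format) as JSON on `POST /prompt`, by default on port 8188, and replies with a `prompt_id`. The sketch below uses only the standard library; the hostname `gaming-pc.local` is an assumption, and the workflow dict is whatever the exported graph contains.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://gaming-pc.local:8188"  # assumption: hostname of the Gaming PC


def build_prompt_request(workflow: dict, base_url: str = COMFY_URL) -> urllib.request.Request:
    """Wrap an API-format workflow in the JSON envelope that /prompt expects."""
    payload = json.dumps({"prompt": workflow, "client_id": uuid.uuid4().hex}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, base_url: str = COMFY_URL) -> str:
    """Queue the workflow on the ComfyUI server and return its prompt_id."""
    with urllib.request.urlopen(build_prompt_request(workflow, base_url)) as resp:
        return json.load(resp)["prompt_id"]
```

The returned id can then be polled via `GET /history/<prompt_id>` to find the output filenames once the job finishes.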

Option B: Ollama-style Local API

Content Pipeline → Local diffusion API → Generated Images

Pros:
- Simpler integration (like existing LLM setup)
- Lower overhead

Cons:
- Fewer features than ComfyUI
- May need custom wrapper
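
To gauge how much work the custom wrapper would be, here is a minimal standard-library sketch. The `/generate` route, the JSON shape, and the `generate` callable (prompt in, PNG bytes out) are all assumptions for illustration, not an existing API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(generate):
    """Build a request handler around a generate(prompt) -> PNG bytes callable."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/generate":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                prompt = json.loads(self.rfile.read(length) or b"{}")["prompt"]
            except (ValueError, KeyError, TypeError):
                self.send_error(400, "expected JSON with a 'prompt' field")
                return
            png = generate(prompt)
            self.send_response(200)
            self.send_header("Content-Type", "image/png")
            self.send_header("Content-Length", str(len(png)))
            self.end_headers()
            self.wfile.write(png)

    return Handler


def serve(generate, port: int = 8000) -> None:
    """Serve the wrapper on localhost until interrupted."""
    HTTPServer(("127.0.0.1", port), make_handler(generate)).serve_forever()
```

This handles one request at a time and has no queue, which is the main feature gap relative to ComfyUI.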

Style Guide Considerations

Visual Identity Questions

Color Palette: Match HoffDesk dark mode (slate/rose/indigo)?
Typography: Generate images with readable text, or overlay in CSS?
Consistency: Same seed/style for series posts?
Art Direction: Photorealistic, abstract, or illustration?
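
If series posts do share a seed, it could be derived from the series name rather than stored by hand. A small sketch, where `series_seed` and the slug format are hypothetical:

```python
import hashlib


def series_seed(series_slug: str) -> int:
    """Derive a stable 32-bit seed from a series identifier."""
    digest = hashlib.sha256(series_slug.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")
```

A shared seed only fixes the starting noise; visual consistency across different prompts would still depend on a shared style prefix.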

Proposed Direction

Content Type	Model	Style
Technical deep-dives	FLUX	Photorealistic hardware, clean compositions
Tutorials	SDXL Turbo	Clean, bright, diagrammatic
Personal stories	FLUX	Atmospheric, mood-matched to narrative
Quick tips	SDXL Turbo	Simple, iconic, fast generation
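
The table above could be encoded as a lookup so the pipeline picks model and style automatically. A sketch with hypothetical content-type keys and helper names:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageProfile:
    model: str
    style: str


# Keys are illustrative; they should match whatever the content pipeline calls each type.
PROFILES = {
    "technical-deep-dive": ImageProfile("FLUX", "photorealistic hardware, clean composition"),
    "tutorial": ImageProfile("SDXL Turbo", "clean, bright, diagrammatic"),
    "personal-story": ImageProfile("FLUX", "atmospheric, mood-matched to narrative"),
    "quick-tip": ImageProfile("SDXL Turbo", "simple, iconic"),
}


def build_prompt(content_type: str, subject: str) -> tuple[str, str]:
    """Return (model name, full prompt) for a post of the given type."""
    profile = PROFILES[content_type]
    return profile.model, f"{subject}, {profile.style}"
```

For example, `build_prompt("tutorial", "docker compose setup")` selects SDXL Turbo and appends the tutorial style to the subject.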

Storage & Delivery

Options

Storage	Pros	Cons
Local (Gaming PC)	Free, fast, sovereign	Single point of failure
Cloudflare R2	Cheap, CDN-integrated, versioned	External dependency
Git LFS	Versioned with content	Repo bloat, slow clones
S3/MinIO	Industry standard, flexible	Cost, complexity

Recommendation: R2 for production assets, local cache for generation pipeline.
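
R2 exposes an S3-compatible endpoint, so the upload step can use `boto3`. A minimal sketch, assuming `boto3` is installed; the key layout, function names, and credential parameters are illustrative:

```python
import hashlib
import mimetypes
from pathlib import Path


def asset_key(image_bytes: bytes, slug: str, ext: str = "png") -> str:
    """Content-addressed key, so re-uploading the same image is idempotent."""
    digest = hashlib.sha256(image_bytes).hexdigest()[:16]
    return f"images/{slug}/{digest}.{ext}"


def upload_to_r2(path: Path, slug: str, bucket: str,
                 account_id: str, access_key: str, secret_key: str) -> str:
    """Upload a locally cached image to R2 and return its object key."""
    import boto3  # imported lazily so asset_key works without boto3 installed

    client = boto3.client(
        "s3",
        endpoint_url=f"https://{account_id}.r2.cloudflarestorage.com",
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        region_name="auto",
    )
    data = path.read_bytes()
    key = asset_key(data, slug, path.suffix.lstrip(".") or "png")
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    client.put_object(Bucket=bucket, Key=key, Body=data, ContentType=content_type)
    return key
```

The local file stays in place as the generation cache; only the returned key needs to be recorded with the post.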

Workflow Integration

Magic Wand UI Enhancement

## Content Generation Prompt

**Title:** [User input]
**Tone:** [dropdown]
**Include Hero Image:** [checkbox] ← NEW
**Image Style:** [dropdown: Auto / Technical / Atmospheric / Minimal]

[Generate] → LLM writes article + FLUX generates hero image

Batch Generation

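# Pseudocode for pipeline (helper functions are placeholders)
for post in scheduled_posts:
    if post.needs_hero_image:
        prompt = generate_image_prompt(post.content_summary)
        # 630 is not a multiple of 16, which FLUX-style backends generally expect,
        # so render at 1200x640 and crop down to the 1200x630 target
        image = flux_generate(prompt, size="1200x640")
        image = crop_center(image, size="1200x630")
        post.attach_hero(image)
        upload_to_r2(image)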
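
The 1200x630 hero target is not a size diffusion backends handle natively, since they commonly expect dimensions in multiples of 8 or 16. A sketch of the size arithmetic, assuming a multiple of 16; both helper names are hypothetical, and the crop box uses the (left, upper, right, lower) order that Pillow's `Image.crop` takes:

```python
def generation_size(target_w: int, target_h: int, multiple: int = 16) -> tuple[int, int]:
    """Round each dimension up to the nearest multiple the backend accepts."""
    def round_up(value: int) -> int:
        return -(-value // multiple) * multiple
    return round_up(target_w), round_up(target_h)


def center_crop_box(src_w: int, src_h: int, target_w: int, target_h: int) -> tuple[int, int, int, int]:
    """Return the (left, upper, right, lower) box that crops the center region."""
    left = (src_w - target_w) // 2
    top = (src_h - target_h) // 2
    return left, top, left + target_w, top + target_h
```

For the hero size this means rendering at 1200x640 and trimming five pixels from the top and bottom.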

Next Steps

Priority	Task	Owner
P1	Install ComfyUI on Gaming PC with FLUX (fp8 or NF4-quantized, per the installation notes)	Socrates
P2	Test SDXL Turbo for thumbnail speed	Socrates
P3	Define style guide (colors, composition, typography)	Daedalus
P4	Build ComfyUI → Pipeline integration	Socrates
P5	R2 bucket setup + CDN configuration	Socrates
P6	Batch generation test (10 articles)	Wadsworth

Open Questions

Cost modeling: Local electricity vs cloud API costs?
Prompt engineering: Who maintains prompt library for consistency?
Review workflow: Manual approval or fully automated?
Accessibility: Alt text generation (local vision model)?
Fallback: When do we fail over to DALL-E vs retry local?
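
For the cost-modeling question, a first-pass estimate only needs power draw, generation time, and the electricity rate. The figures below are assumptions for illustration (about 350 W GPU draw under load, 60 s per FLUX image, $0.15/kWh) and ignore idle draw, the rest of the system, and hardware amortization:

```python
def local_cost_per_image(watts: float, seconds: float, price_per_kwh: float) -> float:
    """Electricity cost of one image: kW x hours x price per kWh."""
    return watts / 1000 * seconds / 3600 * price_per_kwh


# Assumed inputs, not measurements: 350 W, 60 s per image, $0.15 per kWh.
local = local_cost_per_image(350, 60, 0.15)
cloud = 0.04  # low end of the DALL-E 3 range quoted in this document

print(f"local ~${local:.4f}/image, cloud ~${cloud:.2f}/image")
```

Under these assumptions local electricity comes to well under a cent per image, so the comparison is likely dominated by hardware and maintenance time rather than power.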

Resources

Status: investigating — awaiting Gaming PC setup confirmation
