Image Generation Pipeline Research

Status: Investigating
Date: 2026-04-21
Goal: Local image generation for HoffDesk blog articles


Executive Summary

Exploring options for automated hero images, thumbnails, and article visuals using the Gaming PC's 3080 Ti (12GB VRAM). Target: sovereign-first pipeline that doesn't depend on external APIs for standard content.


Hardware Constraints

| Resource | Available |
| --- | --- |
| GPU | RTX 3080 Ti (12GB VRAM) |
| CPU | Ryzen 9 5900X |
| RAM | 64GB DDR4 |
| Storage | 2TB NVMe (local) |

Target: All inference runs locally. Cloud only for overflow/fallback.


Model Evaluation

Tier 1: Production Candidates

FLUX.1 [dev]

| Attribute | Assessment |
| --- | --- |
| Quality | Excellent: photorealistic, coherent typography |
| Speed | ~30-60s per 1024×1024 image |
| VRAM | ~10GB at 1024×1024 |
| License | Open weights, non-commercial license (commercial use needs review) |
| Use Case | Hero images, featured posts, high-visibility content |
| Pros | Best-in-class quality, text rendering, composition |
| Cons | Slower, memory-intensive, occasional NSFW false positives |

Installation:

# ComfyUI + FLUX workflow
# Requires fp8 or quantized variant for 12GB
# Recommended: flux1-dev-bnb-nf4 (4-bit quantized)
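
A rough calculation shows why a quantized variant is needed on a 12GB card. The sketch below assumes about 12 billion parameters for the FLUX.1 [dev] transformer and counts weight storage only. Text encoders, the VAE, and activations need extra memory, so the results are lower bounds. The function name and the precision table are illustrative.

```python
# Rough weight-only VRAM estimate; ignores text encoders, VAE, and activations.
# nf4 is counted as 4 bits per weight and ignores quantization-constant overhead.
BITS_PER_WEIGHT = {"bf16": 16, "fp8": 8, "nf4": 4}

FLUX_PARAMS = 12e9  # assumption: ~12B parameters in the FLUX.1 [dev] transformer


def weight_gib(params: float, precision: str) -> float:
    """Return approximate weight storage in GiB for the given precision."""
    return params * BITS_PER_WEIGHT[precision] / 8 / 2**30


for precision in BITS_PER_WEIGHT:
    print(f"{precision}: {weight_gib(FLUX_PARAMS, precision):.1f} GiB")
```

Under these assumptions the weights alone come to roughly 22 GiB at bf16, 11 GiB at fp8, and under 6 GiB at NF4, which is consistent with recommending the 4-bit variant for this GPU.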

SDXL Turbo

| Attribute | Assessment |
| --- | --- |
| Quality | Good: fast, decent composition |
| Speed | ~2-5s per 512×512 image |
| VRAM | ~6GB at 512×512 |
| License | Stability AI license terms (not a permissive open-source license; commercial use needs review) |
| Use Case | Thumbnails, rapid iteration, batch generation |
| Pros | Fast, low VRAM, good for experimentation |
| Cons | Less detailed, struggles with complex prompts |

Installation:

# Automatic1111 or ComfyUI
# Model: sdxl-turbo-1.0
# Steps: 1-4 (yes, really)

Tier 2: Alternatives

Stable Diffusion 3

  • Status: Wait for optimized local release
  • Pros: Better prompt adherence than SDXL
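  • Cons: Licensing unclear; VRAM needs on the 12GB card unverified (SD3 Medium is a smaller model than FLUX.1 [dev], larger SD3 variants are heavier)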

Stable Diffusion XL Base + Refiner

  • Status: Superseded by Turbo and FLUX for this pipeline
  • Note: Keep as fallback if Turbo fails

Tier 3: Cloud Fallback

DALL-E 3 (OpenAI)

  • Use: Complex compositions where local models fail
  • Cost: ~$0.04-0.08/image
  • Constraint: Only for overflow, not default
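
The "overflow only" constraint can be sketched as a local-first wrapper that reaches the cloud only after local attempts fail. This is a minimal sketch, not a settled policy: `local_generate` and `cloud_generate` are hypothetical callables, and the retry budget of two is a placeholder until the failover question is decided.

```python
from typing import Callable


def generate_with_fallback(
    prompt: str,
    local_generate: Callable[[str], bytes],  # hypothetical: runs the local model
    cloud_generate: Callable[[str], bytes],  # hypothetical: calls a cloud image API
    max_local_attempts: int = 2,             # assumption: retry budget before overflow
) -> tuple[bytes, str]:
    """Try local generation first; use the cloud only after local attempts fail."""
    for _ in range(max_local_attempts):
        try:
            return local_generate(prompt), "local"
        except Exception:  # broad on purpose for this sketch; narrow it in real code
            continue
    return cloud_generate(prompt), "cloud"
```

Returning the source label alongside the image makes it possible to count how often the cloud path is used, which feeds directly into cost tracking.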

Midjourney

  • Use: High-artistry needs
  • Cost: $10-30/month
  • Constraint: Discord workflow not automatable

Pipeline Architecture

Option A: ComfyUI API

Content Pipeline → ComfyUI API (Gaming PC) → Generated Images
                                     ↓
                                     Store in /shared/assets/images/
                                     ↓
                                     Git LFS or R2 for production

Pros:
- Node-based, highly configurable
- Queue system for batch jobs
- REST API for integration
- FLUX + SDXL in one workflow

Cons:
- More complex setup
- Higher memory footprint
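
Queuing a job through the ComfyUI HTTP API can be sketched with the standard library alone. The sketch assumes a ComfyUI server reachable on its default port 8188 and a workflow already exported in API format. The host name `gaming-pc.local` and both function names are hypothetical.

```python
import json
import urllib.request
import uuid

# Hypothetical host; 8188 is ComfyUI's default port.
COMFYUI_URL = "http://gaming-pc.local:8188"


def build_queue_request(workflow: dict, base_url: str = COMFYUI_URL) -> urllib.request.Request:
    """Build the POST /prompt request that queues an API-format workflow."""
    payload = {"prompt": workflow, "client_id": uuid.uuid4().hex}
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_workflow(workflow: dict, base_url: str = COMFYUI_URL) -> str:
    """Send the request and return the prompt_id assigned by the server."""
    request = build_queue_request(workflow, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["prompt_id"]
```

The returned `prompt_id` can then be used to poll `/history/<prompt_id>` until the output files are listed. Splitting request construction from sending keeps the first function testable without a running server.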

Option B: Ollama-style Local API

Content Pipeline → Local diffusion API → Generated Images

Pros:
- Simpler integration (like existing LLM setup)
- Lower overhead

Cons:
- Fewer features than ComfyUI
- May need custom wrapper


Style Guide Considerations

Visual Identity Questions

  1. Color Palette: Match HoffDesk dark mode (slate/rose/indigo)?
  2. Typography: Generate images with readable text, or overlay in CSS?
  3. Consistency: Same seed/style for series posts?
  4. Art Direction: Photorealistic, abstract, or illustration?
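
For the consistency question, one option is to derive the seed from the series identifier so every post in a series starts from the same value without storing it anywhere. A minimal sketch, where `series_seed` and the slug format are hypothetical:

```python
import hashlib


def series_seed(series_slug: str) -> int:
    """Map a series identifier to a stable 32-bit seed."""
    digest = hashlib.sha256(series_slug.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big")


print(series_seed("homelab-rebuild"))  # same value on every run and every machine
```

A fixed seed only reproduces an image when the prompt and settings are identical, so consistent style across different prompts would likely also need a shared prompt template.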

Proposed Direction

| Content Type | Model | Style |
| --- | --- | --- |
| Technical deep-dives | FLUX | Photorealistic hardware, clean compositions |
| Tutorials | SDXL Turbo | Clean, bright, diagrammatic |
| Personal stories | FLUX | Atmospheric, mood-matched to narrative |
| Quick tips | SDXL Turbo | Simple, iconic, fast generation |
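
The proposed direction maps naturally onto a lookup table the pipeline could consult per post. This is a sketch only: the content-type keys are hypothetical identifiers, and falling back to the fast model for unknown types is an assumption rather than a decision.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImagePlan:
    model: str
    style: str


# Keys are hypothetical content-type identifiers; values mirror the proposed direction.
PLANS = {
    "technical_deep_dive": ImagePlan("FLUX", "photorealistic hardware, clean composition"),
    "tutorial": ImagePlan("SDXL Turbo", "clean, bright, diagrammatic"),
    "personal_story": ImagePlan("FLUX", "atmospheric, mood-matched to narrative"),
    "quick_tip": ImagePlan("SDXL Turbo", "simple, iconic"),
}


def plan_for(content_type: str) -> ImagePlan:
    """Look up the plan; unknown types fall back to the fast model (an assumption)."""
    return PLANS.get(content_type, PLANS["quick_tip"])
```

Keeping the mapping in data rather than in branching code means the style guide can change without touching pipeline logic.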

Storage & Delivery

Options

| Storage | Pros | Cons |
| --- | --- | --- |
| Local (Gaming PC) | Free, fast, sovereign | Single point of failure |
| Cloudflare R2 | Cheap, CDN-integrated, versioned | External dependency |
| Git LFS | Versioned with content | Repo bloat, slow clones |
| S3/MinIO | Industry standard, flexible | Cost, complexity |

Recommendation: R2 for production assets, local cache for generation pipeline.
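
If the local cache and the R2 bucket share one naming scheme, an asset can be located in either place from the same key. The sketch below derives a content-addressed key from the image bytes. The `images/<slug>/<hash>` layout and the function name are assumptions, not an existing convention.

```python
import hashlib
from pathlib import PurePosixPath


def asset_key(post_slug: str, image_bytes: bytes, ext: str = "png") -> str:
    """Build an object key of the form 'images/<slug>/<hash12>.<ext>'."""
    digest = hashlib.sha256(image_bytes).hexdigest()[:12]
    return str(PurePosixPath("images") / post_slug / f"{digest}.{ext}")


key = asset_key("my-post", b"example image bytes")
print(key)
```

Because the key depends on the content, regenerating an identical image produces the same key, and a changed image gets a new one, which avoids stale CDN copies.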


Workflow Integration

Magic Wand UI Enhancement

## Content Generation Prompt

**Title:** [User input]
**Tone:** [dropdown]
**Include Hero Image:** [checkbox] ← NEW
**Image Style:** [dropdown: Auto / Technical / Atmospheric / Minimal]

[Generate] → LLM writes article + FLUX generates hero image

Batch Generation

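# Pseudocode for pipeline
for post in scheduled_posts:
    if post.needs_hero_image:
        prompt = generate_image_prompt(post.content_summary)
        # 630 is not a multiple of 16, so render at 1216x640 and resize to 1200x630
        image = flux_generate(prompt, size="1216x640")
        image = resize(image, size="1200x630")
        post.attach_hero(image)
        upload_to_r2(image)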
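
A runnable version of this loop might inject the three pipeline steps as callables so it can be tested without a GPU or a bucket. This is a sketch under stated assumptions: the `Post` record, `run_batch`, and all three callables are hypothetical stand-ins for whatever the real pipeline exposes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class Post:  # hypothetical minimal post record
    slug: str
    content_summary: str
    needs_hero_image: bool = True
    hero_url: Optional[str] = None


def run_batch(
    posts: Iterable[Post],
    make_prompt: Callable[[str], str],    # hypothetical: summary -> image prompt
    generate: Callable[[str], bytes],     # hypothetical: prompt -> image bytes
    upload: Callable[[str, bytes], str],  # hypothetical: (slug, bytes) -> public URL
) -> list[str]:
    """Generate and attach hero images; return the slugs that failed."""
    failed = []
    for post in posts:
        if not post.needs_hero_image:
            continue
        try:
            image = generate(make_prompt(post.content_summary))
            post.hero_url = upload(post.slug, image)
        except Exception:  # keep the batch going; real code should log the error
            failed.append(post.slug)
    return failed
```

Collecting failures instead of raising lets a 10-article test run finish and report which posts need a retry or a manual image.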

Next Steps

| Priority | Task | Owner |
| --- | --- | --- |
| P1 | Install ComfyUI on Gaming PC with FLUX fp8 | Socrates |
| P2 | Test SDXL Turbo for thumbnail speed | Socrates |
| P3 | Define style guide (colors, composition, typography) | Daedalus |
| P4 | Build ComfyUI → Pipeline integration | Socrates |
| P5 | R2 bucket setup + CDN configuration | Socrates |
| P6 | Batch generation test (10 articles) | Wadsworth |

Open Questions

  1. Cost modeling: Local electricity vs cloud API costs?
  2. Prompt engineering: Who maintains prompt library for consistency?
  3. Review workflow: Manual approval or fully automated?
  4. Accessibility: Alt text generation (local vision model)?
  5. Fallback: When do we fail over to DALL-E vs retry local?
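
For question 1, the electricity side can be estimated with simple arithmetic: energy is power times time, converted to kWh and multiplied by the tariff. The inputs below are assumptions for illustration (350 W GPU draw under load, 60 s per FLUX image, 0.15 per kWh) and should be replaced with measured values.

```python
def local_cost_per_image(watts: float, seconds: float, price_per_kwh: float) -> float:
    """Electricity cost of one image: energy in kWh times the tariff."""
    kwh = watts * seconds / 3_600_000  # watt-seconds (joules) to kWh
    return kwh * price_per_kwh


# Assumed inputs, not measurements.
flux_cost = local_cost_per_image(watts=350, seconds=60, price_per_kwh=0.15)
print(f"{flux_cost:.5f} per image")
```

With these inputs the result is about 0.0009 per image, well below the ~$0.04-0.08/image listed for DALL-E 3. The comparison ignores hardware cost, whole-system power draw, idle time, and discarded generations, so it is a floor rather than a full cost model.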


Status: investigating — awaiting Gaming PC setup confirmation