# Image Generation Pipeline Research

**Status:** Investigating  
**Date:** 2026-04-21  
**Goal:** Local image generation for HoffDesk blog articles

---

## Executive Summary

Exploring options for automated hero images, thumbnails, and article visuals using the Gaming PC's 3080 Ti (12GB VRAM). Target: sovereign-first pipeline that doesn't depend on external APIs for standard content.

---

## Hardware Constraints

| Resource | Available |
|----------|-----------|
| GPU | RTX 3080 Ti (12GB VRAM) |
| CPU | Ryzen 9 5900X |
| RAM | 64GB DDR4 |
| Storage | 2TB NVMe (local) |

**Target:** All inference runs locally. Cloud only for overflow/fallback.

---

## Model Evaluation

### Tier 1: Production Candidates

#### FLUX.1 [dev]
| Attribute | Assessment |
|-----------|------------|
| **Quality** | Excellent — photorealistic, coherent typography |
| **Speed** | ~30-60s per 1024×1024 image |
| **VRAM** | ~10GB at 1024×1024 with quantized weights (full bf16 weights alone are ~24GB) |
| **License** | FLUX.1 [dev] Non-Commercial License (non-commercial use OK, commercial use needs review) |
| **Use Case** | Hero images, featured posts, high-visibility content |
| **Pros** | Best-in-class quality, text rendering, composition |
| **Cons** | Slower, memory-intensive, occasional NSFW false positives |

**Installation:**
```bash
# ComfyUI + FLUX workflow
# Requires fp8 or quantized variant for 12GB
# Recommended: flux1-dev-bnb-nf4 (4-bit quantized)
```
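
The 12GB constraint can be sanity-checked with rough arithmetic. The sketch below assumes the published size of roughly 12 billion parameters for the FLUX.1 [dev] transformer and counts weights only; text encoders, the VAE, activations, and quantization metadata all add to this, so the figures are lower bounds rather than measurements.

```python
# Rough weight-memory estimate at different precisions.
# Assumption: ~12e9 parameters; runtime overhead is not modeled.

PARAMS = 12e9
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "nf4": 0.5}
VRAM_GIB = 12  # RTX 3080 Ti


def weight_gib(precision: str, params: float = PARAMS) -> float:
    """Return the approximate size of the model weights in GiB."""
    return params * BYTES_PER_PARAM[precision] / 2**30


for name in BYTES_PER_PARAM:
    size = weight_gib(name)
    verdict = "fits" if size < VRAM_GIB else "does not fit"
    print(f"{name}: {size:.1f} GiB of weights, {verdict} in {VRAM_GIB} GiB")
```

By this estimate fp8 weights nearly fill the card on their own, which is consistent with recommending the 4-bit variant or relying on offloading for fp8.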

#### SDXL Turbo
| Attribute | Assessment |
|-----------|------------|
| **Quality** | Good — fast, decent composition |
| **Speed** | ~2-5s per 512×512 image |
| **VRAM** | ~6GB at 512×512 |
| **License** | Stability AI license (not fully permissive; verify commercial-use terms) |
| **Use Case** | Thumbnails, rapid iteration, batch generation |
| **Pros** | Fast, low VRAM, good for experimentation |
| **Cons** | Less detailed, struggles with complex prompts |

**Installation:**
```bash
# Automatic1111 or ComfyUI
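# Model: stabilityai/sdxl-turbo (sd_xl_turbo_1.0_fp16.safetensors)
# Steps: 1-4 (yes, really); Turbo is distilled for few-step sampling, so run with guidance disabled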
```
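
For quick speed tests outside a UI, SDXL Turbo can also be driven from Hugging Face `diffusers`, which is a different route from the Automatic1111/ComfyUI options listed above. A minimal sketch, assuming `torch` and `diffusers` are installed and a CUDA GPU is present; the helper names `turbo_settings` and `generate_thumbnail` are illustrative.

```python
# Sketch: SDXL Turbo via diffusers. Heavy imports are deferred so the
# settings helper can be used on a machine without a GPU.

def turbo_settings(steps: int = 1) -> dict:
    """Sampler settings for SDXL Turbo: 1-4 steps, guidance disabled."""
    if not 1 <= steps <= 4:
        raise ValueError("SDXL Turbo is intended for 1-4 steps")
    return {
        "num_inference_steps": steps,
        "guidance_scale": 0.0,  # Turbo is run without classifier-free guidance
        "width": 512,
        "height": 512,
    }


def generate_thumbnail(prompt: str, out_path: str, steps: int = 1) -> None:
    """Generate one 512x512 image and save it to out_path (needs CUDA)."""
    import torch
    from diffusers import AutoPipelineForText2Image

    pipe = AutoPipelineForText2Image.from_pretrained(
        "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
    ).to("cuda")
    image = pipe(prompt=prompt, **turbo_settings(steps)).images[0]
    image.save(out_path)
```

Wrapping `generate_thumbnail` in a timer would give a first data point for the ~2-5s estimate in the table.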

### Tier 2: Alternatives

#### Stable Diffusion 3
- **Status:** Wait for optimized local release
- **Pros:** Better prompt adherence than SDXL
- **Cons:** Licensing unclear; VRAM footprint still to be measured locally (SD3 Medium has 2B parameters vs 12B for FLUX.1 [dev])

#### Stable Diffusion XL Base + Refiner
- **Status:** Superseded by Turbo and FLUX
- **Note:** Keep as fallback if Turbo fails

### Tier 3: Cloud Fallback

#### DALL-E 3 (OpenAI)
- **Use:** Complex compositions where local models fail
- **Cost:** ~$0.04-0.08/image
- **Constraint:** Only for overflow, not default

#### Midjourney
- **Use:** High-artistry needs
- **Cost:** $10-30/month
- **Constraint:** No official API; the Discord-based workflow is not practical to automate
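
The "overflow only" rule for cloud models could be enforced in code rather than by convention. The following is one possible policy, not a decided one: retry locally a fixed number of times, then call the cloud backend only if one is configured. The `local` and `cloud` callables and the `max_local_attempts` default are hypothetical.

```python
from typing import Callable, Optional, Tuple

Generator = Callable[[str], bytes]  # prompt in, encoded image bytes out


def generate_with_fallback(
    prompt: str,
    local: Generator,
    cloud: Optional[Generator] = None,
    max_local_attempts: int = 2,
) -> Tuple[bytes, str]:
    """Try the local backend first; use cloud only after every local attempt fails.

    Returns the image bytes and the name of the backend that produced them.
    """
    last_error: Optional[Exception] = None
    for _ in range(max_local_attempts):
        try:
            return local(prompt), "local"
        except Exception as exc:  # e.g. out-of-memory or a rejected workflow
            last_error = exc
    if cloud is None:
        raise RuntimeError("local generation failed and no cloud fallback is set") from last_error
    return cloud(prompt), "cloud"
```

Returning the backend name alongside the image makes it easy to log how often overflow actually happens.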

---

## Pipeline Architecture

### Option A: ComfyUI API Server (Recommended)

```
Content Pipeline → ComfyUI API (Gaming PC) → Generated Images
                       ↓
              Store in /shared/assets/images/
                       ↓
              Git LFS or R2 for production
```

**Pros:**
- Node-based, highly configurable
- Queue system for batch jobs
- HTTP API for integration (jobs are queued via `POST /prompt`)
- FLUX + SDXL in one workflow

**Cons:**
- More complex setup
- Higher memory footprint
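
ComfyUI's built-in server queues a job when a workflow graph is sent as JSON to `/prompt`, and it replies with a `prompt_id` that can later be looked up under `/history/<prompt_id>`. A minimal sketch using only the standard library, assuming the server listens on the default port 8188 and that `workflow` is a graph exported in ComfyUI's API format; `COMFY_URL` and the function names are illustrative.

```python
import json
import urllib.request
import uuid

COMFY_URL = "http://127.0.0.1:8188"  # assumption: default ComfyUI address


def build_prompt_request(
    workflow: dict, client_id: str, base_url: str = COMFY_URL
) -> urllib.request.Request:
    """Build the POST /prompt request that queues one workflow."""
    payload = json.dumps({"prompt": workflow, "client_id": client_id}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/prompt",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def queue_prompt(workflow: dict, base_url: str = COMFY_URL) -> str:
    """Queue a workflow on a running ComfyUI server and return its prompt_id."""
    request = build_prompt_request(workflow, uuid.uuid4().hex, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["prompt_id"]
```

Keeping request construction separate from the network call lets the content pipeline unit-test its payloads without the Gaming PC being online.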

### Option B: Ollama-style Local API

```
Content Pipeline → Local diffusion API → Generated Images
```

**Pros:**
- Simpler integration (like existing LLM setup)
- Lower overhead

**Cons:**
- Fewer features than ComfyUI
- May need custom wrapper

---

## Style Guide Considerations

### Visual Identity Questions

1. **Color Palette:** Match HoffDesk dark mode (slate/rose/indigo)?
2. **Typography:** Generate images with readable text, or overlay in CSS?
3. **Consistency:** Same seed/style for series posts?
4. **Art Direction:** Photorealistic, abstract, or illustration?

### Proposed Direction

| Content Type | Model | Style |
|--------------|-------|-------|
| Technical deep-dives | FLUX | Photorealistic hardware, clean compositions |
| Tutorials | SDXL Turbo | Clean, bright, diagrammatic |
| Personal stories | FLUX | Atmospheric, mood-matched to narrative |
| Quick tips | SDXL Turbo | Simple, iconic, fast generation |
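
The table above translates directly into a lookup the pipeline can use when choosing a model. A small sketch; the content-type keys and the `ImageRoute` name are hypothetical, and the style strings are copied from the table as starting points for prompt suffixes.

```python
from typing import NamedTuple


class ImageRoute(NamedTuple):
    model: str
    style: str


# Keys are illustrative slugs for the content types in the table.
ROUTES = {
    "technical-deep-dive": ImageRoute("flux", "photorealistic hardware, clean compositions"),
    "tutorial": ImageRoute("sdxl-turbo", "clean, bright, diagrammatic"),
    "personal-story": ImageRoute("flux", "atmospheric, mood-matched to narrative"),
    "quick-tip": ImageRoute("sdxl-turbo", "simple, iconic"),
}


def route_for(content_type: str) -> ImageRoute:
    """Return the model and style for a content type, failing loudly if unknown."""
    try:
        return ROUTES[content_type]
    except KeyError:
        raise ValueError(f"no image route defined for {content_type!r}") from None


print(route_for("tutorial").model)
```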

---

## Storage & Delivery

### Options

| Storage | Pros | Cons |
|---------|------|------|
| Local (Gaming PC) | Free, fast, sovereign | Single point of failure |
| Cloudflare R2 | Cheap, CDN-integrated, versioned | External dependency |
| Git LFS | Versioned with content | Repo bloat, slow clones |
| S3/MinIO | Industry standard, flexible | Cost, complexity |

**Recommendation:** R2 for production assets, local cache for generation pipeline.
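
For the local cache to be useful, identical generation requests need to map to the same file. One way to sketch this is a content-addressed key derived from the generation parameters, which can double as the object key in R2. The key layout, the `hero` prefix, and the function names below are assumptions, not an agreed scheme.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def asset_key(prompt: str, model: str, seed: int, width: int, height: int,
              prefix: str = "hero") -> str:
    """Derive a stable object key from everything that determines the image."""
    spec = json.dumps(
        {"prompt": prompt, "model": model, "seed": seed, "size": [width, height]},
        sort_keys=True,
    )
    digest = hashlib.sha256(spec.encode("utf-8")).hexdigest()[:16]
    return f"{prefix}/{digest}.png"


def cached_file(cache_dir: str, key: str) -> Optional[Path]:
    """Return the cached file for key if it exists locally, else None."""
    path = Path(cache_dir) / key
    return path if path.is_file() else None
```

With this scheme a re-run of the batch job can skip generation whenever `cached_file` returns a path, and only new keys need uploading.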

---

## Workflow Integration

### Magic Wand UI Enhancement

```markdown
## Content Generation Prompt

**Title:** [User input]
**Tone:** [dropdown]
**Include Hero Image:** [checkbox] ← NEW
**Image Style:** [dropdown: Auto / Technical / Atmospheric / Minimal]

[Generate] → LLM writes article + FLUX generates hero image
```

### Batch Generation

```python
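# Batch step for the pipeline. The three callables are placeholders for
# components that do not exist yet and must be supplied by the caller.
def attach_hero_images(scheduled_posts, generate_image_prompt, flux_generate, upload_to_r2):
    """Generate, attach, and upload a hero image for each post that needs one."""
    for post in scheduled_posts:
        if not post.needs_hero_image:
            continue
        prompt = generate_image_prompt(post.content_summary)
        image = flux_generate(prompt, size="1200x630")  # social-preview dimensions
        post.attach_hero(image)
        upload_to_r2(image)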
```
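
One detail worth checking before wiring this up: latent diffusion models usually expect width and height to be multiples of a fixed step, and 630 is not divisible by 8 or 16. A hedged sketch of one workaround is to generate at the next valid size and center-crop to 1200×630. The step of 16 is an assumption to verify against the chosen FLUX workflow, and both function names are illustrative.

```python
import math
from typing import Tuple


def generation_size(target_w: int, target_h: int, multiple: int = 16) -> Tuple[int, int]:
    """Round each dimension up to the nearest multiple the model accepts."""
    return (
        math.ceil(target_w / multiple) * multiple,
        math.ceil(target_h / multiple) * multiple,
    )


def center_crop_box(gen_w: int, gen_h: int, target_w: int, target_h: int) -> Tuple[int, int, int, int]:
    """Return a (left, upper, right, lower) box that crops to the target size."""
    left = (gen_w - target_w) // 2
    upper = (gen_h - target_h) // 2
    return (left, upper, left + target_w, upper + target_h)


gen_w, gen_h = generation_size(1200, 630)
print(gen_w, gen_h)                              # 1200 640
print(center_crop_box(gen_w, gen_h, 1200, 630))  # (0, 5, 1200, 635)
```

The box uses the same (left, upper, right, lower) ordering that Pillow's `Image.crop` takes, so it can be passed through unchanged if Pillow handles post-processing.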

---

## Next Steps

| Priority | Task | Owner |
|----------|------|-------|
| P1 | Install ComfyUI on Gaming PC with quantized FLUX (fp8 or NF4) | Socrates |
| P2 | Test SDXL Turbo for thumbnail speed | Socrates |
| P3 | Define style guide (colors, composition, typography) | Daedalus |
| P4 | Build ComfyUI → Pipeline integration | Socrates |
| P5 | R2 bucket setup + CDN configuration | Socrates |
| P6 | Batch generation test (10 articles) | Wadsworth |

---

## Open Questions

1. **Cost modeling:** Local electricity vs cloud API costs?
2. **Prompt engineering:** Who maintains prompt library for consistency?
3. **Review workflow:** Manual approval or fully automated?
4. **Accessibility:** Alt text generation (local vision model)?
5. **Fallback:** When do we fail over to DALL-E vs retry local?
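
For question 1, a back-of-envelope comparison is possible with numbers already in this document. The sketch below assumes the GPU draws its rated 350W for the whole generation, uses a placeholder tariff of $0.15/kWh, and takes the 60s upper bound for FLUX and the $0.04 lower bound for DALL-E 3 from the sections above. It ignores idle power, the rest of the system, and hardware amortization.

```python
GPU_WATTS = 350             # assumed draw under load (RTX 3080 Ti rated board power)
USD_PER_KWH = 0.15          # placeholder tariff; replace with the real rate
CLOUD_USD_PER_IMAGE = 0.04  # lower DALL-E 3 figure from the Tier 3 section


def local_cost_per_image(seconds: float, watts: float = GPU_WATTS,
                         usd_per_kwh: float = USD_PER_KWH) -> float:
    """Electricity cost in USD for one image that takes `seconds` to generate."""
    kwh = watts * seconds / 3_600_000  # watt-seconds to kilowatt-hours
    return kwh * usd_per_kwh


flux_cost = local_cost_per_image(60)
print(f"FLUX at 60s/image: ${flux_cost:.6f} per image")
print(f"Cloud costs roughly {CLOUD_USD_PER_IMAGE / flux_cost:.0f}x the local electricity")
```

Under these assumptions electricity is a small fraction of the cloud price, so the real cost question is more likely about hardware wear and operator time than about power.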

---

## Resources

- [FLUX.1 Documentation](https://huggingface.co/black-forest-labs)
- [ComfyUI Examples](https://comfyanonymous.github.io/ComfyUI_examples/)
- [SDXL Turbo Paper](https://huggingface.co/papers/2311.17042)
- [HoffDesk Design Tokens](/shared/design-tokens/sprint-1.md)

---

**Status:** `investigating` — awaiting Gaming PC setup confirmation