Session: 2026-05-07 22:57:43 UTC
- Session Key: agent:daedalus:telegram:direct:8386527252
- Session ID: 1f3eaaec-a447-4aaf-9bf7-d9560bbe3ef8
- Source: telegram
Conversation Summary
assistant: Now validate and restart:
assistant: Restart didn't fully come up. Let me check why:
user: Hi- your thoughts?
assistant: Let me check the gateway status and config:
assistant: Gateway is up and running (active since 22:26). Let me verify the config was loaded:
assistant: Good - gateway restarted and config is live. Here's what we ended up with:
1. maxActiveTranscriptBytes: 50MB - massively inflated from 5MB. The byte-triggered compaction is essentially a safety net now; it only fires if a transcript file somehow balloons to 50MB (which would require ~250K tokens). Token-based compaction still fires normally at the 98K mark (128K minus the 30K reserve).
2. Reserve / budget bumped:
- reserveTokens: 30,000 (was 20K) - compaction triggers at 98K tokens instead of 108K
- reserveTokensFloor: 40,000 (was 20K) - the floor ensures it doesn't compact too late
- keepRecentTokens: 15,000 (was 10K) - more recent context survives compaction
- recentTurnsPreserve: 8 (was 6) - two more verbatim turns kept
3. midTurnPrecheck: false - disabled, so no mid-tool-loop compaction interruptions.
4. memoryFlush.model: ollama/gemma4:8b - explicitly set to a fast local 8B model. No heavy cloud model doing janitorial summarization anymore.
5. memoryFlush.softThresholdTokens: 10,000 - raised from 5K so the flush doesn't fire too early.
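As a sanity check on the numbers above, the trigger arithmetic can be sketched in a few lines of Python. The 128K context window and the ~200 bytes-per-token transcript overhead are assumptions inferred from the figures quoted in this session, not values read from the config:

```python
# Token-based compaction: fires when usage crosses context window minus reserve.
CONTEXT_WINDOW = 128_000      # assumed model context window
reserve_tokens = 30_000       # reserveTokens from the new config
token_trigger = CONTEXT_WINDOW - reserve_tokens
print(token_trigger)          # 98000 -- the "98K mark"

# Byte-based compaction: now a 50MB safety net.
max_transcript_bytes = 50_000_000    # maxActiveTranscriptBytes
BYTES_PER_TOKEN = 200                # rough transcript overhead per token (assumption)
tokens_at_byte_trigger = max_transcript_bytes // BYTES_PER_TOKEN
print(tokens_at_byte_trigger)        # 250000 -- the "~250K tokens"
```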
Thoughts: The pattern I'm seeing across all our compaction debates is that we kept layering new features (byte triggers, mid-turn checks, flush models) on top of a system that was working fine with just reserveTokens. The 2026.4.15 config had none of these - just mode: "safeguard" with sensible reserve tokens. The new features (especially truncateAfterCompaction + maxActiveTranscriptBytes) create two independent compaction triggers that can fire at different times, which is where the "compacting more than expected" feeling comes from.
This config should feel different because:
- Only one real pressure valve (token threshold at 98K)
- Byte trigger moved to a 50MB moon-shot
- No mid-turn interruptions
- Flush model is a lightweight local
Let it run for a day and see if the "marginally better" feeling improves to "actually better."
user: Quick question - the local flush LLM: is that running on the Beelink? We basically can't run models on the Beelink. Route to the gaming PC models?
assistant: Great catch. Let me check what Ollama endpoints are available and route the flush model to the gaming PC:
assistant: Let me see the full model config to find a small fast model on the gaming PC:
assistant: Perfect - the ollama-remote provider already routes to the gaming PC (http://matt-pc.tail864e81.ts.net:11434). The models available there are: gemma4:latest, qwen2.5-coder:7b, llama3.2:3b, phi4:14b.
The fastest option for a flush task would be ollama-remote/llama3.2:3b (lightest, quickest to respond), but ollama-remote/phi4:14b or ollama-remote/qwen2.5-coder:7b would give better-quality summarization while still offloading the work from the Beelink.
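For reference, Ollama exposes its installed models over HTTP at GET /api/tags, which is how a remote host like the gaming PC can be inspected. A small parsing sketch; the response body here is mocked from the models listed above, not live output from matt-pc:

```python
import json

# GET http://matt-pc.tail864e81.ts.net:11434/api/tags returns JSON shaped like this
# (mocked sample; Ollama includes more fields per model, trimmed here for brevity):
sample_tags_response = json.dumps({
    "models": [
        {"name": "gemma4:latest"},
        {"name": "qwen2.5-coder:7b"},
        {"name": "llama3.2:3b"},
        {"name": "phi4:14b"},
    ]
})

def available_models(tags_body: str) -> list[str]:
    """Extract model names from an Ollama /api/tags response body."""
    return [m["name"] for m in json.loads(tags_body)["models"]]

print(available_models(sample_tags_response))
```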
Let me update the flush model:
user: The telegram responses seem to hang
assistant: Let me check the gateway state:
assistant: Gateway is running. The restart went through. Let me check if there are queued messages stalled:
assistant: It's processing - I can see the daedalus session is active (q=1). The 2026.5.3-1 version note in the systemd unit is curious: we're on 2026.5.4, but the service file still references the old version. Likely just a stale service-file description, not an actual version mismatch.
The gateway is up and responding. Your previous message came through. The "hanging" you saw was likely the gateway restarting (a full systemd restart takes ~30s). It's back up now - I'm responding.
assistant: Just the service description label - it's running the installed version (2026.5.4); the label is stale.
On the flush model: I switched it from ollama/gemma4:8b (which would hit the Beelink) to ollama-remote/phi4:14b, which routes to the gaming PC at matt-pc.tail864e81.ts.net:11434. phi4:14b is fast enough for summarization without bogging down the Beelink.
The new config has been live since 22:38 UTC. Your message came through and I'm replying, so Telegram isn't hanging anymore - that was likely the restart window.
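A model reference like ollama-remote/phi4:14b splits into a provider prefix and a model name. A hypothetical sketch of how such a reference resolves to an endpoint - the mapping mirrors what this session describes, but the resolver itself (and the Beelink's localhost URL) is illustrative:

```python
# Provider -> base URL mapping, as described in this session (illustrative resolver).
PROVIDER_ENDPOINTS = {
    "ollama": "http://localhost:11434",                         # the Beelink itself (assumed)
    "ollama-remote": "http://matt-pc.tail864e81.ts.net:11434",  # the gaming PC
}

def resolve(model_ref: str) -> tuple[str, str]:
    """Split 'provider/model' and look up the provider's endpoint."""
    provider, model = model_ref.split("/", 1)
    return PROVIDER_ENDPOINTS[provider], model

print(resolve("ollama-remote/phi4:14b"))
# ('http://matt-pc.tail864e81.ts.net:11434', 'phi4:14b')
```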