brain-intelligence-spec.md

Brain Intelligence Build — Phase 4 Spec

Status: Ready for Socrates
Created: 2026-04-30 16:51 UTC
Dispatcher: Wadsworth


Objective

Wire existing RAG components into a proper /brain/query endpoint with source attribution and context assembly.

Why Now

The Family Brain (family_brain.py) has full RAG (query, answer, ingest) but lacks:
1. HTTP API endpoint (/brain/query?q=...)
2. Source attribution in "Found in: {subject}, {date}" format
3. Proper context assembly for synthesis

Matt wants to ask questions like "what do the kids need for the field trip?" and get answers with clear provenance.

Existing Components

Core Files

  • services/icarus/core/family_brain.py — Full RAG implementation
      • query() — semantic search via ChromaDB
      • answer() — retrieve + synthesize with local LLM
      • ingest_email() / ingest_newsletter() — document ingestion
      • _build_synthesis_prompt() — prompt template
      • _build_hybrid_prompt() — calendar + brain hybrid
  • services/icarus/core/embeddings.py — Ollama client
      • get_embedding() / get_embedding_async() — nomic-embed-text
  • services/icarus/core/memory/engine.py — Learned routing patterns
      • learn_assignment() / lookup_assignment() — family member routing
  • services/icarus/core/api.py — FastAPI endpoints (lines ~1-350)
      • Has /vision/*, /system/state, /event_graph/*, dashboard
      • Missing: /brain/* endpoints

Config

  • services/icarus/core/config/staging.py
      • CHROMA_DB_PATH — isolated staging data
      • OLLAMA_BASE_URL — Gaming PC via Tailscale
      • OLLAMA_EMBED_MODEL — nomic-embed-text
      • LLM_MODEL / LLM_URL — Beelink localhost
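As a rough sketch of what the staging config might contain — the constant names come from the list above, but every value here is an illustrative placeholder (the Tailscale hostname and ports are assumptions, not the real staging values):

```python
# Hypothetical sketch of services/icarus/core/config/staging.py.
# Constant names are from the spec; values are illustrative placeholders only.
from pathlib import Path

CHROMA_DB_PATH = str(Path.home() / ".icarus" / "staging" / "chroma")  # isolated staging data
OLLAMA_BASE_URL = "http://gaming-pc.tailnet:11434"  # placeholder Tailscale hostname
OLLAMA_EMBED_MODEL = "nomic-embed-text"
LLM_MODEL = "qwen2.5-coder:7b"   # Beelink localhost synthesis model
LLM_URL = "http://localhost:11434"
```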

Sovereign Constraints

  Constraint                   Implementation
  Local embeddings             nomic-embed-text via Gaming PC Tailscale
  LLM synthesis                qwen2.5-coder:7b on Beelink localhost
  No cloud default             Overflow/fallback only
  ENV isolation                staging config paths, not production
  Zero costco_route imports    Clean-room build

Deliverables

1. /brain/query Endpoint

Method: GET /brain/query?q={question}&top_k=3&source=email|newsletter|all

Response:

{
  "answer": "Sullivan needs a signed permission slip and $5 for the field trip to the zoo on Friday.",
  "sources": [
    {
      "attribution": "Found in: Field Trip Permission Slip, 2025-09-15",
      "subject": "Field Trip Permission Slip",
      "date": "2025-09-15",
      "score": 0.92,
      "source_type": "email"
    }
  ],
  "confidence": "high",
  "query_time_ms": 1450
}

Requirements:
- answer — conversational, not robotic (uses existing _build_synthesis_prompt())
- sources.attribution — exactly "Found in: {subject}, {date}"
- confidence — high (>0.7), medium (0.3-0.7), low (<0.3)
- query_time_ms — response timing
- Handle empty ChromaDB gracefully: "I don't see that in the emails I've processed."
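The attribution string and confidence buckets above can be pinned down with two small helpers — a minimal sketch of the rules as stated in this spec (function names are illustrative, not existing code):

```python
def attribution(subject: str, date: str) -> str:
    # Exact format required by the spec: "Found in: {subject}, {date}"
    return f"Found in: {subject}, {date}"

def confidence_label(top_score: float) -> str:
    # Thresholds from the spec; a score of exactly 0.7 falls in "medium"
    # per the stated ranges (high is strictly > 0.7).
    if top_score > 0.7:
        return "high"
    if top_score >= 0.3:
        return "medium"
    return "low"
```

Keeping these as pure functions makes the "exact format" success criterion trivially unit-testable.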

2. /brain/stats Endpoint

Method: GET /brain/stats

Response:

{
  "total_documents": 47,
  "collection": "family_knowledge",
  "db_path": "/home/hoffmann_admin/.icarus/staging/chroma",
  "last_ingest": "2025-09-15T14:32:00-05:00"
}

3. Context Assembly Enhancement

The existing answer() function needs refinement:
- Deduplicate sources before synthesis (same email, multiple chunks)
- Sort by score descending
- Limit context to top 3 chunks to avoid prompt bloat
- Include source metadata in synthesis prompt for attribution
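The dedupe/sort/limit steps above might look like the following sketch. It assumes each retrieved chunk is a dict carrying at least "doc_id" and "score" keys — the real metadata shape in family_brain.py may differ:

```python
def assemble_context(chunks: list[dict], top_n: int = 3) -> list[dict]:
    """Dedupe retrieved chunks by source document, rank by score, cap at top_n.

    Assumed chunk shape: {"doc_id": ..., "score": ..., ...metadata}.
    """
    best: dict[str, dict] = {}
    for chunk in chunks:
        doc = chunk["doc_id"]
        # Keep only the highest-scoring chunk per source email/newsletter
        if doc not in best or chunk["score"] > best[doc]["score"]:
            best[doc] = chunk
    # Highest score first, then trim to avoid prompt bloat
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)[:top_n]
```

The returned chunks (with their subject/date metadata) can then be interpolated into the synthesis prompt so the LLM has what it needs for attribution.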

Implementation Notes

Where to Add Endpoints

In services/icarus/core/api.py, after existing /api/event_graph/* endpoints (around line 290):

# =============================================================================
# Brain Intelligence Endpoints
# =============================================================================

@app.get("/brain/query")
async def brain_query(q: str, top_k: int = 3, source: str = "all"):
    """Query the Family Brain with natural language."""
    from icarus.core.family_brain import answer, query
    # ... implementation

@app.get("/brain/stats")
async def brain_stats():
    """Get Family Brain statistics."""
    from icarus.core.family_brain import stats
    # ... implementation

Error Handling

  • ChromaDB not initialized → return empty state, not 500
  • Ollama unreachable → "Brain service temporarily unavailable"
  • Invalid source filter → ignore filter, search all
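One way to make those error cases concrete is a thin wrapper around the query call, sketched below. `run_query` is a stand-in for the real family_brain call, and the exact exception types raised by ChromaDB/httpx are an assumption — adjust to whatever family_brain.py actually raises:

```python
def safe_brain_query(run_query, q: str) -> dict:
    """Wrap the brain query so infrastructure failures degrade gracefully.

    run_query is a hypothetical callable standing in for the real
    family_brain query/answer path; exception types are assumptions.
    """
    try:
        hits = run_query(q)
    except ConnectionError:
        # Ollama / ChromaDB unreachable -> friendly message, not a 500
        return {"answer": "Brain service temporarily unavailable",
                "sources": [], "confidence": "low"}
    if not hits:
        # Empty ChromaDB -> the spec's empty-state message
        return {"answer": "I don't see that in the emails I've processed.",
                "sources": [], "confidence": "low"}
    return {"answer": None, "sources": hits, "confidence": "high"}
```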

Do NOT

  • Import from costco_route
  • Use cloud embeddings by default
  • Break existing vision pipeline
  • Add new dependencies (use existing chromadb, httpx)

Do

  • Read family_brain.py thoroughly first
  • Use existing patterns from /api/* endpoints
  • Test with empty ChromaDB
  • Maintain FastAPI patterns

Success Criteria

  1. [ ] GET /brain/query?q=test returns valid JSON
  2. [ ] Source attribution uses exact "Found in: {subject}, {date}" format
  3. [ ] Confidence scoring accurate
  4. [ ] /brain/stats returns collection metadata
  5. [ ] No regression in existing endpoints

Handoff: Dispatch to Socrates 🧠 when ready
ETA: Request Socrates estimate after review