# Brain Intelligence Build — Phase 4 Spec

**Status:** Ready for Socrates
**Created:** 2026-04-30 16:51 UTC
**Dispatcher:** Wadsworth

---

## Objective

Wire the existing RAG components into a proper `/brain/query` endpoint with source attribution and context assembly.

## Why Now

The Family Brain (`family_brain.py`) has full RAG (query, answer, ingest) but lacks:

1. An HTTP API endpoint (`/brain/query?q=...`)
2. Source attribution in the "Found in: {subject}, {date}" format
3. Proper context assembly for synthesis

Matt wants to ask questions like "what do the kids need for the field trip?" and get answers with clear provenance.

## Existing Components

### Core Files

- `services/icarus/core/family_brain.py` — Full RAG implementation
  - `query()` — semantic search via ChromaDB
  - `answer()` — retrieve + synthesize with local LLM
  - `ingest_email()` / `ingest_newsletter()` — document ingestion
  - `_build_synthesis_prompt()` — prompt template
  - `_build_hybrid_prompt()` — calendar + brain hybrid
- `services/icarus/core/embeddings.py` — Ollama client
  - `get_embedding()` / `get_embedding_async()` — nomic-embed-text
- `services/icarus/core/memory/engine.py` — Learned routing patterns
  - `learn_assignment()` / `lookup_assignment()` — family member routing
- `services/icarus/core/api.py` — FastAPI endpoints (lines ~1-350)
  - Has `/vision/*`, `/system/state`, `/event_graph/*`, dashboard
  - **Missing:** `/brain/*` endpoints

### Config

- `services/icarus/core/config/staging.py`
  - `CHROMA_DB_PATH` — isolated staging data
  - `OLLAMA_BASE_URL` — Gaming PC via Tailscale
  - `OLLAMA_EMBED_MODEL` — nomic-embed-text
  - `LLM_MODEL` / `LLM_URL` — Beelink localhost

## Sovereign Constraints

| Constraint | Implementation |
|------------|----------------|
| Local embeddings | nomic-embed-text via Gaming PC Tailscale |
| LLM synthesis | qwen2.5-coder:7b on Beelink localhost |
| No cloud default | Overflow/fallback only |
| ENV isolation | staging config paths, not production |
| Zero costco_route imports | Clean-room build |
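A minimal sketch of the local-embedding call implied by the constraints above. The hostname, helper names, and signatures here are assumptions for illustration — the real values live in `staging.py` and `embeddings.py`; only the request shape follows Ollama's public `/api/embeddings` API.

```python
# Hedged sketch of the local-embedding path (see embeddings.py for the real
# implementation). The host below is a placeholder; staging.py holds the
# actual OLLAMA_BASE_URL for the Gaming PC over Tailscale.
OLLAMA_BASE_URL = "http://gaming-pc.tailnet:11434"  # assumption: Tailscale host
OLLAMA_EMBED_MODEL = "nomic-embed-text"


def build_embed_request(text: str) -> tuple[str, dict]:
    """Build the Ollama embeddings request (public Ollama API shape)."""
    url = f"{OLLAMA_BASE_URL}/api/embeddings"
    payload = {"model": OLLAMA_EMBED_MODEL, "prompt": text}
    return url, payload


def get_embedding(text: str, timeout: float = 30.0) -> list[float]:
    """POST to the local Ollama server and return the embedding vector."""
    import httpx  # project already depends on httpx; no new dependency

    url, payload = build_embed_request(text)
    resp = httpx.post(url, json=payload, timeout=timeout)
    resp.raise_for_status()
    return resp.json()["embedding"]
```

Keeping the request builder separate from the network call makes the payload shape testable without a live Ollama server.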
## Deliverables

### 1. `/brain/query` Endpoint

**Method:** `GET /brain/query?q={question}&top_k=3&source=email|newsletter|all`

**Response:**

```json
{
  "answer": "Sullivan needs a signed permission slip and $5 for the field trip to the zoo on Friday.",
  "sources": [
    {
      "attribution": "Found in: Field Trip Permission Slip, 2025-09-15",
      "subject": "Field Trip Permission Slip",
      "date": "2025-09-15",
      "score": 0.92,
      "source_type": "email"
    }
  ],
  "confidence": "high",
  "query_time_ms": 1450
}
```

**Requirements:**

- `answer` — conversational, not robotic (uses existing `_build_synthesis_prompt()`)
- `sources.attribution` — exactly "Found in: {subject}, {date}"
- `confidence` — high (>0.7), medium (0.3-0.7), low (<0.3)
- `query_time_ms` — response timing
- Handle empty ChromaDB gracefully: "I don't see that in the emails I've processed."

### 2. `/brain/stats` Endpoint

**Method:** `GET /brain/stats`

**Response:**

```json
{
  "total_documents": 47,
  "collection": "family_knowledge",
  "db_path": "/home/hoffmann_admin/.icarus/staging/chroma",
  "last_ingest": "2025-09-15T14:32:00-05:00"
}
```

### 3. Context Assembly Enhancement

The existing `answer()` function needs refinement:

- Deduplicate sources before synthesis (same email, multiple chunks)
- Sort by score descending
- Limit context to top 3 chunks to avoid prompt bloat
- Include source metadata in synthesis prompt for attribution

## Implementation Notes

### Where to Add Endpoints

In `services/icarus/core/api.py`, after existing `/api/event_graph/*` endpoints (around line 290):

```python
# =============================================================================
# Brain Intelligence Endpoints
# =============================================================================

@app.get("/brain/query")
async def brain_query(q: str, top_k: int = 3, source: str = "all"):
    """Query the Family Brain with natural language."""
    from icarus.core.family_brain import answer, query
    # ... implementation


@app.get("/brain/stats")
async def brain_stats():
    """Get Family Brain statistics."""
    from icarus.core.family_brain import stats
    # ... implementation
```

### Error Handling

- ChromaDB not initialized → return empty state, not 500
- Ollama unreachable → "Brain service temporarily unavailable"
- Invalid source filter → ignore filter, search all

## Do NOT

- Import from costco_route
- Use cloud embeddings by default
- Break existing vision pipeline
- Add new dependencies (use existing chromadb, httpx)

## Do

- Read `family_brain.py` thoroughly first
- Use existing patterns from `/api/*` endpoints
- Test with empty ChromaDB
- Maintain FastAPI patterns

## Success Criteria

1. [ ] `GET /brain/query?q=test` returns valid JSON
2. [ ] Source attribution uses exact "Found in: {subject}, {date}" format
3. [ ] Confidence scoring accurate
4. [ ] `/brain/stats` returns collection metadata
5. [ ] No regression in existing endpoints

---

**Handoff:** Dispatch to Socrates 🧠 when ready
**ETA:** Request Socrates estimate after review
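The confidence banding, attribution format, and context-assembly dedup rules above can be sketched as pure helpers. The chunk-dict shape and helper names are illustrative assumptions, not existing `family_brain.py` code; the thresholds and string format come straight from the spec.

```python
# Minimal sketch of the spec's confidence bands, attribution string, and
# source deduplication. Chunk dicts here are hypothetical; verify the real
# shape returned by query() in family_brain.py before wiring this in.

def confidence_band(score: float) -> str:
    """Map a top similarity score to the spec's bands: >0.7 / 0.3-0.7 / <0.3."""
    if score > 0.7:
        return "high"
    if score >= 0.3:
        return "medium"
    return "low"


def attribution(subject: str, date: str) -> str:
    """Exact 'Found in: {subject}, {date}' format the spec requires."""
    return f"Found in: {subject}, {date}"


def assemble_context(chunks: list[dict], limit: int = 3) -> list[dict]:
    """Dedupe chunks from the same email, sort by score desc, keep top `limit`."""
    best: dict = {}
    for chunk in chunks:
        key = (chunk["subject"], chunk["date"])  # same email -> same key
        if key not in best or chunk["score"] > best[key]["score"]:
            best[key] = chunk
    ranked = sorted(best.values(), key=lambda c: c["score"], reverse=True)
    return ranked[:limit]
```

Keeping these as pure functions lets the "Confidence scoring accurate" success criterion be unit-tested without ChromaDB or Ollama running.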