
🎯 Icarus Phase 6 — 'Intelligence Amplification' Roadmap

Director: Matt
Sprint Goal: Transform document routing into proactive family intelligence
Timeline: Post-UAT (following Phase 5 completion)
Status: APPROVED — Pending Phase 5 UAT success


Phase 6 Pillars

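| Pillar | Description | Sovereign Value |
|---|---|---|
| 1. Document-Driven Maintenance | Auto-extract service dates from invoices/receipts | High: no cloud service offers this |
| 2. Finance Intelligence & Subscription Tracking | Subscription inventory and card-on-file registry built from receipts, statements, and email alerts | High: no bank credential sharing required |
| 3. Brain Query Interface | Natural language search across family documents | High: private semantic search |
| 4. Recipe Intelligence | Full meal planning with grocery integration | Medium: convenience, not unique |
| 5. Offline Fallback | Brain mode when cloud unavailable | Critical: sovereign resilience |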

Pillar 1: Document-Driven Maintenance (Approved for Phase 6)

Vision

Documents don't just route — they inform. Vehicle service receipts, vet invoices, HVAC service reports automatically populate the family's maintenance tracker.

Input Documents

| Type | Source | Extractable Data |
|---|---|---|
| Vehicle Service Receipt | Email PDF/Photo | Next oil change date, mileage, service type |
| Vet Invoice | Email PDF/Photo | Next vaccination due, medication refills |
| HVAC Service Report | Email PDF | Filter size, next service due, warranty info |
| Home Inspection | Scanned PDF | Smoke detector battery dates, gutter cleaning schedule |
| Appliance Manual | PDF Upload | Model numbers, warranty expiration, filter specs |

Pipeline Architecture

┌────────────────────────────────────────────────────────────┐
│  Document Ingestion                                        │
│  (Receipt, Invoice, Report, Manual)                        │
└────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│  qwen3-vl Vision + OCR Pipeline                            │
│  • Extract: Dates, amounts, service types                  │
│  • Identify: Document category (vehicle/vet/home)          │
│  • Infer: Next due date from service interval              │
└────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│  Structured Data Extraction (JSON)                         │
│  {                                                         │
│    "document_type": "vehicle_service",                     │
│    "service_date": "2026-04-15",                           │
│    "next_service_due": "2026-07-15",                       │
│    "service_type": "oil_change",                           │
│    "mileage": 45200,                                       │
│    "interval_miles": 5000,                                 │
│    "interval_months": 3                                    │
│  }                                                         │
└────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│  Maintenance Tracker Update                                │
│  • Create or update maintenance item                       │
│  • Set calculated due date                                 │
│  • Link to source document (receipt image)                 │
│  • Queue confirmation to family group                      │
└────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌────────────────────────────────────────────────────────────┐
│  Proactive Notification                                    │
│  "Oil change due in 7 days (last service: Apr 15).         │
│   Source: [View Receipt]"                                  │
└────────────────────────────────────────────────────────────┘
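
The "Infer" step in the diagram comes down to calendar arithmetic on the extracted fields. A minimal sketch, assuming the interval is given in whole months as in the JSON example; `add_months` and `infer_next_due` are hypothetical helper names, not existing Icarus functions:

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` after `start`, clamped to the month's end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    # Clamp the day so Jan 31 plus one month lands on the last day of February.
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def infer_next_due(service_date: str, interval_months: int) -> str:
    """Compute an ISO next-due date from an ISO service date (hypothetical helper)."""
    return add_months(date.fromisoformat(service_date), interval_months).isoformat()


print(infer_next_due("2026-04-15", 3))  # 2026-07-15
```

This reproduces the `next_service_due` value in the JSON example. A mileage-based interval would need an odometer estimate, which this sketch does not attempt.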

Technical Implementation

New Module: icarus/core/extractors/maintenance.py

"""Extract maintenance-relevant data from documents.

Uses vision + structured extraction to identify:
- Vehicle service records
- Veterinary invoices
- Home service reports
- Appliance warranties
"""

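async def extract_maintenance_data(
    document: Document,
    document_type: str = "auto-detected",  # or a caller-supplied hint
) -> MaintenanceEvent | None:
    """Extract structured maintenance data from a document.

    Returns None when extraction confidence is below 0.8; in that case
    the document is queued for human review instead.
    """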

    # Vision pipeline extracts raw text + visual layout
    vision_result = await vision_pipeline.extract(document)

    # LLM extracts structured data with confidence scoring
    extracted = await extract_with_prompt(
        text=vision_result.text,
        prompt=MAINTENANCE_EXTRACTION_PROMPT,
        schema=MaintenanceEventSchema
    )

    if extracted.confidence < 0.8:
        # Queue for human verification
        await queue_for_review(document, extracted)
        return None

    return MaintenanceEvent(
        item_name=extracted.service_type,
        category=extracted.category,  # vehicle, pet, home
        who=extracted.subject,  # "Matt's Car", "Maggie"
        last_done=extracted.service_date,
        next_due=extracted.next_service_due,
        source_document=document.id,
        confidence=extracted.confidence
    )
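
The function above depends on `MaintenanceEventSchema`, which this roadmap does not define. One possible shape, sketched as a standard-library dataclass whose field names mirror the attributes read in `extract_maintenance_data`; the validation rules and the `from_payload` constructor are assumptions for illustration:

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class MaintenanceEventSchema:
    """Hypothetical shape of the structured extraction result."""
    service_type: str
    category: str            # "vehicle", "pet", or "home"
    subject: str             # e.g. "Matt's Car"
    service_date: date
    next_service_due: date
    confidence: float        # 0.0 to 1.0, as reported by the extractor

    @classmethod
    def from_payload(cls, payload: dict) -> "MaintenanceEventSchema":
        """Build a validated instance from a decoded JSON object."""
        confidence = float(payload["confidence"])
        if not 0.0 <= confidence <= 1.0:
            raise ValueError(f"confidence out of range: {confidence}")
        done = date.fromisoformat(payload["service_date"])
        due = date.fromisoformat(payload["next_service_due"])
        if due < done:
            raise ValueError("next_service_due precedes service_date")
        return cls(
            service_type=payload["service_type"],
            category=payload["category"],
            subject=payload["subject"],
            service_date=done,
            next_service_due=due,
            confidence=confidence,
        )
```

Rejecting a due date earlier than the service date catches one common OCR failure (swapped or misread dates) before it reaches the tracker.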

Integration Points:
- Hook into the existing document routing pipeline
- When the classifier infers document_type as "receipt" or "invoice", run the maintenance extractor
- If maintenance data is found, update the tracker before routing the document to the family member
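
The hook described above might look roughly like the following. This is a sketch under stated assumptions: documents are plain dicts, the tracker is a list, and `route_document`, `fake_extractor`, and `MAINTENANCE_TYPES` are hypothetical names standing in for the real routing pipeline.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical hook signature: takes a document dict, returns extracted data or None.
Extractor = Callable[[dict], Awaitable[Optional[dict]]]

# Document types that trigger the maintenance extractor (assumed labels).
MAINTENANCE_TYPES = {"receipt", "invoice"}


async def route_document(doc: dict, extractor: Extractor, tracker: list) -> str:
    """Run the maintenance hook, then continue routing to the family member."""
    if doc["document_type"] in MAINTENANCE_TYPES:
        event = await extractor(doc)
        if event is not None:
            tracker.append(event)  # tracker is updated before routing
    return doc["recipient"]


async def fake_extractor(doc: dict) -> Optional[dict]:
    """Stand-in for the real extractor; always returns a fixed event."""
    return {"item_name": "oil_change", "source_document": doc["id"]}


tracker: list = []
receipt = {"id": "doc-1", "document_type": "receipt", "recipient": "matt"}
print(asyncio.run(route_document(receipt, fake_extractor, tracker)))  # matt
```

Keeping the extractor as an injected callable means the routing path can be tested without the vision pipeline.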

User Flow

  1. Receipt arrives (email with PDF attachment)
  2. Icarus processes — vision OCR extracts text
  3. Classifier detects — "vehicle service receipt"
  4. Maintenance extractor runs — extracts dates, service type
  5. Confirmation sent — "Detected oil change on Apr 15. Next due: Jul 15. Correct?"
  6. User confirms (inline button) — added to maintenance tracker
  7. Future notification — "Oil change due in 7 days" with link to receipt
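
The messages in steps 5 and 7 can be produced from the tracker's dates alone. A minimal sketch, assuming a 7-day reminder window; the function names are hypothetical:

```python
from datetime import date
from typing import Optional


def short_date(d: date) -> str:
    """Format a date like 'Apr 15' without a zero-padded day."""
    return f"{d:%b} {d.day}"


def confirmation_message(service: str, done: date, due: date) -> str:
    """Build the confirmation prompt sent right after extraction."""
    return (f"Detected {service} on {short_date(done)}. "
            f"Next due: {short_date(due)}. Correct?")


def reminder_message(service: str, done: date, due: date,
                     today: date) -> Optional[str]:
    """Return a reminder when the due date is 0 to 7 days away, else None."""
    days_left = (due - today).days
    if not 0 <= days_left <= 7:
        return None
    return (f"{service.capitalize()} due in {days_left} days "
            f"(last service: {short_date(done)}).")


done, due = date(2026, 4, 15), date(2026, 7, 15)
print(confirmation_message("oil change", done, due))
print(reminder_message("oil change", done, due, today=date(2026, 7, 8)))
```

The month abbreviation comes from `strftime`, so it follows the process locale; the default locale yields English names.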

Comparison: Old vs New

| Aspect | Old (YAML) | New (Document-Driven) |
|---|---|---|
| Data entry | Manual YAML edits | Automatic from receipts |
| Accuracy | Human memory dependent | Extracted from source documents |
| Sovereign value | Low (replaceable by calendar) | High (no cloud service does this) |
| Maintenance | Manual updates | Self-updating from document stream |
| Audit trail | None | Receipt linked to reminder |

Success Metrics

  • 80% of receipts auto-extracted without human correction
  • Zero manual YAML edits required
  • Family confirms utility via usage (not survey)
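
The first metric can be computed from the confirmation step's outcomes. A small sketch, assuming each processed receipt is logged with a flag saying whether the family had to correct it; the log format is hypothetical:

```python
def auto_extraction_rate(corrected_flags: list[bool]) -> float:
    """Fraction of receipts accepted without human correction."""
    if not corrected_flags:
        return 0.0
    accepted = sum(1 for corrected in corrected_flags if not corrected)
    return accepted / len(corrected_flags)


# Hypothetical log: True means the family corrected the extraction.
log = [False, False, True, False, False]
rate = auto_extraction_rate(log)
print(f"{rate:.0%} auto-extracted, target met: {rate >= 0.80}")
```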

Dependencies

  • ✅ Phase 5 UAT complete (document routing proven)
  • ✅ Vision pipeline stable (qwen3-vl)
  • ✅ Confidence scoring working
  • ⏳ Optional: Brain integration for historical lookup

Pillar 2: Finance Intelligence & Subscription Tracking

Trigger: Real-world need (credit card compromise incident, 2026-04-27)

Problem Statement

Credit card changes create cascading subscription failures. No visibility into:
- Where cards are saved
- Which subscriptions are active
- Upcoming expiration/charge failures

Sovereign Value: HIGH

No cloud service combines document extraction with subscription intelligence without requiring bank credential sharing.

Architecture

Document Sources:
├── Receipts/Invoices → Extract: merchant, amount, frequency
├── Bank Statements → Extract: recurring charges, new merchants
├── Email Alerts → "Card expires soon", "Payment failed"
└── Manual Entry → Subscription inventory

Ledger Output:
├── Subscription inventory (active, cost, frequency)
├── Card-on-file registry (where each card lives)
├── Payment failure alerts (immediate Telegram)
└── Annual spending by category (year-end report)
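
Recurring charges can be spotted in statement data by looking at the spacing between charges from the same merchant. The sketch below is one simple heuristic, not a committed design: it assumes charges arrive as `(merchant, date, amount)` tuples and treats a median gap of 27 to 32 days across at least three charges as monthly. The merchant names are made up.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def find_subscriptions(charges: list[tuple[str, date, float]]) -> dict[str, dict]:
    """Return merchants whose charges recur roughly once a month."""
    by_merchant = defaultdict(list)
    for merchant, when, amount in charges:
        by_merchant[merchant].append((when, amount))

    inventory = {}
    for merchant, items in by_merchant.items():
        if len(items) < 3:
            continue  # too few data points to call it recurring
        items.sort()
        gaps = [(later[0] - earlier[0]).days
                for earlier, later in zip(items, items[1:])]
        if 27 <= median(gaps) <= 32:
            inventory[merchant] = {"frequency": "monthly", "cost": items[-1][1]}
    return inventory


charges = [
    ("StreamCo", date(2026, 1, 5), 15.99),
    ("StreamCo", date(2026, 2, 5), 15.99),
    ("StreamCo", date(2026, 3, 5), 15.99),
    ("Hardware Store", date(2026, 2, 11), 42.10),
]
print(find_subscriptions(charges))
```

Annual or weekly subscriptions would need additional gap ranges; they are left out to keep the example short.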

Integration Points

  • Hook into existing document extraction pipeline
  • Flag payment-related emails as HIGH PRIORITY
  • Telegram notifications for failures/expirations
  • Optional: Export for tax/accounting
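
Flagging payment-related emails could start as a plain pattern match on the subject line. A minimal sketch: the phrases are taken from the alert examples in the architecture above plus a few assumed variants, and the `HIGH`/`NORMAL` labels are placeholders for whatever priority scheme the router uses.

```python
import re

# Phrases that suggest a payment problem (assumed list, extend as needed).
PAYMENT_ALERT = re.compile(
    r"payment (failed|declined)|card (expires|expiring|expired)"
    r"|update your payment",
    re.IGNORECASE,
)


def email_priority(subject: str) -> str:
    """Return 'HIGH' for payment-related subjects, otherwise 'NORMAL'."""
    return "HIGH" if PAYMENT_ALERT.search(subject) else "NORMAL"


print(email_priority("Action needed: Payment failed for your plan"))  # HIGH
print(email_priority("Your weekly newsletter"))                       # NORMAL
```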

Deferred from: Phase 5 UAT (2026-04-27)

Rationale: Critical family logistics (Icarus UAT) takes precedence over finance automation.


Pillar 3: Brain Query Interface

Natural language search across all processed family documents. "What was Harper's teacher's name last year?" → Answer from document corpus.
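
To make the query flow concrete, here is a deliberately simple stand-in that ranks documents by word overlap with the question. It is not semantic search: a real implementation would presumably use embeddings. The corpus contents and the `search` function are invented for illustration.

```python
import re
from collections import Counter


def tokens(text: str) -> Counter:
    """Lowercase the text and count its alphanumeric words."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def search(query: str, corpus: dict[str, str], top_k: int = 1) -> list[str]:
    """Rank document ids by how many query words they share."""
    query_words = tokens(query)
    scored = []
    for doc_id, text in corpus.items():
        overlap = sum((query_words & tokens(text)).values())
        if overlap:
            scored.append((overlap, doc_id))
    # Highest overlap first; ties broken alphabetically by id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored[:top_k]]


corpus = {
    "school-letter": "Welcome letter from Harper's teacher for the school year",
    "vet-invoice": "Maggie vaccination due in June",
}
print(search("What was Harper's teacher's name last year?", corpus))
```

The search step only locates the relevant document; producing the actual answer would be left to the LLM.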

Pillar 4: Recipe Intelligence

Meal planning with grocery list auto-generation. Recipe → ingredients → zones → Costco route optimization.
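
The recipe-to-route chain can be sketched as a grouping problem: collect ingredients, map each to a store zone, and emit zones in walking order. The zone map and zone order below are assumptions; a real version would need a per-store layout.

```python
from collections import defaultdict

# Assumed store layout: zones in the order they are walked.
ZONE_ORDER = ["produce", "dairy", "pantry"]
ZONE_OF = {"spinach": "produce", "milk": "dairy", "eggs": "dairy", "rice": "pantry"}


def grocery_route(recipes: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Merge recipe ingredients into one list grouped by store zone."""
    needed = defaultdict(set)
    for ingredients in recipes.values():
        for item in ingredients:
            needed[ZONE_OF.get(item, "other")].add(item)  # sets drop duplicates
    order = ZONE_ORDER + ["other"]
    return [(zone, sorted(needed[zone])) for zone in order if zone in needed]


recipes = {
    "fried rice": ["rice", "eggs", "spinach"],
    "omelette": ["eggs", "milk", "saffron"],
}
for zone, items in grocery_route(recipes):
    print(zone, items)
```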

Pillar 5: Offline Fallback (Brain Mode)

When cloud unavailable, degrade gracefully to local-only operation using embedded document summaries.
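
Graceful degradation can be expressed as a try-cloud-then-local wrapper. A minimal sketch, assuming the cloud call raises `ConnectionError` or `TimeoutError` when unavailable and that local summaries are stored as a dict keyed by document id; all names are hypothetical.

```python
from typing import Callable


def answer_query(query: str,
                 cloud_answer: Callable[[str], str],
                 local_summaries: dict[str, str]) -> tuple[str, str]:
    """Return (answer, source), falling back to local summaries offline."""
    try:
        return cloud_answer(query), "cloud"
    except (ConnectionError, TimeoutError):
        # Offline: return the first local summary sharing a word with the query.
        words = set(query.lower().split())
        for doc_id, summary in local_summaries.items():
            if words & set(summary.lower().split()):
                return summary, f"local:{doc_id}"
        return "No local summary matches this query.", "local"


def unreachable_cloud(query: str) -> str:
    """Simulate a cloud outage for the example."""
    raise ConnectionError("cloud unavailable")


summaries = {"hvac-report": "hvac filter replaced next service due in october"}
print(answer_query("when is the hvac service due", unreachable_cloud, summaries))
```

Returning the source alongside the answer lets the notification layer tell the family when a reply came from local-only mode.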


Decision Record

Date: 2026-04-27
Decision: Sunset YAML-based maintenance tracking (Phase 1-5), resurrect as Document-Driven Maintenance in Phase 6
Rationale: Standalone maintenance tracking is low sovereign value; document-driven extraction is high sovereign value and aligns with Icarus core mission
Veto Authority: Matt (Director)


This roadmap evolves. Update as Phase 5 UAT informs Phase 6 priorities.