
Brain Query Interface — Temporal Weighting Spec

Status: Draft — Pending Director Review
Owner: Socrates (Backend Architecture)
Date: 2026-04-30
Phase: 6.3 — Pre-architecture pattern definition


Problem Statement

Documents have temporal relevance. An email from yesterday about "half-day Friday" overrides a 2024 newsletter mentioning the same topic. Without temporal weighting, Brain Query returns stale answers with equal confidence to fresh ones.

Example failure mode:

Q: "Is Friday a half-day for the kids?"
A: "Yes, all Fridays are half-days per the 2024-2025 school calendar" Wrong. The 2025-2026 calendar changed this. The correct answer was in 
   yesterday's principal email.

Query Pattern Taxonomy

Pattern 1: Logistical (Temporal-Critical)

| Attribute | Value |
| --- | --- |
| Examples | "Is Friday a half-day?", "Is practice cancelled today?", "What's the early dismissal time?" |
| Temporal sensitivity | HIGH — answer changes with new information |
| Freshness decay | Exponential — 7-day half-life for school logistics |
| Source priority | Calendar > Email (last 7 days) > Newsletter (last 30 days) > Historical docs |
| Confidence threshold | Only answer if source ≤ 14 days old for logistics queries |

Retrieval strategy:
1. Time-boost recent documents (see Temporal Scoring below)
2. Recency cutoff: ignore documents > 30 days old unless no newer source exists
3. Source-type weighting: calendar_event (1.0) > email (0.9) > newsletter (0.7) > static_pdf (0.5)
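
As a quick sanity check against the failure mode above (using the 7-day half-life, source weights, and 30-day cutoff defined in the Temporal Scoring Algorithm below):

Yesterday's principal email (age 1 day):    2^(-1/7) × 0.9 ≈ 0.91 × 0.9 ≈ 0.82
2024-2025 calendar PDF (age ~300 days):     2^(-300/7) × 0.5 × 0.1 (past cutoff) ≈ 0.0

The fresh email wins by orders of magnitude, which is exactly the behavior the failure example calls for.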


Pattern 2: Entity Fetching (Precision-Critical)

| Attribute | Value |
| --- | --- |
| Examples | "What size is our HVAC filter?", "When was Maggie's last vet visit?", "What's our WiFi password?" |
| Temporal sensitivity | MEDIUM — entity changes slowly (filter size) but visits are date-specific |
| Freshness decay | Linear — 90-day half-life for invoices, 365-day for manuals |
| Source priority | Invoice/receipt (1.0) > Manual (0.8) > Email (0.6) > Historical mention |
| Confidence threshold | Return with source document link; allow user to verify |

Retrieval strategy:
1. Exact-match boosting: prefer documents containing the entity name ("HVAC", "filter", "16x25x4")
2. Structural extraction: invoices/receipts have higher precision than narrative text
3. Temporal tiebreaker: if multiple sources mention same entity, prefer most recent
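
A minimal sketch of steps 1 and 3, assuming candidate documents expose text, base_score, and source_date attributes (the function name and the 1.5x boost factor are placeholders, not tuned values):

def rank_entity_candidates(candidates, query_terms, boost=1.5):
    """Boost candidates that literally contain the query entity terms,
    then break ties by source_date (most recent first)."""
    def sort_key(doc):
        text = doc.text.lower()
        exact_hit = any(term.lower() in text for term in query_terms)
        return (doc.base_score * (boost if exact_hit else 1.0), doc.source_date)
    return sorted(candidates, key=sort_key, reverse=True)

# e.g. rank_entity_candidates(results, ["HVAC", "filter", "16x25x4"])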


Pattern 3: Historical (Recall-Critical)

| Attribute | Value |
| --- | --- |
| Examples | "What was the name of the roofer we used?", "When did we replace the water heater?", "Who was Harper's teacher in 2023?" |
| Temporal sensitivity | LOW — past events don't change |
| Freshness decay | None — all historical documents equally relevant |
| Source priority | Receipt/invoice (1.0) > Email chain (0.8) > Photo metadata (0.7) > Narrative mention |
| Confidence threshold | Lower threshold acceptable; user expects to verify |

Retrieval strategy:
1. No recency penalty — all documents weighted equally
2. Entity-linking: connect "roofer" to company names in receipts
3. Date extraction: pull completion dates from invoices, not just mention dates
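
For step 3, a rough sketch of pulling explicit dates out of invoice or receipt text; the two regexes only cover ISO and US-style dates and would need extending for whatever formats actually appear in the corpus:

import re
from datetime import datetime

DATE_PATTERNS = [
    (r"\b(\d{4}-\d{2}-\d{2})\b", "%Y-%m-%d"),      # 2023-08-14
    (r"\b(\d{1,2}/\d{1,2}/\d{4})\b", "%m/%d/%Y"),  # 8/14/2023
]

def extract_dates(text):
    """Return every parseable date mentioned in a document's text."""
    found = []
    for pattern, fmt in DATE_PATTERNS:
        for match in re.findall(pattern, text):
            try:
                found.append(datetime.strptime(match, fmt).date())
            except ValueError:
                pass  # right shape, not a real date (e.g. 13/45/2023)
    return found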


Temporal Scoring Algorithm

Base Score

from datetime import datetime, timezone


def temporal_score(document, query_pattern, now=None):
    """
    Calculate temporal relevance score for a document.
    Returns: float in [0.0, 1.0]
    """
    if now is None:
        now = datetime.now(timezone.utc)

    # Document metadata
    doc_date = document.ingestion_date   # When Icarus processed it
    source_date = document.source_date   # When the original was created (email sent, receipt dated)
    doc_type = document.type             # email | invoice | receipt | manual | newsletter | calendar_event

    # Query pattern parameters
    pattern = query_pattern.type         # logistical | entity | historical
    decay_model = query_pattern.decay    # exponential | linear | none
    half_life_days = query_pattern.half_life

    # Use the more recent of ingestion or source date
    effective_date = max(doc_date, source_date)
    age_days = (now - effective_date).days

    # Decay function
    if decay_model == "exponential":
        recency_multiplier = 2 ** (-age_days / half_life_days)
    elif decay_model == "linear":
        recency_multiplier = max(0, 1 - (age_days / (half_life_days * 2)))
    else:  # none
        recency_multiplier = 1.0

    # Source type weight
    source_weights = {
        "calendar_event": 1.0,
        "email": 0.9,
        "invoice": 0.95,
        "receipt": 0.95,
        "newsletter": 0.7,
        "manual": 0.8,
        "static_pdf": 0.5,
        "photo": 0.6,
    }
    type_weight = source_weights.get(doc_type, 0.5)

    # Combined score
    score = recency_multiplier * type_weight

    # Logistical queries: hard recency cutoff
    if pattern == "logistical" and age_days > 30:
        score *= 0.1  # Drastically reduce, don't zero (emergency fallback)

    # Clamp to the documented [0, 1] range (future-dated docs would otherwise exceed 1.0)
    return min(score, 1.0)
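
A quick usage sketch with stand-in objects; SimpleNamespace is only for illustration, and the real Document and QueryPattern types would come from the ingestion pipeline and the configuration section below:

from datetime import datetime, timedelta, timezone
from types import SimpleNamespace

now = datetime.now(timezone.utc)

logistical = SimpleNamespace(type="logistical", decay="exponential", half_life=7)

fresh_email = SimpleNamespace(
    type="email",
    source_date=now - timedelta(days=1),
    ingestion_date=now - timedelta(days=1),
)
stale_calendar = SimpleNamespace(
    type="static_pdf",
    source_date=now - timedelta(days=300),
    ingestion_date=now - timedelta(days=250),
)

print(temporal_score(fresh_email, logistical))     # ≈ 0.82
print(temporal_score(stale_calendar, logistical))  # ≈ 0.0 (hit the 30-day cutoff)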

Query Pattern Configuration

query_patterns:
  logistical:
    decay: exponential
    half_life_days: 7
    recency_cutoff: 30
    source_priority: [calendar_event, email, newsletter, static_pdf]

  entity:
    decay: linear
    half_life_days: 90
    recency_cutoff: null  # No hard cutoff
    source_priority: [invoice, receipt, manual, email, static_pdf]

  historical:
    decay: none
    half_life_days: null
    recency_cutoff: null
    source_priority: [invoice, receipt, email, photo, static_pdf]
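
A minimal loader sketch turning this config into the pattern objects temporal_score expects; the QueryPattern dataclass, field names, and file path are assumptions, and PyYAML is assumed available:

from dataclasses import dataclass
from typing import List, Optional

import yaml  # PyYAML


@dataclass
class QueryPattern:
    type: str
    decay: str
    half_life: Optional[float]
    recency_cutoff: Optional[int]
    source_priority: List[str]


def load_query_patterns(path="query_patterns.yaml"):
    """Parse the query_patterns block into QueryPattern objects keyed by pattern name."""
    with open(path) as f:
        raw = yaml.safe_load(f)["query_patterns"]
    return {
        name: QueryPattern(
            type=name,
            decay=cfg["decay"],
            half_life=cfg["half_life_days"],
            recency_cutoff=cfg["recency_cutoff"],
            source_priority=cfg["source_priority"],
        )
        for name, cfg in raw.items()
    }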

Implementation Notes for Socrates

1. Document Metadata Requirements

Every document ingested into Icarus must have:
- source_date: When the original was created (not when Icarus processed it)
- doc_type: Classification at ingestion time
- ingestion_date: Auto-populated by pipeline

Migration: Backfill existing documents using:
- Email: Date header
- PDF/Photo: EXIF creation date or file modification date
- Invoice: Extracted service date
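
A rough backfill sketch for the first two fallbacks (the function name and fallback order are assumptions; invoice service dates would come from whatever extraction the ingestion pipeline already does):

import os
from datetime import datetime, timezone
from email import message_from_file
from email.utils import parsedate_to_datetime


def backfill_source_date(path, doc_type):
    """Best-effort source_date for an already-ingested file, falling back to file mtime."""
    if doc_type == "email":
        with open(path) as f:
            msg = message_from_file(f)
        if msg["Date"]:
            return parsedate_to_datetime(msg["Date"])
    # PDFs/photos: EXIF or embedded creation date would slot in here; mtime is the last resort
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)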

2. Vector Store Integration

ChromaDB supports metadata filtering. Store temporal scores as metadata, not just in the embedding:

collection.add(
    documents=[chunk.text],
    embeddings=[chunk.embedding],
    metadatas=[{
        "doc_id": doc.id,
        "source_date": doc.source_date.isoformat(),
        "ingestion_date": doc.ingestion_date.isoformat(),
        "doc_type": doc.type,
        "temporal_score": calculate_score(doc, pattern)
    }],
    ids=[chunk.id]
)

3. Query-Time Boosting

Two approaches — implement #1 first, evaluate #2 later:

Approach 1: Metadata Filtering (MVP)

# Pre-filter by recency, then semantic search
results = collection.query(
    query_embeddings=[query_embedding],
    where={
        "$or": [
            {"source_date": {"$gte": fourteen_days_ago}},
            {"doc_type": {"$in": ["calendar_event", "manual"]}}  # Timeless docs
        ]
    },
    n_results=10
)
# Re-rank by temporal_score metadata
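
The re-rank comment above might expand to something like the sketch below. It assumes ChromaDB's usual result layout (parallel lists under ids/documents/metadatas); it is also worth verifying that $gte behaves as a range comparison on ISO-string dates, since storing source_date as epoch seconds would sidestep the question.

def rerank_by_temporal_score(results, n_keep=5):
    """Re-order ChromaDB hits by the temporal_score stored at ingestion time."""
    hits = list(zip(results["ids"][0], results["documents"][0], results["metadatas"][0]))
    hits.sort(key=lambda hit: hit[2].get("temporal_score", 0.0), reverse=True)
    return hits[:n_keep]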

Approach 2: Hybrid Search (Phase 7)
- Semantic score + temporal score at query time
- Requires custom re-ranking or advanced vector store
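
One possible blend for Phase 7, where the 0.7/0.3 split is a placeholder to tune during evaluation:

def hybrid_score(semantic_sim, temporal, alpha=0.7):
    """Blend semantic and temporal relevance; alpha weights the semantic side."""
    # semantic_sim could be e.g. 1 - cosine distance from the vector store
    return alpha * semantic_sim + (1 - alpha) * temporal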

4. Confidence Scoring

def answer_confidence(results, pattern):
    """
    Determine if we have enough confidence to answer.
    """
    if not results:
        return 0.0, "No relevant documents found"

    top_score = results[0].temporal_score

    if pattern == "logistical":
        if top_score < 0.3:
            return 0.2, "Stale information — verify with school/organizer"
        if top_score < 0.6:
            return 0.5, "Possibly outdated — check for newer updates"
        return 0.9, "Recent information found"

    elif pattern == "entity":
        if top_score < 0.4:
            return 0.4, "Entity found in old document — verify if still current"
        return 0.85, "Entity found with source document"

    else:  # historical
        return 0.8, "Historical record found"

Failure Modes & Mitigations

| Failure | Example | Mitigation |
| --- | --- | --- |
| Stale answer | "Friday is half-day" (from 2024 calendar) | Recency cutoff + source weighting |
| Multiple conflicting answers | Newsletter says 3:00 PM, email says 2:45 PM | Temporal score picks email; show both with dates |
| Missing context | "Maggie's visit" could be vet or grandparent | Entity linking + disambiguation prompt |
| Wrong entity | "HVAC filter" vs "furnace filter" vs "air filter" | Synonym expansion in query preprocessing |
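
A very small sketch of the synonym-expansion mitigation; in practice the map would grow from observed failures rather than being hand-maintained like this:

SYNONYMS = {
    "hvac filter": ["furnace filter", "air filter"],
    "half-day": ["early dismissal", "early release"],
}

def expand_query(query):
    """Append known synonyms so exact-match boosting catches alternate phrasings."""
    expansions = [query]
    for term, alternates in SYNONYMS.items():
        if term in query.lower():
            expansions.extend(alternates)
    return expansions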

Success Criteria

  1. Logistical query: "Is Friday a half-day?" → Returns answer from most recent email/newsletter, not stale calendar PDF
  2. Entity query: "What size is our HVAC filter?" → Returns exact dimensions from latest invoice with source link
  3. Historical query: "What was the roofer's name?" → Returns company name from 2023 receipt, no recency penalty
  4. Confidence threshold: System says "I don't know" or "verify this" rather than guessing with stale data

Next Steps

  1. Matt review — Approve or modify temporal weighting parameters
  2. Socrates architecture — Build ChromaDB integration with metadata filtering
  3. Test corpus — Seed with 30-50 documents across all types and dates
  4. Evaluation — Run 20 queries per pattern, measure precision@1 and recall@3

Matt's call: Are the decay models, half-lives, and source weights right? Should we add query pattern detection (auto-classify logistical vs entity vs historical) or start with explicit pattern tags?