Brain Query Interface — Temporal Weighting Spec
Status: Draft — Pending Director Review
Owner: Socrates (Backend Architecture)
Date: 2026-04-30
Phase: 6.3 — Pre-architecture pattern definition
Problem Statement
Documents have temporal relevance. An email from yesterday about "half-day Friday" should supersede a 2024 newsletter mentioning the same topic. Without temporal weighting, Brain Query returns stale answers with the same confidence as fresh ones.
Example failure mode:
Q: "Is Friday a half-day for the kids?"
A: "Yes, all Fridays are half-days per the 2024-2025 school calendar"
→ Wrong. The 2025-2026 calendar changed this. The correct answer was in
yesterday's principal email.
Query Pattern Taxonomy
Pattern 1: Logistical (Temporal-Critical)
| Attribute | Value |
|---|---|
| Examples | "Is Friday a half-day?", "Is practice cancelled today?", "What's the early dismissal time?" |
| Temporal sensitivity | HIGH — answer changes with new information |
| Freshness decay | Exponential — 7-day half-life for school logistics |
| Source priority | Calendar > Email (last 7 days) > Newsletter (last 30 days) > Historical docs |
| Confidence threshold | Only answer if source ≤ 14 days old for logistics queries |
Retrieval strategy:
1. Time-boost recent documents (see Temporal Scoring below)
2. Recency cutoff: ignore documents > 30 days old unless no newer source exists
3. Source-type weighting: calendar_event (1.0) > email (0.9) > newsletter (0.7) > static_pdf (0.5)
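The combined effect of the exponential decay, source-type weights, and 30-day cutoff above can be sanity-checked with a small standalone sketch (the `logistical_score` helper is illustrative, not part of the pipeline):

```python
# Worked example of the logistical weighting above, using the exponential
# decay and source weights defined under Temporal Scoring below.

def logistical_score(age_days: float, type_weight: float, half_life: float = 7.0) -> float:
    """Exponential recency decay times source-type weight, with the 30-day cutoff."""
    score = 2 ** (-age_days / half_life) * type_weight
    if age_days > 30:  # hard recency cutoff for logistical queries
        score *= 0.1
    return score

fresh_email = logistical_score(age_days=1, type_weight=0.9)    # yesterday's principal email
stale_news = logistical_score(age_days=120, type_weight=0.7)   # last year's newsletter

print(round(fresh_email, 3))  # ≈ 0.815
print(stale_news)             # effectively zero
```

The gap (~0.815 vs ~5e-7) is what guarantees yesterday's email outranks the 2024 newsletter regardless of semantic similarity.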
Pattern 2: Entity Fetching (Precision-Critical)
| Attribute | Value |
|---|---|
| Examples | "What size is our HVAC filter?", "When was Maggie's last vet visit?", "What's our WiFi password?" |
| Temporal sensitivity | MEDIUM — entity changes slowly (filter size) but visits are date-specific |
| Freshness decay | Linear — 90-day half-life for invoices, 365-day for manuals |
| Source priority | Invoice/receipt (1.0) > Manual (0.8) > Email (0.6) > Historical mention |
| Confidence threshold | Return with source document link; allow user to verify |
Retrieval strategy:
1. Exact-match boosting: prefer documents containing the entity name ("HVAC", "filter", "16x25x4")
2. Structural extraction: invoices/receipts have higher precision than narrative text
3. Temporal tiebreaker: if multiple sources mention same entity, prefer most recent
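A minimal sketch of the exact-match boosting step, assuming a simple substring match and an untuned boost factor (both are placeholders, not decided parameters):

```python
# Illustrative exact-match boosting for entity queries: each query entity
# term found in the document multiplies its score upward.

def entity_boost(doc_text: str, entity_terms: list[str], base_score: float,
                 boost_per_term: float = 0.15) -> float:
    """Boost a document's score for each query entity term it contains."""
    text = doc_text.lower()
    hits = sum(1 for term in entity_terms if term.lower() in text)
    return base_score * (1 + boost_per_term * hits)

invoice = "HVAC service 4/12 -- replaced 16x25x4 filter"
score = entity_boost(invoice, ["HVAC", "filter", "16x25x4"], base_score=0.5)
# three matches -> 0.5 * 1.45 = 0.725
```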
Pattern 3: Historical (Recall-Critical)
| Attribute | Value |
|---|---|
| Examples | "What was the name of the roofer we used?", "When did we replace the water heater?", "Who was Harper's teacher in 2023?" |
| Temporal sensitivity | LOW — past events don't change |
| Freshness decay | None — all historical documents equally relevant |
| Source priority | Receipt/invoice (1.0) > Email chain (0.8) > Photo metadata (0.7) > Narrative mention |
| Confidence threshold | Lower threshold acceptable; user expects to verify |
Retrieval strategy:
1. No recency penalty — all documents weighted equally
2. Entity-linking: connect "roofer" to company names in receipts
3. Date extraction: pull completion dates from invoices, not just mention dates
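The date-extraction step might look like the following sketch; the label patterns ("service date", "completion date") are assumptions about typical invoice wording, not a tested extractor:

```python
import re
from datetime import date

# Pull a completion/service date out of invoice text rather than relying
# on the document's mention date.

DATE_RE = re.compile(
    r"(?:service|completion|work)\s+date[:\s]+(\d{1,2})/(\d{1,2})/(\d{4})",
    re.IGNORECASE,
)

def extract_service_date(text: str):
    """Return a date parsed from a labeled M/D/YYYY field, or None."""
    m = DATE_RE.search(text)
    if not m:
        return None
    month, day, year = map(int, m.groups())
    return date(year, month, day)

invoice = "Acme Roofing -- Service date: 6/14/2023 -- replace shingles"
print(extract_service_date(invoice))  # 2023-06-14
```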
Temporal Scoring Algorithm
Base Score
```python
from datetime import datetime

def temporal_score(document, query_pattern, now=None):
    """
    Calculate temporal relevance score for a document.
    Returns: float in [0.0, 1.0]
    """
    now = now or datetime.now()

    # Document metadata
    doc_date = document.ingestion_date    # when Icarus processed it
    source_date = document.source_date    # when the original was created (email sent, receipt dated)
    doc_type = document.type              # email | invoice | receipt | manual | newsletter | calendar_event

    # Query pattern parameters
    pattern = query_pattern.type          # logistical | entity | historical
    decay_model = query_pattern.decay     # exponential | linear | none
    half_life_days = query_pattern.half_life

    # Use the more recent of ingestion or source date
    effective_date = max(doc_date, source_date)
    age_days = (now - effective_date).days

    # Decay function
    if decay_model == "exponential":
        recency_multiplier = 2 ** (-age_days / half_life_days)
    elif decay_model == "linear":
        recency_multiplier = max(0, 1 - (age_days / (half_life_days * 2)))
    else:  # none
        recency_multiplier = 1.0

    # Source type weight
    source_weights = {
        "calendar_event": 1.0,
        "email": 0.9,
        "invoice": 0.95,
        "receipt": 0.95,
        "newsletter": 0.7,
        "manual": 0.8,
        "static_pdf": 0.5,
        "photo": 0.6,
    }
    type_weight = source_weights.get(doc_type, 0.5)

    # Combined score
    score = recency_multiplier * type_weight

    # Logistical queries: hard recency cutoff
    if pattern == "logistical" and age_days > 30:
        score *= 0.1  # drastically reduce, don't zero (emergency fallback)

    return score
```
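To sanity-check the half-life parameters before review, the three decay models can be compared in isolation (note that under the linear model the multiplier hits 0.5 at the half-life and 0 at twice the half-life):

```python
# Compare the three decay models from temporal_score across document ages.

def recency_multiplier(age_days, decay, half_life=None):
    if decay == "exponential":
        return 2 ** (-age_days / half_life)
    if decay == "linear":
        return max(0.0, 1 - age_days / (half_life * 2))
    return 1.0  # "none" -- historical queries

for age in (0, 7, 30, 90, 365):
    exp_m = recency_multiplier(age, "exponential", 7)  # logistical
    lin_m = recency_multiplier(age, "linear", 90)      # entity
    print(f"age={age:3d}d  logistical={exp_m:.3f}  entity={lin_m:.3f}")
```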
Query Pattern Configuration
```yaml
query_patterns:
  logistical:
    decay: exponential
    half_life_days: 7
    recency_cutoff: 30
    source_priority: [calendar_event, email, newsletter, static_pdf]
  entity:
    decay: linear
    half_life_days: 90
    recency_cutoff: null  # no hard cutoff
    source_priority: [invoice, receipt, manual, email, static_pdf]
  historical:
    decay: none
    half_life_days: null
    recency_cutoff: null
    source_priority: [invoice, receipt, email, photo, static_pdf]
```
Implementation Notes for Socrates
1. Document Metadata Requirements
Every document ingested into Icarus must have:
- source_date: When the original was created (not when Icarus processed it)
- doc_type: Classification at ingestion time
- ingestion_date: Auto-populated by pipeline
Migration: Backfill existing documents using:
- Email: Date header
- PDF/Photo: EXIF creation date or file modification date
- Invoice: Extracted service date
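The email branch of the backfill is straightforward with the stdlib; how Icarus stores raw messages is an assumption here:

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# Parse the RFC 2822 Date header of a stored raw email into source_date.

def backfill_email_source_date(raw_message: str):
    """Return the Date header as a timezone-aware datetime, or None if absent."""
    msg = message_from_string(raw_message)
    header = msg.get("Date")
    return parsedate_to_datetime(header) if header else None

raw = (
    "From: principal@school.example\n"
    "Date: Wed, 29 Apr 2026 08:15:00 -0500\n"
    "Subject: Half-day Friday\n"
    "\n"
    "Reminder..."
)
print(backfill_email_source_date(raw).date())  # 2026-04-29
```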
2. Vector Store Integration
ChromaDB supports metadata filtering. Store temporal fields as metadata on each chunk so they can be filtered and re-ranked at query time:
```python
collection.add(
    documents=[chunk.text],
    embeddings=[chunk.embedding],
    metadatas=[{
        "doc_id": doc.id,
        # Note: ChromaDB's range operators ($gt/$gte) compare int/float
        # metadata, so store dates as numeric epoch timestamps as well
        # if date-range filtering is planned.
        "source_date": doc.source_date.isoformat(),
        "ingestion_date": doc.ingestion_date.isoformat(),
        "doc_type": doc.type,
        "temporal_score": calculate_score(doc, pattern)
    }],
    ids=[chunk.id]
)
```
3. Query-Time Boosting
Two approaches — implement #1 first, evaluate #2 later:
Approach 1: Metadata Filtering (MVP)
```python
# Pre-filter by recency, then semantic search.
# fourteen_days_ago must be a numeric epoch timestamp (and source_date
# stored numerically): ChromaDB's $gte compares int/float metadata.
results = collection.query(
    query_embeddings=[query_embedding],
    where={
        "$or": [
            {"source_date": {"$gte": fourteen_days_ago}},
            {"doc_type": {"$in": ["calendar_event", "manual"]}},  # timeless docs
        ]
    },
    n_results=10,
)
# Re-rank by temporal_score metadata
```
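The re-rank step noted above is a sort over the metadata ChromaDB returns; a minimal sketch, assuming one query per call (ChromaDB returns parallel lists per query):

```python
# Re-rank ChromaDB query results by the stored temporal_score metadata.

def rerank_by_temporal_score(results: dict) -> list[tuple[str, dict]]:
    """Pair each document with its metadata, highest temporal_score first."""
    docs = results["documents"][0]   # one query -> first (only) result row
    metas = results["metadatas"][0]
    return sorted(
        zip(docs, metas),
        key=lambda pair: pair[1].get("temporal_score", 0.0),
        reverse=True,
    )

fake = {
    "documents": [["old newsletter", "fresh email"]],
    "metadatas": [[{"temporal_score": 0.07}, {"temporal_score": 0.82}]],
}
print(rerank_by_temporal_score(fake)[0][0])  # fresh email
```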
Approach 2: Hybrid Search (Phase 7)
- Semantic score + temporal score at query time
- Requires custom re-ranking or advanced vector store
4. Confidence Scoring
```python
def answer_confidence(results, pattern):
    """
    Determine if we have enough confidence to answer.
    """
    if not results:
        return 0.0, "No relevant documents found"

    top_score = results[0].temporal_score

    if pattern == "logistical":
        if top_score < 0.3:
            return 0.2, "Stale information — verify with school/organizer"
        if top_score < 0.6:
            return 0.5, "Possibly outdated — check for newer updates"
        return 0.9, "Recent information found"
    elif pattern == "entity":
        if top_score < 0.4:
            return 0.4, "Entity found in old document — verify if still current"
        return 0.85, "Entity found with source document"
    else:  # historical
        return 0.8, "Historical record found"
```
Failure Modes & Mitigations
| Failure | Example | Mitigation |
|---|---|---|
| Stale answer | "Friday is half-day" (from 2024 calendar) | Recency cutoff + source weighting |
| Multiple conflicting answers | Newsletter says 3:00 PM, email says 2:45 PM | Temporal score picks email; show both with dates |
| Missing context | "Maggie's visit" could be vet or grandparent | Entity linking + disambiguation prompt |
| Wrong entity | "HVAC filter" vs "furnace filter" vs "air filter" | Synonym expansion in query preprocessing |
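The synonym-expansion mitigation could start as a hand-rolled map in query preprocessing; the map below is an illustrative assumption, not an existing Icarus component:

```python
# Expand a query into variants with known synonyms substituted, so
# "HVAC filter" also retrieves documents that say "furnace filter".

SYNONYMS = {
    "hvac filter": ["furnace filter", "air filter"],
    "half-day": ["early dismissal", "early release"],
}

def expand_query(query: str) -> list[str]:
    """Return the lowercased query plus one variant per known synonym."""
    q = query.lower()
    variants = [q]
    for term, alts in SYNONYMS.items():
        if term in q:
            variants += [q.replace(term, alt) for alt in alts]
    return variants

print(expand_query("What size is our HVAC filter?"))
```

Each variant would be embedded and searched, with results merged before re-ranking.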
Success Criteria
- Logistical query: "Is Friday a half-day?" → Returns answer from most recent email/newsletter, not stale calendar PDF
- Entity query: "What size is our HVAC filter?" → Returns exact dimensions from latest invoice with source link
- Historical query: "What was the roofer's name?" → Returns company name from 2023 receipt, no recency penalty
- Confidence threshold: System says "I don't know" or "verify this" rather than guessing with stale data
Next Steps
- Matt review — Approve or modify temporal weighting parameters
- Socrates architecture — Build ChromaDB integration with metadata filtering
- Test corpus — Seed with 30-50 documents across all types and dates
- Evaluation — Run 20 queries per pattern, measure precision@1 and recall@3
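The two evaluation metrics named above are simple to compute, assuming each test query carries a set of relevant doc ids and the system returns a ranked id list:

```python
# precision@1: is the top-ranked result relevant?
# recall@3: what fraction of relevant docs appear in the top 3?

def precision_at_1(ranked: list[str], relevant: set[str]) -> float:
    return 1.0 if ranked and ranked[0] in relevant else 0.0

def recall_at_3(ranked: list[str], relevant: set[str]) -> float:
    if not relevant:
        return 0.0
    return len(set(ranked[:3]) & relevant) / len(relevant)

ranked = ["email-412", "news-9", "cal-3"]    # hypothetical doc ids
relevant = {"email-412", "cal-3"}
print(precision_at_1(ranked, relevant))  # 1.0
print(recall_at_3(ranked, relevant))     # 1.0
```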
Matt's call: Are the decay models, half-lives, and source weights right? Should we add query pattern detection (auto-classify logistical vs entity vs historical) or start with explicit pattern tags?