# Email Pipeline v2: Smart Classification & Routing

**Status:** ✅ COMPLETE
**Assigned by:** Wadsworth 📋
**Implemented by:** Socrates 🧠
**Date:** 2026-04-23

## Problem Solved

Newsletters were silently dropped by the `skip_keywords` filter. They are now classified and summarized instead.

## Architecture

```
Email arrives at assistant@hoffdesk.com
    ↓
[EmailClassifier] (qwen2.5-coder:7b, local)
    ↓
    ├─ APPOINTMENT → AppointmentHandler → Calendar + Telegram
    ├─ NEWSLETTER  → NewsletterHandler  → Summarize + Telegram (no calendar)
    ├─ FAMILY      → FamilyHandler      → Calendar + Telegram
    └─ OTHER       → OtherHandler       → Telegram notification only
```

## Files Created/Modified

### New Files

| File | Purpose |
|------|---------|
| `family/classifier.py` | LLM-based email classification |
| `family/handlers/__init__.py` | Handler exports |
| `family/handlers/appointment.py` | Extract event, create calendar entry, notify |
| `family/handlers/newsletter.py` | Summarize content, notify only |
| `family/handlers/family.py` | Extract event if present, always notify |
| `family/handlers/other.py` | Basic notification |

### Modified Files

| File | Change |
|------|--------|
| `family/pipeline.py` | Added classification + routing logic |
| `family/email.py` | Removed `skip_keywords` filter |

### Removed

- `skip_keywords = ["re:", "fw:", "unsubscribe", "newsletter", "promotion", "sale"]`
- `should_process()` hardcoded filter

## Classification Results

| Test | Subject | Classification | Time |
|------|---------|----------------|------|
| 1 | "Your appointment with Dr. Smith..." | appointment | ~2s |
| 2 | "Weekly Tech Digest: AI News..." | newsletter | ~2s |
| 3 | "Sullivan's soccer practice..." | family | ~2s |
| 4 | "Your Amazon order has shipped" | other | ~2s |
| 5 | "School calendar update..." | family | ~2s |

**Accuracy:** 5/5 (100%) on these test emails, after prompt refinement

## Newsletter Handler

**Input:** Full newsletter email (truncated to 3000 chars)
**Output:** Telegram message with bullet summary

**Example:**

```
📰 Weekly AI Digest: LLM Updates

• OpenAI released GPT-5 with enhanced reasoning skills.
• Anthropic launched Claude 4, featuring a 200K context window.
• Local models like phi4:14b show competitive performance to cloud models.
```

**No calendar event created** ✅

## Key Design Decisions

1. **Classifier uses the sender address** to distinguish family vs. appointment
   - coach@, school@ → family
   - doctor@, clinic@ → appointment

2. **Newsletters are summarized, not added to the calendar**
   - No event extraction attempt
   - Bullet points only
   - Truncated at 3000 chars

3. **Family emails always notify**
   - Tries to extract an event for the calendar
   - Always sends a Telegram notification

4. **Local-first inference**
   - qwen2.5-coder:7b on Gaming PC
   - Temperature 0.1 for consistent classification
   - ~2s classification time

## Test Command

```bash
cd ~/.openclaw/workspace-socrates/hoffdesk-api
python3 test_email_routing.py
```

## Integration

The pipeline is wired into the `family/router.py` endpoints:

- `POST /admin/family/webhook/email`: Process incoming email
- `POST /admin/family/pipeline/test`: Test with sample data

## Acceptance Criteria

| Test | Result |
|------|--------|
| Newsletter arrives | ✅ Classified as `newsletter`, summarized, Telegram sent, no calendar |
| Appointment arrives | ✅ Classified as `appointment`, calendar event created, Telegram sent |
| Family email arrives | ✅ Classified as `family`, calendar + Telegram |
| Random email arrives | ✅ Classified as `other`, Telegram notification only |

## Next Steps

- Deploy and test with real emails
- Monitor classification accuracy
- Add learning loop: manual correction → improve prompts
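
## Routing Sketch (Illustrative)

The classify-then-dispatch flow from the Architecture section can be sketched roughly as below. This is a minimal illustration, not the code in `family/pipeline.py`: the names `Category`, `Email`, `Result`, `route`, `stub_classify`, and `truncate_for_summary` are all hypothetical, the handlers only report which actions they would take, and the LLM call is replaced by a stand-in that looks at the sender's address prefix. The only values taken from this document are the four categories, the 3000-character newsletter limit, and the sender hints (coach@, school@, doctor@, clinic@). Falling back to `other` on an unrecognized label is an assumption, not documented behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Dict, List

MAX_NEWSLETTER_CHARS = 3000  # truncation limit stated in this document


class Category(str, Enum):
    APPOINTMENT = "appointment"
    NEWSLETTER = "newsletter"
    FAMILY = "family"
    OTHER = "other"


@dataclass
class Email:
    sender: str
    subject: str
    body: str


@dataclass
class Result:
    category: Category
    actions: List[str]  # actions the handler would perform, in order


def truncate_for_summary(body: str) -> str:
    """Cut the newsletter body to the documented limit before summarizing."""
    return body[:MAX_NEWSLETTER_CHARS]


# Hypothetical handlers: each returns the actions it would take.
def handle_appointment(email: Email) -> List[str]:
    return ["calendar", "telegram"]


def handle_newsletter(email: Email) -> List[str]:
    _ = truncate_for_summary(email.body)  # summarizer input; never a calendar event
    return ["summarize", "telegram"]


def handle_family(email: Email) -> List[str]:
    # Calendar entry only if an event can be extracted; notification always.
    return ["calendar_if_event", "telegram"]


def handle_other(email: Email) -> List[str]:
    return ["telegram"]


HANDLERS: Dict[Category, Callable[[Email], List[str]]] = {
    Category.APPOINTMENT: handle_appointment,
    Category.NEWSLETTER: handle_newsletter,
    Category.FAMILY: handle_family,
    Category.OTHER: handle_other,
}


def stub_classify(email: Email) -> str:
    """Stand-in for the LLM classifier (assumption: uses the address prefix)."""
    prefix = email.sender.split("@", 1)[0].lower()
    if prefix in {"doctor", "clinic"}:
        return "appointment"
    if prefix in {"coach", "school"}:
        return "family"
    return "other"


def route(email: Email, classify: Callable[[Email], str]) -> Result:
    """Classify an email and dispatch it to the matching handler."""
    label = classify(email).strip().lower()
    try:
        category = Category(label)
    except ValueError:
        category = Category.OTHER  # assumed fallback for unexpected model output
    return Result(category, HANDLERS[category](email))
```

Passing the classifier in as a callable keeps the routing testable without a running model: a test can supply `stub_classify` or a lambda returning a fixed label, while production code would supply the real LLM-backed classifier.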