icarus-telegram-ingestion.md (Apr 26, 2026)

📱 Icarus Telegram Ingestion

Approach: Telegram bot receives images/PDFs → forwards to Icarus API → returns briefing card


User Flow

1. User receives field trip form via email
2. Screenshots or downloads PDF
3. Opens Telegram → @IcarusTestBot
4. Sends image/PDF to bot
5. Bot responds with briefing card
6. (Optional) "Add to calendar?" inline button

Implementation Options

Option A: Bot Handler (Webhook)

How it works:
- Icarus API has a Telegram webhook endpoint: POST /telegram/webhook
- Bot sends documents to this endpoint
- Icarus processes → generates briefing → replies via Telegram API

Pros:
- ✅ Single codebase
- ✅ Natural extension of existing API
- ✅ Easy to add calendar integration later

Cons:
- ❌ Slightly more complex deployment (webhook setup)
- ❌ Need Telegram bot token in Icarus config

Architecture:

User → Telegram → @IcarusTestBot → Icarus API (/telegram/webhook)
                                              ↓
                                         Vision pipeline
                                              ↓
                                         Telegram API (send briefing)

Option B: Wadsworth Proxy (Simpler, Immediate)

How it works:
- User sends document to Wadsworth (in this DM)
- Wadsworth receives, forwards to Icarus API
- Wadsworth returns briefing card to user

Pros:
- ✅ Works immediately — no webhook setup
- ✅ No bot token management in Icarus
- ✅ User already knows to message Wadsworth

Cons:
- ❌ Not "product-like" — still feels like asking the assistant
- ❌ Wadsworth becomes single point of failure
- ❌ Doesn't scale to other users (Aundrea, kids)


Recommendation: Option A (Bot Handler)

Why: It's the product path. Telegram is Icarus's natural interface.


Implementation Plan

1. Telegram Bot Config

File: icarus/core/config/staging.py (add)

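# Telegram Bot
import os  # skip if staging.py already imports it

TELEGRAM_BOT_TOKEN = os.environ.get("TELEGRAM_BOT_TOKEN")
# No hardcoded fallback: a default secret committed to source protects nothing
TELEGRAM_WEBHOOK_SECRET = os.environ.get("TELEGRAM_WEBHOOK_SECRET")
TELEGRAM_ALLOWED_USERS = ["8386527252"]  # Matt's ID for now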
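
The allowlist above is hardcoded to a single ID. As a minimal sketch of how it might be loaded from the environment once other family members are added, assuming a comma-separated TELEGRAM_ALLOWED_USERS environment variable (a hypothetical name, not part of the current config) and a hypothetical helper parse_allowed_users:

```python
import os


def parse_allowed_users(raw: str) -> frozenset[str]:
    """Parse a comma-separated list of Telegram user IDs into a set of strings.

    Blank entries and surrounding whitespace are ignored. Non-numeric entries
    raise ValueError so a typo is caught at startup instead of at runtime.
    """
    ids = set()
    for part in raw.split(","):
        part = part.strip()
        if not part:
            continue
        if not part.isdigit():
            raise ValueError(f"Invalid Telegram user ID: {part!r}")
        ids.add(part)
    return frozenset(ids)


# Hypothetical env var; falls back to the single ID from the config above.
TELEGRAM_ALLOWED_USERS = parse_allowed_users(
    os.environ.get("TELEGRAM_ALLOWED_USERS", "8386527252")
)
```

IDs stay as strings so the existing check, str(user_id) not in TELEGRAM_ALLOWED_USERS, keeps working unchanged.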

File: icarus/core/telegram/bot_handler.py (create)

"""Telegram bot handler for document ingestion."""
from fastapi import Request, HTTPException
import httpx
from icarus.core.vision.pipeline import process_attachment
from icarus.core.config.staging import TELEGRAM_BOT_TOKEN, TELEGRAM_ALLOWED_USERS

async def handle_telegram_webhook(request: Request):
    """Handle incoming Telegram updates."""
    data = await request.json()

    # Validate user
    user_id = data.get("message", {}).get("from", {}).get("id")
    if str(user_id) not in TELEGRAM_ALLOWED_USERS:
        return {"ok": True}  # Silently ignore unauthorized

    # Check for document/photo
    message = data.get("message", {})
    chat_id = message.get("chat", {}).get("id")

    # Handle photo
    if "photo" in message:
        photo = message["photo"][-1]  # Largest size
        file_id = photo["file_id"]
        await process_and_reply(file_id, chat_id, "image")

    # Handle document (PDF, etc.)
    elif "document" in message:
        doc = message["document"]
        file_id = doc["file_id"]
        mime_type = doc.get("mime_type") or ""
        # Compare case-insensitively so "SCAN.PDF" is accepted too
        file_name = (doc.get("file_name") or "").lower()
        if "pdf" in mime_type or file_name.endswith(".pdf"):
            await process_and_reply(file_id, chat_id, "pdf")
        else:
            await send_message(chat_id, "📄 Please send PDF or image files only.")

    return {"ok": True}

async def process_and_reply(file_id: str, chat_id: int, file_type: str):
    """Download file, process through vision, send briefing."""
    # 1. Get file from Telegram
    async with httpx.AsyncClient() as client:
        # Get file path
        resp = await client.get(
            f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/getFile",
            params={"file_id": file_id}
        )
        resp.raise_for_status()  # Fail loudly instead of with a KeyError below
        file_path = resp.json()["result"]["file_path"]

        # Download file
        file_resp = await client.get(
            f"https://api.telegram.org/file/bot{TELEGRAM_BOT_TOKEN}/{file_path}"
        )
        file_resp.raise_for_status()
        content = file_resp.content

    # 2. Process through vision pipeline
    email_meta = {
        "from": f"telegram:{chat_id}",
        "subject": f"Telegram {file_type}",
        "date": "now",
        "to": "family"
    }

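    # Telegram serves compressed photos as JPEG; use a real extension in case
    # the pipeline infers the file type from the filename ("upload.image" is not one)
    extension = "jpg" if file_type == "image" else file_type
    briefing = await process_attachment(email_meta, content, f"upload.{extension}")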

    # 3. Format and send response
    text = format_briefing(briefing)
    await send_message(chat_id, text)

async def send_message(chat_id: int, text: str):
    """Send message via Telegram Bot API."""
    async with httpx.AsyncClient() as client:
        await client.post(
            f"https://api.telegram.org/bot{TELEGRAM_BOT_TOKEN}/sendMessage",
            json={"chat_id": chat_id, "text": text, "parse_mode": "Markdown"}
        )

def format_briefing(briefing: dict) -> str:
    """Format briefing card for Telegram."""
    return f"""📋 *{briefing['title']}*

{briefing['summary']}

*Key Details:*
{chr(10).join(f"• {k}: {v}" for k, v in briefing.get('key_details', {}).items())}

*Suggested Actions:*
{chr(10).join(f"• {a}" for a in briefing.get('suggested_actions', []))}
"""

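Two details of the handler above may need attention before real documents flow through it. send_message uses the legacy Markdown parse mode, so a stray _ or * in a summary or key detail can make Telegram reject the message, and sendMessage caps text at 4096 characters. The following is a sketch of two helpers that could address this; the names escape_markdown and split_message are hypothetical, and neither is wired into format_briefing here.

```python
import re

TELEGRAM_MAX_MESSAGE_LEN = 4096  # Bot API limit for sendMessage text

# Characters with special meaning in Telegram's legacy "Markdown" parse mode
_MARKDOWN_SPECIALS = re.compile(r"([_*`\[])")


def escape_markdown(value: object) -> str:
    """Backslash-escape legacy-Markdown specials in a dynamic value.

    Only valid for text placed OUTSIDE bold/italic spans: the legacy mode
    does not allow escaping inside an entity.
    """
    return _MARKDOWN_SPECIALS.sub(r"\\\1", str(value))


def split_message(text: str, limit: int = TELEGRAM_MAX_MESSAGE_LEN) -> list[str]:
    """Split text into chunks of at most `limit` characters, preferring line breaks."""
    chunks = []
    while len(text) > limit:
        cut = text.rfind("\n", 0, limit)
        if cut <= 0:
            cut = limit  # no usable newline in range: hard split
        chunks.append(text[:cut])
        text = text[cut:].lstrip("\n")
    if text:
        chunks.append(text)
    return chunks
```

Under these assumptions, the summary and the key-detail values would pass through escape_markdown before interpolation, and send_message would be called once per chunk. The bold title is a separate case: because it sits inside *...*, it would need its special characters stripped, or a move to MarkdownV2 or HTML parse mode. A split can also land between a pair of formatting markers, so chunking is safest on plain or already-validated text.
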
2. API Endpoint

File: icarus/core/api.py (add)

from icarus.core.telegram.bot_handler import handle_telegram_webhook

@app.post("/telegram/webhook")
async def telegram_webhook(request: Request):
    """Receive Telegram bot updates."""
    return await handle_telegram_webhook(request)

3. Webhook Setup

One-time setup:

# Set webhook to point to Icarus
# Run locally or via curl:

# Read the token from the environment (e.g. staging.env) instead of pasting it here
TELEGRAM_TOKEN="${TELEGRAM_BOT_TOKEN:?set TELEGRAM_BOT_TOKEN first}"
WEBHOOK_URL="https://icarus-test.hoffdesk.com/telegram/webhook"

curl -X POST "https://api.telegram.org/bot${TELEGRAM_TOKEN}/setWebhook" \
    -d "url=${WEBHOOK_URL}"

# Verify
curl "https://api.telegram.org/bot${TELEGRAM_TOKEN}/getWebhookInfo"
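
The curl call above registers only the URL. As a sketch of how the same request might also register the webhook secret and limit which update types Telegram delivers, the helper below builds the setWebhook form parameters. secret_token and allowed_updates are Bot API parameters; the function name build_set_webhook_params is an assumption for illustration.

```python
import json
import re

# Bot API: secret_token is 1-256 characters from A-Z, a-z, 0-9, _ and -
_SECRET_TOKEN_RE = re.compile(r"[A-Za-z0-9_-]{1,256}")


def build_set_webhook_params(webhook_url: str, secret: str) -> dict[str, str]:
    """Build form parameters for the Bot API setWebhook method."""
    if not webhook_url.startswith("https://"):
        raise ValueError("Telegram webhooks require an HTTPS URL")
    if not _SECRET_TOKEN_RE.fullmatch(secret):
        raise ValueError("secret_token must be 1-256 chars of A-Z, a-z, 0-9, _ or -")
    return {
        "url": webhook_url,
        "secret_token": secret,
        # The handler only reads "message" updates, so ask Telegram to skip the rest
        "allowed_updates": json.dumps(["message"]),
    }
```

The resulting dict can be sent as form data, for example with httpx.post(f"https://api.telegram.org/bot{token}/setWebhook", data=params). Telegram then includes the secret in the X-Telegram-Bot-Api-Secret-Token header of every webhook request.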

4. Security

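  • Webhook endpoint validates the secret token that Telegram sends in the X-Telegram-Bot-Api-Secret-Token header; Telegram does not sign updates, so this requires passing secret_token to setWebhook (optional for MVP)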
  • Only allowlisted user IDs can trigger processing
  • Bot token stored in staging.env (already there)
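
A minimal sketch of the webhook check from the first bullet, assuming the secret was registered through the secret_token parameter of setWebhook. The helper name is_valid_webhook_secret is hypothetical; the header name is the one Telegram uses.

```python
import hmac
from typing import Optional

SECRET_HEADER = "X-Telegram-Bot-Api-Secret-Token"


def is_valid_webhook_secret(header_value: Optional[str], expected: Optional[str]) -> bool:
    """Return True only if the request header matches the configured secret.

    Fails closed: a missing header or an unset secret counts as invalid.
    """
    if not header_value or not expected:
        return False
    # compare_digest avoids leaking the mismatch position through timing
    return hmac.compare_digest(header_value.encode(), expected.encode())
```

In handle_telegram_webhook this could run before the body is parsed, for instance by reading request.headers.get(SECRET_HEADER) and raising HTTPException(status_code=403) when the check fails.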

Alternative: Polling (No Webhook)

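If webhook setup is problematic, the bot can poll Telegram instead:

import asyncio

# In a background task or separate process.
# get_updates / handle_update are placeholder wrappers, not defined here.
async def poll_forever():
    offset = None
    while True:
        updates = await get_updates(offset=offset)
        for update in updates:
            await handle_update(update)
            offset = update["update_id"] + 1  # acknowledge, so it is not resent
        await asyncio.sleep(5)

Cons: Requires a persistent process, which is not ideal for an API server. getUpdates also does not work while a webhook is set, so the two modes cannot run side by side.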


User Experience

User: [sends photo of field trip form]

Icarus Bot:
📋 *Field Trip: Museum of Science*

Sully's class is visiting the Museum of Science on May 15, 2026...

*Key Details:*
• date: 2026-05-15
• time: 9:00 AM - 2:00 PM
• location: Museum of Science
• cost: $15
• action_required: Permission slip due May 8

*Suggested Actions:*
• Sign permission slip
• Pack lunch (or buy $8)
• Set reminder for 8:30 AM departure

[Add to Calendar] [Mark Done]
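
The two buttons in the mock-up above map to a Telegram inline keyboard. Here is a sketch of the reply_markup payload that could produce them, assuming each briefing has some short identifier; briefing_id, the cal:/done: prefixes, and the function name are all hypothetical, since the briefing structure shown earlier has no ID field.

```python
def build_briefing_keyboard(briefing_id: str) -> dict:
    """Build a reply_markup dict with one row of two inline buttons."""
    buttons = [
        {"text": "Add to Calendar", "callback_data": f"cal:{briefing_id}"},
        {"text": "Mark Done", "callback_data": f"done:{briefing_id}"},
    ]
    for button in buttons:
        # Bot API limits callback_data to 1-64 bytes
        size = len(button["callback_data"].encode("utf-8"))
        if not 1 <= size <= 64:
            raise ValueError("callback_data must be 1-64 bytes")
    # inline_keyboard is a list of rows; each row is a list of buttons
    return {"inline_keyboard": [buttons]}
```

The dict would go into the sendMessage JSON payload under the reply_markup key. Button taps arrive as callback_query updates, not message updates, so the webhook handler would need an extra branch (and an answerCallbackQuery call) before the buttons do anything.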

Decision

| Approach | Time | Product Feel | Effort |
| --- | --- | --- | --- |
| A. Bot Handler (webhook) | 2-3 hours | ⭐⭐⭐⭐⭐ | Medium |
| B. Wadsworth Proxy | 30 min | ⭐⭐ | Low |

Recommendation: Option A — Telegram bot is the natural interface for document ingestion.

Your call: Implement bot handler, or start with Wadsworth proxy for immediate testing?