Chandra 2.0 OCR — Assessment for HoffDesk

Date: 2026-04-21
Source: datalab-to/chandra (GitHub), Apache 2.0 + OpenRAIL-M
Researcher: Wadsworth 📋


Executive Summary

Verdict: High potential, deferred implementation.

Chandra 2.0 is a state-of-the-art open-source OCR model that significantly outperforms alternatives on complex documents. However, it requires substantial GPU resources (H100-class for production throughput) and has unclear immediate value for HoffDesk's current workflows.


What It Does Exceptionally Well

| Capability | Performance | Notes |
|---|---|---|
| Complex tables | 89.9% (olmOCR benchmark) | Best-in-class for financial/scientific tables |
| Math/STEM | 89.3% | LaTeX-aware extraction |
| Handwriting | Strong | Cursive, mixed print/cursive |
| Multilingual | 77.8% avg (90 languages) | Beats Gemini 2.5 Flash |
| Layout preservation | Native | Outputs Markdown + HTML + JSON with structure |
| Forms/checkboxes | Reconstructs accurately | Critical for scanned documents |
| Old scans | 49.8% (best available) | Handles degraded historical docs |

Key differentiator: Unlike simple OCR, Chandra preserves document structure — columns, tables, headers, captions, and reading order.


Technical Requirements

| Mode | Requirements | Throughput | Best For |
|---|---|---|---|
| vLLM Server | H100 80GB (or equivalent) | 1.44 pages/sec | Production batch processing |
| HuggingFace | 24GB+ VRAM (RTX 3090/4090) | Slower | Local development, small batches |
| Hosted API | API key ($) | Fastest | Evaluation, low-volume needs |

HoffDesk Fit Analysis

Current Workflows — Low Match

| Workflow | OCR Need? | Chandra Value |
|---|---|---|
| Family calendar | No | None (structured data from APIs) |
| Blog content | No | None (Markdown authoring) |
| Email processing | Minimal | Could extract PDF attachments, but rare |
| Agent coordination | No | Text-based already |

Potential Future Workflows — High Match

| Use Case | Value | Priority |
|---|---|---|
| Document archive | Convert scanned family docs → searchable Markdown | Medium |
| Receipt/expense extraction | Table extraction from photos | Medium |
| Historical photo OCR | Extract text from old scanned photos | Low |
| Kids' homework help | Math problem extraction from photos | Low |
| Mail digitization | Scan postal mail → structured data | Low |
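
To make "searchable Markdown" concrete, here is a minimal sketch of a keyword search over a folder of converted documents. It assumes the OCR output has been saved as `.md` files somewhere under one directory; that layout and the `search_markdown` helper are hypothetical, not something Chandra provides.

```python
from pathlib import Path


def search_markdown(root: Path, term: str) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line text) for every line containing `term`.

    Matching is case-insensitive. Only files ending in .md are scanned.
    """
    needle = term.casefold()
    hits: list[tuple[Path, int, str]] = []
    for md_file in sorted(root.rglob("*.md")):
        # errors="replace" keeps the search going if a file has bad bytes
        text = md_file.read_text(encoding="utf-8", errors="replace")
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                hits.append((md_file, line_no, line.strip()))
    return hits
```

Calling `search_markdown(Path("archive"), "warranty")` on a hypothetical `archive` folder would list each matching file and line. A plain substring scan like this is likely enough for a family-sized archive; a real index only matters at much larger volumes.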

Integration Path (If Pursued)

Option 1: Hosted API (Immediate)

  • Effort: Minimal — REST API calls
  • Cost: ~$0.01/page at volume
  • Pros: No infrastructure, best accuracy
  • Cons: External dependency, ongoing cost, data leaves home

Option 2: Self-Hosted GPU Server

  • Effort: High — requires GPU server setup
  • Hardware: Need dedicated inference box (H100 or 2x A6000)
  • Cost: $3K–$15K hardware or cloud GPU
  • Pros: Sovereign, no external data
  • Cons: Capital expense, power, maintenance

Option 3: Gaming PC (RTX 3080 Ti) — Marginal

  • Effort: Medium — HuggingFace backend
  • Throughput: ~0.3 pages/sec (estimated)
  • Pros: Uses existing hardware
  • Cons: Slow for batch jobs, blocks gaming use, and the card's 12GB of VRAM is below the 24GB+ listed for the HuggingFace backend
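
To put the throughput figures in perspective, the sketch below converts pages per second into wall-clock time for a batch. The two rates are the ones quoted in this assessment (the Gaming PC figure is itself an estimate); the 1,000-page archive is a hypothetical volume chosen for illustration.

```python
# Throughput figures quoted in this assessment, in pages per second.
THROUGHPUT_PAGES_PER_SEC = {
    "vllm_h100": 1.44,      # vLLM server on an H100 80GB
    "rtx_3080_ti_hf": 0.3,  # estimated, HuggingFace backend on the Gaming PC
}


def batch_minutes(pages: int, pages_per_sec: float) -> float:
    """Return wall-clock minutes to process `pages` at a steady rate."""
    if pages_per_sec <= 0:
        raise ValueError("pages_per_sec must be positive")
    return pages / pages_per_sec / 60


if __name__ == "__main__":
    archive_pages = 1_000  # hypothetical batch size
    for name, rate in THROUGHPUT_PAGES_PER_SEC.items():
        print(f"{name}: {batch_minutes(archive_pages, rate):.1f} min")
```

At these rates a 1,000-page batch takes roughly 12 minutes on the H100 and under an hour on the Gaming PC, so the slower option may still be acceptable for an occasional overnight job.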

Competitive Landscape

| Model | Accuracy | Speed | Cost | Self-Hosted |
|---|---|---|---|---|
| Chandra 2 | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | $$$ | |
| olmOCR 2 | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | Free | |
| Mistral OCR | ⭐⭐⭐ | ⭐⭐⭐⭐⭐ | $ | |
| GPT-4o Vision | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ | $$ | |
| Tesseract | ⭐⭐ | ⭐⭐⭐⭐⭐ | Free | |

Chandra wins on accuracy for complex documents. olmOCR 2 is an open-source alternative with a better speed/accuracy tradeoff for simpler documents.


Recommendation

Defer implementation.

Chandra 2.0 is impressive technology without a clear immediate use case for HoffDesk. Current workflows don't involve document scanning or complex OCR needs.

Revisit when:
- The document archive workflow becomes a priority
- Receipt/expense tracking enters the roadmap
- Kids start bringing home complex math worksheets that need digitizing

If revisiting:
1. Start with hosted API for evaluation
2. Measure actual usage patterns
3. Justify local GPU investment only if volume warrants
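
The "only if volume warrants" test in step 3 can be made concrete with a rough break-even calculation. The sketch below uses the figures from this assessment (~$0.01/page hosted, $3K–$15K hardware). It deliberately ignores power and maintenance, and the 500 pages/month volume is a hypothetical placeholder until real usage is measured.

```python
def break_even_pages(hardware_cost_usd: int, api_cents_per_page: int = 1) -> int:
    """Pages at which cumulative hosted-API spend equals the hardware cost.

    Works in integer cents to avoid floating-point rounding surprises.
    """
    if api_cents_per_page <= 0:
        raise ValueError("api_cents_per_page must be positive")
    hardware_cents = hardware_cost_usd * 100
    return -(-hardware_cents // api_cents_per_page)  # ceiling division


def months_to_break_even(hardware_cost_usd: int, pages_per_month: int,
                         api_cents_per_page: int = 1) -> float:
    """Months of steady usage before local hardware pays for itself."""
    if pages_per_month <= 0:
        raise ValueError("pages_per_month must be positive")
    return break_even_pages(hardware_cost_usd, api_cents_per_page) / pages_per_month


if __name__ == "__main__":
    pages_per_month = 500  # hypothetical household volume
    for cost in (3_000, 15_000):  # hardware range quoted above
        pages = break_even_pages(cost)
        months = months_to_break_even(cost, pages_per_month)
        print(f"${cost:,}: {pages:,} pages, about {months:,.0f} months")
```

Under these assumptions, even the low end of the hardware range only pays for itself after 300,000 pages, which is the main reason to start with the hosted API.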


Quick Test (Optional)

If curious, install it on the Gaming PC for single-page tests:

    pip install "chandra-ocr[hf]"
    chandra ~/test-receipt.jpg ./output --method hf

This validates quality without infrastructure commitment.
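
If the quick test gets repeated across a handful of sample pages, a thin Python wrapper around the same command may be convenient. This is a sketch only: it reuses the exact arguments from the command above, and the helper names (`build_chandra_command`, `run_quick_test`) are hypothetical. It defaults to a dry run, so nothing is executed unless requested.

```python
import shlex
import subprocess
from pathlib import Path


def build_chandra_command(image: Path, output_dir: Path, method: str = "hf") -> list[str]:
    """Build the argument list for the quick-test command shown above."""
    # expanduser() resolves a leading "~", which subprocess would not expand
    return ["chandra", str(image.expanduser()), str(output_dir.expanduser()),
            "--method", method]


def run_quick_test(image: Path, output_dir: Path, dry_run: bool = True) -> list[str]:
    """Print the command, and run it only when dry_run is False."""
    cmd = build_chandra_command(image, output_dir)
    print(shlex.join(cmd))
    if not dry_run:
        output_dir.expanduser().mkdir(parents=True, exist_ok=True)
        # check=True raises CalledProcessError if the CLI exits non-zero
        subprocess.run(cmd, check=True)
    return cmd
```

For example, `run_quick_test(Path("~/test-receipt.jpg"), Path("./output"))` prints the command it would run; passing `dry_run=False` executes it, which requires the `chandra` CLI to be installed.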


Assessment complete. Technology filed for future reference.