Chandra 2.0 OCR — Assessment for HoffDesk

Date: 2026-04-21
Source: datalab-to/chandra (GitHub); code under Apache 2.0, model weights under a modified OpenRAIL-M license
Researcher: Wadsworth 📋

Executive Summary

Verdict: High potential, deferred implementation.

Chandra 2.0 is a state-of-the-art open-source OCR model that significantly outperforms alternatives on complex documents. However, it requires substantial GPU resources (H100-class for production throughput), and none of HoffDesk's current workflows has a clear immediate need for it.

What It Does Exceptionally Well

Capability	Performance	Notes
Complex tables	89.9% (olmocr benchmark)	Best-in-class for financial/scientific tables
Math/STEM	89.3%	LaTeX-aware extraction
Handwriting	Strong	Cursive, mixed print/cursive
Multilingual	77.8% avg (90 languages)	Beats Gemini 2.5 Flash
Layout preservation	Native	Outputs Markdown + HTML + JSON with structure
Forms/checkboxes	Reconstructs accurately	Critical for scanned documents
Old scans	49.8% (best available)	Handles degraded historical docs

Key differentiator: Unlike simple OCR, Chandra preserves document structure — columns, tables, headers, captions, and reading order.

Technical Requirements

Mode	Requirements	Throughput	Best For
vLLM Server	H100 80GB (or equivalent)	1.44 pages/sec	Production batch processing
HuggingFace	24GB+ VRAM (RTX 3090/4090)	Slower	Local development, small batches
Hosted API	API key ($)	Fastest	Evaluation, low-volume needs

HoffDesk Fit Analysis

Current Workflows — Low Match

Workflow	OCR Need?	Chandra Value
Family calendar	No	None — structured data from APIs
Blog content	No	None — Markdown authoring
Email processing	Minimal	Could extract PDF attachments, but rare
Agent coordination	No	Text-based already

Potential Future Workflows — High Match

Use Case	Value	Priority
Document archive	Convert scanned family docs → searchable Markdown	Medium
Receipt/expense extraction	Table extraction from photos	Medium
Historical photo OCR	Extract text from old scanned photos	Low
Kids' homework help	Math problem extraction from photos	Low
Mail digitization	Scan postal mail → structured data	Low
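
As an illustration of what "searchable Markdown" could look like for the document archive use case, here is a minimal sketch of a keyword search over a folder of converted files. It assumes only that each scanned document has already been converted to a `.md` file somewhere under one directory. The `archive` folder, the function name, and the sample query are hypothetical, and nothing here depends on Chandra's actual output layout.

```python
from pathlib import Path


def search_archive(root: Path, term: str) -> list[tuple[Path, int, str]]:
    """Return (file, line number, line) for every case-insensitive match."""
    needle = term.casefold()
    hits = []
    for md_file in sorted(root.rglob("*.md")):
        # errors="replace" keeps the search going if a file has bad bytes
        text = md_file.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            if needle in line.casefold():
                hits.append((md_file, lineno, line.strip()))
    return hits


if __name__ == "__main__":
    # "archive" and "warranty" are placeholder values for illustration
    for path, lineno, line in search_archive(Path("archive"), "warranty"):
        print(f"{path}:{lineno}: {line}")
```

A plain substring scan like this is enough for a household-sized archive; a proper index would only matter at much larger volumes.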

Integration Path (If Pursued)

Option 1: Hosted API (Immediate)

Effort: Minimal — REST API calls
Cost: ~$0.01/page at volume
Pros: No infrastructure, best accuracy
Cons: External dependency, ongoing cost, data leaves home

Option 2: Local vLLM (Beelink insufficient)

Effort: High — requires GPU server setup
Hardware: Need dedicated inference box (H100 or 2x A6000)
Cost: $3K–$15K hardware or cloud GPU
Pros: Sovereign, no external data
Cons: Capital expense, power, maintenance

Option 3: Gaming PC (RTX 3080 Ti) — Marginal

Effort: Medium — HuggingFace backend
Throughput: ~0.3 pages/sec (estimated)
Pros: Uses existing hardware
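Cons: Slow for batch jobs, blocks gaming use; the RTX 3080 Ti's 12 GB of VRAM is below the 24GB+ listed above for the HuggingFace mode, so the model may not fit without quantization or offloading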
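
To put the throughput figures in perspective, the following rough sketch converts them into wall-clock time for a batch job. The 1.44 pages/sec and ~0.3 pages/sec rates are the ones quoted in this assessment (the latter is itself an estimate). The 5,000-page archive size and the function name are hypothetical.

```python
def batch_hours(pages: int, pages_per_sec: float) -> float:
    """Wall-clock hours needed to OCR `pages` at a sustained rate."""
    if pages_per_sec <= 0:
        raise ValueError("pages_per_sec must be positive")
    return pages / pages_per_sec / 3600


ARCHIVE_PAGES = 5_000  # hypothetical archive size, not a measured figure

# Rates taken from this assessment; the second one is an estimate.
RATES = {
    "vLLM on H100": 1.44,
    "HuggingFace on RTX 3080 Ti (estimated)": 0.3,
}

for label, rate in RATES.items():
    print(f"{label}: {batch_hours(ARCHIVE_PAGES, rate):.1f} h")
```

With these inputs the sketch reports roughly 1.0 h for the H100 setup and 4.6 h for the gaming PC, so a one-off archive conversion is an overnight job at worst, provided the model runs on that card at all.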

Competitive Landscape

Model	Accuracy	Speed	Cost	Self-Hosted
Chandra 2	⭐⭐⭐⭐⭐	⭐⭐⭐	$$$	✅
olmOCR 2	⭐⭐⭐⭐	⭐⭐⭐⭐	Free	✅
Mistral OCR	⭐⭐⭐	⭐⭐⭐⭐⭐	$	❌
GPT-4o Vision	⭐⭐⭐⭐	⭐⭐⭐⭐⭐	$$	❌
Tesseract	⭐⭐	⭐⭐⭐⭐⭐	Free	✅

Chandra wins on accuracy for complex documents. olmOCR 2 is the open-source alternative with a better speed/accuracy tradeoff for simpler documents.

Recommendation

Defer implementation.

Chandra 2.0 is impressive technology without a clear immediate use case for HoffDesk. Current workflows don't involve document scanning or complex OCR needs.

Revisit when:
- Document archive workflow becomes priority
- Receipt/expense tracking enters roadmap
- Kids start bringing home complex math worksheets requiring digitization

If revisiting:
1. Start with hosted API for evaluation
2. Measure actual usage patterns
3. Justify local GPU investment only if volume warrants
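
The volume threshold in step 3 can be estimated with simple arithmetic. This is a minimal sketch using the ~$0.01/page hosted price and the $3K–$15K hardware range quoted in this assessment. It ignores power, maintenance, and cloud-GPU rental, so the result is a lower bound on the break-even volume. The 200 pages/month household volume is a hypothetical placeholder to be replaced with measured usage from step 2.

```python
def break_even_pages(hardware_usd: int, api_cents_per_page: int) -> int:
    """Pages at which cumulative hosted-API spend equals the hardware cost."""
    if api_cents_per_page <= 0:
        raise ValueError("api_cents_per_page must be positive")
    total_cents = hardware_usd * 100
    # Ceiling division in integers avoids floating-point rounding surprises
    return -(-total_cents // api_cents_per_page)


def years_to_break_even(pages: int, pages_per_month: int) -> float:
    """Years needed to reach `pages` at a steady monthly volume."""
    return pages / pages_per_month / 12


API_CENTS_PER_PAGE = 1   # ~$0.01/page, from this assessment
PAGES_PER_MONTH = 200    # hypothetical household volume

for hardware_usd in (3_000, 15_000):
    pages = break_even_pages(hardware_usd, API_CENTS_PER_PAGE)
    years = years_to_break_even(pages, PAGES_PER_MONTH)
    print(f"${hardware_usd:,} hardware: {pages:,} pages, "
          f"about {years:.0f} years at {PAGES_PER_MONTH} pages/month")
```

Even the $3K end of the range needs 300,000 pages to pay for itself, which is 125 years at the placeholder volume. Under these assumptions the case for local hardware would have to rest on data sovereignty, not on cost.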

Quick Test (Optional)

For a quick look at output quality, install it on the Gaming PC and run a single-page test:

pip install "chandra-ocr[hf]"
chandra ~/test-receipt.jpg ./output --method hf

This validates quality without infrastructure commitment.
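
To repeat the test over several images, the command can be wrapped in a short script. This is a sketch only: it shells out to the `chandra` CLI with the same argument shape used in this section (`<input> <output_dir> --method hf`) and assumes nothing else about the tool. The helper names and the `~/test-scans` folder are hypothetical.

```python
import subprocess
from pathlib import Path


def chandra_command(image: Path, out_dir: Path, method: str = "hf") -> list[str]:
    """Build the CLI invocation for one input file."""
    return ["chandra", str(image), str(out_dir), "--method", method]


def run_quick_test(images: list[Path], out_dir: Path) -> None:
    """Run the CLI once per image, stopping at the first failure."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for image in images:
        # check=True raises CalledProcessError if the CLI exits non-zero
        subprocess.run(chandra_command(image, out_dir), check=True)


if __name__ == "__main__":
    # ~/test-scans is a placeholder; an empty or missing folder is a no-op
    scans = sorted(Path("~/test-scans").expanduser().glob("*.jpg"))
    run_quick_test(scans, Path("./output"))
```

Running one process per image keeps the sketch independent of whatever batch options the CLI may offer, at the cost of reloading the model each time.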

Assessment complete. Technology filed for future reference.