Experiments — Shadow Mode Testing
This directory is for parallel testing of novel architectures, alternative
pipelines, and LLM swaps. Experiments live alongside production code but
never touch the live data directory (~/.icarus/).
Rules
- Same interface. Every experiment must present the same class/data
  contract as the production module it shadows. No schema drift.
- Eval mode. Each experiment has a compare_to_production() method
  that runs both implementations and logs diffs. Results go to
  experiments/<name>/eval_results/.
- Config-gated. Experiments are enabled via config.yaml flags.
  Never on by default. Never writing to ~/.icarus/.
- No auto-promote. Experiment outputs are for evaluation only.
  Promotion to production requires explicit copy + test pass.
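The config-gating and "never write to ~/.icarus/" rules can be enforced with two small guards. The following is a minimal sketch, not the project's actual code: the helper names (experiment_enabled, assert_outside_live_data) and the assumed config layout (experiments -> <name> -> enabled) are hypothetical, and the config is taken as an already-parsed dict, so no particular YAML loader is implied.

```python
from pathlib import Path

# Live data directory named in this README; experiments must never write here.
LIVE_DATA_DIR = Path.home() / ".icarus"


def experiment_enabled(config: dict, name: str) -> bool:
    """Return True only when the flag is explicitly set to True.

    Assumed (hypothetical) layout of the parsed config.yaml:
        {"experiments": {"<name>": {"enabled": True}}}
    Anything missing or malformed counts as disabled ("never on by default").
    """
    flags = config.get("experiments")
    if not isinstance(flags, dict):
        return False
    entry = flags.get(name)
    return isinstance(entry, dict) and entry.get("enabled") is True


def assert_outside_live_data(path: Path) -> Path:
    """Raise ValueError if path is the live data directory or lies below it."""
    resolved = path.expanduser().resolve()
    live = LIVE_DATA_DIR.resolve()
    if resolved == live or live in resolved.parents:
        raise ValueError(f"experiments must not write to {live}: {resolved}")
    return resolved
```

An adapter could call experiment_enabled() before running and pass every output path through assert_outside_live_data() before opening it for writing.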
Active Experiments
(None currently — add entries here when starting a new experiment)
How to Add a New Experiment
experiments/my_thing/
├── adapter.py ← Drop-in replacement for production module
├── README.md ← What's being tested and how to eval
└── eval_results/ ← Comparison outputs (gitignored)
The adapter must match the production module's return types exactly.
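As an illustration of what an adapter.py might contain, here is a minimal sketch. Everything in it is an assumption made for the example: the stand-in Extractor class and its extract() method, the ComparisonResult type, the MyThingExtractor name, and the diffs.jsonl file name. The real production interface may differ; the point is only that the adapter returns the same types as the module it shadows and that compare_to_production() runs both and logs the outcome.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any


class Extractor:
    """Stand-in for the production module (hypothetical behavior)."""

    def extract(self, text: str) -> dict:
        return {"text": text, "tokens": text.split()}


@dataclass
class ComparisonResult:
    """Holds both outputs so the caller can keep using the production one."""

    production: Any
    experiment: Any

    @property
    def matches(self) -> bool:
        return self.production == self.experiment

    def summary(self) -> str:
        return "match" if self.matches else "diff"


class MyThingExtractor(Extractor):
    """Experimental adapter: same method names and return types as Extractor."""

    def __init__(self, results_dir: Path):
        # Expected to point at experiments/<name>/eval_results/.
        self.results_dir = results_dir

    def extract(self, text: str) -> dict:
        # Illustrative experimental change: lowercase before tokenizing.
        return {"text": text, "tokens": text.lower().split()}

    def compare_to_production(self, text: str) -> ComparisonResult:
        # Run both implementations on the same input.
        result = ComparisonResult(Extractor().extract(text), self.extract(text))
        # Append one JSON line per comparison to the eval results directory.
        self.results_dir.mkdir(parents=True, exist_ok=True)
        record = {
            "input": text,
            "production": result.production,
            "experiment": result.experiment,
            "matches": result.matches,
        }
        with open(self.results_dir / "diffs.jsonl", "a", encoding="utf-8") as fh:
            fh.write(json.dumps(record) + "\n")
        return result
```

Keeping the production output on the result object means a caller can log the diff and still act on the production value, which matches the "no auto-promote" rule.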
How to Run Comparison Mode
# Run both prod and experiment, log diffs, use prod result
cd ~/icarus && python3 -c "
from experiments.thoth.adapter import ThothExtractor
from icarus.extractor import Extractor
prod = Extractor()
exp = ThothExtractor()
result = exp.compare_to_production('Sully has a dentist appointment at 3pm')
print(result.summary())
"