Experiments¶

The emic.experiments module provides a framework for running reproducible experiments to evaluate inference algorithms against known ground truth.

Overview¶

The experimentation framework supports:

Multiple algorithms: CSSR, Spectral, CSM, BSI
Canonical processes: Even Process, Golden Mean, Biased Coin
Experiment types: Accuracy, convergence, scalability
Parallel execution: Run experiments across multiple workers
Structured output: JSON/Parquet results with metadata
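
To make "known ground truth" concrete, the following is a minimal sketch of one canonical process, the Even Process, written without emic. It assumes the textbook two-state version with a fair choice in state A (the library's own parametrization may differ), and the function names `even_process` and `interior_one_runs` are illustrative, not part of emic. The defining property it demonstrates is that every run of 1s enclosed by 0s has even length.

```python
import random

def even_process(n, seed=0):
    """Generate n symbols from the Even Process (textbook p = 1/2 version)."""
    rng = random.Random(seed)
    state = "A"
    symbols = []
    for _ in range(n):
        if state == "A":
            # From A: emit 0 and stay in A, or emit 1 and move to B (each w.p. 1/2).
            if rng.random() < 0.5:
                symbols.append(0)
            else:
                symbols.append(1)
                state = "B"
        else:
            # From B: always emit 1 and return to A.
            symbols.append(1)
            state = "A"
    return symbols

def interior_one_runs(symbols):
    """Lengths of runs of 1s that have a 0 on both sides."""
    pieces = "".join(map(str, symbols)).split("0")
    # The first and last pieces touch the sample boundary and may be truncated.
    return [len(p) for p in pieces[1:-1] if p]

runs = interior_one_runs(even_process(10_000))
print(all(length % 2 == 0 for length in runs))  # True
```

An inference algorithm that recovers this structure should report two states for such data, which is the kind of check the accuracy experiment performs.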

Quick Start¶

Command-Line Interface¶

# Run all experiments
emic-experiment --all

# Run a specific experiment
emic-experiment accuracy

# Parallel execution (4 workers)
emic-experiment --all --parallel 4

# Quick mode (skip slow algorithms, reduced sample sizes)
emic-experiment --quick

# Run only specific algorithms
emic-experiment accuracy --algorithms cssr,spectral

# List available experiments
emic-experiment --list

Programmatic Usage¶

from emic.experiments import ExperimentRunner
from emic.experiments.config import DEFAULT_EXPERIMENTS

runner = ExperimentRunner()
results = runner.run_experiment(DEFAULT_EXPERIMENTS["accuracy"])

Experiment Types¶

Accuracy¶

Measures how well each algorithm recovers the true number of states and complexity measures on canonical processes.

Configuration:

- Algorithms: CSSR, Spectral, CSM, BSI
- Processes: Even Process, Golden Mean, Biased Coin
- Sample sizes: 1000, 5000, 10000
- Repetitions: 1

Metrics:

- state_count: Number of inferred states
- cmu: Statistical complexity
- hmu: Entropy rate
- duration_s: Inference time
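
As a worked example of what the cmu and hmu ground-truth values mean, the sketch below computes both for the Golden Mean process from its two-state machine. It uses the standard definitions (statistical complexity as the Shannon entropy of the stationary state distribution, entropy rate as the state-averaged entropy of the outgoing transition probabilities) and assumes the textbook p = 1/2 parametrization. It does not use emic, and all names are illustrative.

```python
from math import log2

# Golden Mean process, textbook p = 1/2 version (an assumption, not read from emic).
# transitions[state] = list of (probability, emitted symbol, next state)
transitions = {
    "A": [(0.5, 1, "A"), (0.5, 0, "B")],
    "B": [(1.0, 1, "A")],
}

def stationary_distribution(transitions, iterations=200):
    """Approximate the stationary state distribution by repeated propagation."""
    states = list(transitions)
    pi = {s: 1.0 / len(states) for s in states}
    for _ in range(iterations):
        nxt = {s: 0.0 for s in states}
        for s, edges in transitions.items():
            for p, _symbol, target in edges:
                nxt[target] += pi[s] * p
        pi = nxt
    return pi

def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

def statistical_complexity(transitions):
    return entropy(stationary_distribution(transitions).values())

def entropy_rate(transitions):
    pi = stationary_distribution(transitions)
    return sum(pi[s] * entropy(p for p, _, _ in edges)
               for s, edges in transitions.items())

print(round(statistical_complexity(transitions), 4))  # 0.9183
print(round(entropy_rate(transitions), 4))            # 0.6667
```

The stationary distribution here is (2/3, 1/3), so the statistical complexity is H(2/3, 1/3) ≈ 0.918 bits and the entropy rate is 2/3 bit per symbol.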

Convergence¶

Measures how accuracy changes as the sample size grows.

Configuration:

- Algorithms: CSSR, Spectral, CSM, BSI
- Processes: Even Process, Golden Mean
- Sample sizes: 100, 500, 1000, 2000, 5000, 10000, 20000
- Repetitions: 5

Scalability¶

Measures runtime scaling with data size.

Configuration:

- Algorithms: CSSR, Spectral, CSM, BSI
- Processes: Even Process
- Sample sizes: 1000, 2000, 5000, 10000, 20000, 50000
- Repetitions: 3
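
One common way to summarize runtime scaling is to fit a straight line to log(duration) against log(sample size); the slope estimates the exponent k in duration ∝ N^k. The sketch below is a generic illustration of that idea and is not taken from emic. The helpers `scaling_exponent` and `time_call` are hypothetical names, and the durations in the check are synthetic.

```python
import math
import time

def scaling_exponent(sizes, durations):
    """Least-squares slope of log(duration) against log(size)."""
    xs = [math.log(n) for n in sizes]
    ys = [math.log(t) for t in durations]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    den = sum((x - x_mean) ** 2 for x in xs)
    return num / den

def time_call(func, *args, repetitions=3):
    """Best-of-N wall-clock time for func(*args), in seconds."""
    best = float("inf")
    for _ in range(repetitions):
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best

# Synthetic check: durations that grow quadratically give a slope of 2.
sizes = [1000, 2000, 5000, 10000, 20000, 50000]
quadratic = [1e-9 * n ** 2 for n in sizes]
print(round(scaling_exponent(sizes, quadratic), 3))  # 2.0
```

With real measurements, `time_call` would wrap a single inference call per sample size, and the resulting durations would replace the synthetic list.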

Results¶

Latest Results (January 2026)¶

Algorithm Accuracy (State Count)¶

Algorithm	Even Process	Golden Mean	Biased Coin	Periodic	Overall
Spectral	100%	100%	80%	100%	85%
CSSR	20%	100%	100%	100%	82%
NSD	100%	100%	100%	0%	73%
CSM	0%	80%	60%	0%	39%
BSI	0%	20%	80%	20%	32%

Key Observations

Spectral achieves 100% accuracy at N ≥ 10,000 on all processes
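CSSR reaches 100% on Golden Mean, Biased Coin, and Periodic but only 20% on the Even Process, where it over-splits to 4 states at large N
NSD reaches 100% on Even Process, Golden Mean, and Biased Coin but 0% on the deterministic Periodic process
CSM (39% overall) and BSI (32% overall) have the lowest accuracy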

Statistical Complexity Error (Mean |Cμ - true|)¶

Algorithm	Mean Error
CSSR	0.05
CSM	0.10
Spectral	0.15
BSI	0.53

Convergence by Sample Size¶

N	Correct Rate
100	45%
1,000	70%
10,000	75%
100,000	85%
1,000,000	85%

Interpretation

The correct rate rises from 45% at N = 100 to 85% at N = 100,000, then plateaus through N = 1,000,000
Spectral achieves 100% at N ≥ 10,000 on all processes
CSSR may over-split on Even Process as N grows

Output Format¶

Results are stored in experiments/runs/<timestamp>/:

experiments/runs/2026-01-27T06-35-40/
├── metadata.yaml    # Git commit, config, timing
└── results.json     # All experiment records

Record Schema¶

Field	Type	Description
`experiment`	str	"accuracy", "convergence", "scalability"
`algorithm`	str	"cssr", "spectral", "csm", "bsi"
`process`	str	"even_process", "golden_mean", "biased_coin"
`n_samples`	int	Data length used
`metric`	str	"state_count", "cmu", "hmu", "duration_s"
`value`	float	Measured value
`ground_truth`	float	Expected value (if known)
`error`	str	Exception message (if failed)
`timestamp`	datetime	When recorded
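
Because every record has the same flat shape, summary figures such as a per-algorithm state-count accuracy can be derived with a short aggregation. The sketch below works on a hand-written list of records that follow the schema above. The values are invented for illustration, the top-level layout of results.json is not assumed, and representing a successful run with `error` set to `None` is an assumption. Failed runs are skipped here, which may differ from how the published tables were computed.

```python
from collections import defaultdict

# Hypothetical records following the schema above; values are made up.
records = [
    {"experiment": "accuracy", "algorithm": "cssr", "process": "golden_mean",
     "n_samples": 1000, "metric": "state_count", "value": 2.0,
     "ground_truth": 2.0, "error": None},
    {"experiment": "accuracy", "algorithm": "cssr", "process": "even_process",
     "n_samples": 10000, "metric": "state_count", "value": 4.0,
     "ground_truth": 2.0, "error": None},
    {"experiment": "accuracy", "algorithm": "spectral", "process": "even_process",
     "n_samples": 10000, "metric": "state_count", "value": 2.0,
     "ground_truth": 2.0, "error": None},
    {"experiment": "accuracy", "algorithm": "spectral", "process": "golden_mean",
     "n_samples": 1000, "metric": "state_count", "value": None,
     "ground_truth": 2.0, "error": "TimeoutError"},
]

def state_count_accuracy(records):
    """Fraction of successful state_count records matching ground truth, per algorithm."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for rec in records:
        if rec["metric"] != "state_count":
            continue
        if rec.get("error") or rec.get("ground_truth") is None:
            continue  # skip failed runs and runs without a known answer
        total[rec["algorithm"]] += 1
        correct[rec["algorithm"]] += int(rec["value"] == rec["ground_truth"])
    return {algo: correct[algo] / total[algo] for algo in total}

print(state_count_accuracy(records))  # {'cssr': 0.5, 'spectral': 1.0}
```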

Parallel Execution¶

For large experiments, use parallel execution:

# 4 parallel workers
emic-experiment --all --parallel 4

# Sharded execution (for distributed systems)
emic-experiment --all --shard 1/4  # Run on machine 1
emic-experiment --all --shard 2/4  # Run on machine 2
# ... etc

# Combine sharded results
emic-experiment --combine experiments/runs/<timestamp>/
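
The i/k notation suggests that each machine runs a disjoint slice of the full task list. How emic assigns tasks to shards is not documented here, so the following is only a sketch of one plausible scheme (round-robin assignment), with a hypothetical `shard_tasks` helper. It shows the two properties any such scheme needs: shards do not overlap, and together they cover every task.

```python
def shard_tasks(tasks, shard_spec):
    """Return the tasks for a shard given as 'i/k' (1-based index i of k shards)."""
    index_text, count_text = shard_spec.split("/")
    index, count = int(index_text), int(count_text)
    if not 1 <= index <= count:
        raise ValueError(f"invalid shard spec: {shard_spec!r}")
    # Round-robin assignment: task j (0-based) goes to shard (j mod k) + 1.
    return tasks[index - 1::count]

# Illustrative task list: every algorithm at three sample sizes.
tasks = [(algo, n)
         for algo in ("cssr", "spectral", "csm", "bsi")
         for n in (1000, 5000, 10000)]

shards = [shard_tasks(tasks, f"{i}/4") for i in range(1, 5)]
print([len(s) for s in shards])  # [3, 3, 3, 3]
```

Round-robin keeps shard sizes within one task of each other, although it does not balance runtime when some tasks are much slower than others.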

Custom Configuration¶

Create a YAML config file:

experiments:
  - name: my_experiment
    algorithms: [cssr, spectral]
    processes: [even_process, golden_mean]
    sample_sizes: [1000, 5000, 10000]
    repetitions: 3
    timeout_seconds: 60

output_dir: experiments/runs
quick_sample_sizes: [1000]

Run with:

emic-experiment --config my_config.yaml --all
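
Before launching a custom configuration, it can help to estimate how many individual runs it implies. The sketch below assumes the runner executes the full Cartesian product of algorithms, processes, sample sizes, and repetitions, and that timeout_seconds applies to each run. Both are assumptions about emic's behavior, and `expand_runs` is a hypothetical helper. The config is written as a dict with the same values as the YAML example to avoid a YAML dependency.

```python
from itertools import product

# Same values as the YAML example above.
experiment = {
    "name": "my_experiment",
    "algorithms": ["cssr", "spectral"],
    "processes": ["even_process", "golden_mean"],
    "sample_sizes": [1000, 5000, 10000],
    "repetitions": 3,
    "timeout_seconds": 60,
}

def expand_runs(experiment):
    """List every (algorithm, process, n_samples, repetition) combination."""
    return list(product(
        experiment["algorithms"],
        experiment["processes"],
        experiment["sample_sizes"],
        range(experiment["repetitions"]),
    ))

runs = expand_runs(experiment)
print(len(runs))  # 36

# Worst case in seconds if every run hits the timeout and runs are sequential.
print(len(runs) * experiment["timeout_seconds"])  # 2160
```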

Adding Custom Experiments¶

Register a Process¶

from emic.experiments import get_process_registry

registry = get_process_registry()
registry.register(
    name="my_process",
    display_name="My Process",
    factory=MyProcessSource,
    parameters={"param": 0.5},
    ground_truth={"state_count": 3, "cmu": 1.5},
)

Register an Algorithm¶

from emic.experiments import get_algorithm_registry

registry = get_algorithm_registry()
registry.register(
    name="my_algo",
    display_name="My Algorithm",
    factory=MyAlgorithm,
    config_class=MyConfig,
    default_config={"max_history": 5},
    slow=False,
)

Best Practices¶

Use --quick for development: Skip slow algorithms and use small sample sizes
Run in parallel for production: Use --parallel N for faster execution
Check for errors: Results include an error field for failed runs
Version control results: Commit important experiment results
Use metadata: Each run's metadata.yaml records the Git commit hash for reproducibility