emic.types¶
Core types for epsilon-machines.
types
¶
Core types for epsilon-machine representation.
Public API
- Symbol (TypeVar)
- Alphabet (Protocol)
- ConcreteAlphabet
- Distribution
- Transition
- CausalState
- EpsilonMachine
- EpsilonMachineBuilder
Alphabet
¶
ConcreteAlphabet
dataclass
¶
Bases: Generic[A]
Immutable alphabet implementation.
A concrete implementation of the Alphabet protocol using a frozenset to store symbols. This is the standard alphabet type for most use cases.
Examples:
__contains__
¶
__iter__
¶
__len__
¶
binary
staticmethod
¶
binary() -> ConcreteAlphabet[int]
Create binary alphabet {0, 1}.
Returns:
| Type | Description |
|---|---|
ConcreteAlphabet[int]
|
An alphabet containing exactly the integers 0 and 1. |
Examples:
Source code in src/emic/types/alphabet.py
from_symbols
classmethod
¶
from_symbols(*symbols: A) -> ConcreteAlphabet[A]
Create alphabet from symbols.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
*symbols
|
A
|
Variable number of hashable symbols. |
()
|
Returns:
| Type | Description |
|---|---|
ConcreteAlphabet[A]
|
An alphabet containing the given symbols. |
Examples:
Source code in src/emic/types/alphabet.py
Distribution
dataclass
¶
Distribution(_probs: Mapping[A, ProbabilityValue])
Bases: Generic[A]
An immutable probability distribution over symbols.
Invariants
- All probabilities are in [0, 1]
- Probabilities sum to 1 (within tolerance)
- Only non-zero probabilities are stored
Examples:
>>> dist = Distribution({'a': 0.7, 'b': 0.3})
>>> dist['a']
0.7
>>> dist['c'] # Not in distribution
0.0
>>> dist.entropy()
0.881...
probs
property
¶
probs: Mapping[A, ProbabilityValue]
The underlying probability mapping (read-only).
__post_init__
¶
Validate probabilities sum to 1 and are in valid range.
Source code in src/emic/types/probability.py
__getitem__
¶
__getitem__(symbol: A) -> ProbabilityValue
__iter__
¶
__len__
¶
entropy
¶
Shannon entropy of the distribution in bits.
Returns:
| Type | Description |
|---|---|
float
|
The entropy H = -Σ p(x) log₂ p(x) |
Examples:
>>> Distribution.uniform(frozenset({0, 1})).entropy()
1.0
>>> abs(Distribution.deterministic('a').entropy()) < 1e-10
True
Source code in src/emic/types/probability.py
uniform
classmethod
¶
Create uniform distribution over symbols.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbols
|
frozenset[A]
|
Set of symbols to distribute probability over. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
A distribution with equal probability for each symbol. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If symbols is empty. |
Examples:
Source code in src/emic/types/probability.py
deterministic
classmethod
¶
Create distribution with all mass on one symbol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbol
|
A
|
The symbol to assign probability 1.0. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
A distribution where only the given symbol has non-zero probability. |
Examples:
Source code in src/emic/types/probability.py
Transition
dataclass
¶
Transition(
symbol: A,
probability: ProbabilityValue,
target: StateId,
)
Bases: Generic[A]
A labeled transition from one state to another.
Represents: "On symbol symbol, go to target with probability probability"
Attributes:
| Name | Type | Description |
|---|---|---|
symbol |
A
|
The emitted symbol for this transition. |
probability |
ProbabilityValue
|
The probability of taking this transition (must be in (0, 1]). |
target |
StateId
|
The ID of the target state. |
Examples:
__post_init__
¶
Validate that probability is in valid range.
CausalState
dataclass
¶
CausalState(
id: StateId, transitions: frozenset[Transition[A]]
)
Bases: Generic[A]
A causal state in an epsilon-machine.
A causal state is an equivalence class of histories that induce the same conditional distribution over futures.
Attributes:
| Name | Type | Description |
|---|---|---|
id |
StateId
|
Unique identifier for this state. |
transitions |
frozenset[Transition[A]]
|
Set of outgoing transitions from this state. |
Examples:
>>> t1 = Transition(symbol=0, probability=0.5, target='A')
>>> t2 = Transition(symbol=1, probability=0.5, target='B')
>>> state = CausalState(id='A', transitions=frozenset({t1, t2}))
>>> state.alphabet
frozenset({0, 1})
__post_init__
¶
Validate transition probabilities for each symbol sum to <= 1.
Source code in src/emic/types/states.py
transition_distribution
¶
transition_distribution(
symbol: A,
) -> Distribution[StateId]
Get the distribution over next states given a symbol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbol
|
A
|
The input symbol. |
required |
Returns:
| Type | Description |
|---|---|
Distribution[StateId]
|
A distribution over target state IDs. |
Raises:
| Type | Description |
|---|---|
KeyError
|
If symbol has no transitions from this state. |
Examples:
>>> t = Transition(symbol=0, probability=1.0, target='B')
>>> state = CausalState(id='A', transitions=frozenset({t}))
>>> dist = state.transition_distribution(0)
>>> dist['B']
1.0
Source code in src/emic/types/states.py
emission_distribution
¶
emission_distribution() -> Distribution[A]
Get the distribution over emitted symbols from this state.
Note: This assumes the machine is "edge-emitting" (symbols on transitions).
Returns:
| Type | Description |
|---|---|
Distribution[A]
|
A distribution over symbols that can be emitted from this state. |
Examples:
>>> t1 = Transition(symbol=0, probability=0.5, target='A')
>>> t2 = Transition(symbol=1, probability=0.5, target='B')
>>> state = CausalState(id='A', transitions=frozenset({t1, t2}))
>>> dist = state.emission_distribution()
>>> dist[0]
0.5
Source code in src/emic/types/states.py
next_states
¶
next_states(symbol: A) -> frozenset[StateId]
Get possible next states given a symbol.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbol
|
A
|
The input symbol. |
required |
Returns:
| Type | Description |
|---|---|
frozenset[StateId]
|
Set of state IDs that can be reached with this symbol. |
Examples:
>>> t = Transition(symbol=0, probability=1.0, target='B')
>>> state = CausalState(id='A', transitions=frozenset({t}))
>>> state.next_states(0)
frozenset({'B'})
Source code in src/emic/types/states.py
EpsilonMachine
dataclass
¶
EpsilonMachine(
alphabet: frozenset[A],
states: frozenset[CausalState[A]],
start_state: StateId,
stationary_distribution: Distribution[StateId],
)
Bases: Generic[A]
An epsilon-machine (ε-machine) over alphabet A.
An ε-machine is the minimal, optimal predictor of a stationary stochastic process. It consists of: - A finite set of causal states - Labeled transitions between states - A stationary distribution over states
Properties: - Unifilarity: Each (state, symbol) pair has at most one outgoing transition - Minimality: No two states have the same conditional future distribution
This class represents the result of inference - the discovered structure.
Attributes:
| Name | Type | Description |
|---|---|---|
alphabet |
frozenset[A]
|
The set of symbols used by this machine. |
states |
frozenset[CausalState[A]]
|
The set of causal states. |
start_state |
StateId
|
The ID of the initial state. |
stationary_distribution |
Distribution[StateId]
|
The steady-state distribution over states. |
Examples:
>>> from emic.types import EpsilonMachineBuilder
>>> machine = (
... EpsilonMachineBuilder[int]()
... .add_transition("A", 0, "A", 0.5)
... .add_transition("A", 1, "B", 0.5)
... .add_transition("B", 0, "A", 1.0)
... .with_start_state("A")
... .build()
... )
>>> len(machine)
2
>>> machine.is_unifilar()
True
__post_init__
¶
Validate machine invariants.
Source code in src/emic/types/machine.py
__len__
¶
get_state
¶
get_state(state_id: StateId) -> CausalState[A]
Get a state by ID.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
state_id
|
StateId
|
The ID of the state to retrieve. |
required |
Returns:
| Type | Description |
|---|---|
CausalState[A]
|
The CausalState with the given ID. |
Raises:
| Type | Description |
|---|---|
KeyError
|
If state not found. |
Examples:
>>> from emic.types import EpsilonMachineBuilder
>>> machine = (
... EpsilonMachineBuilder[int]()
... .add_transition("A", 0, "A", 1.0)
... .with_start_state("A")
... .build()
... )
>>> state = machine.get_state("A")
>>> state.id
'A'
Source code in src/emic/types/machine.py
transition_matrix
¶
transition_matrix(
symbol: A,
) -> dict[StateId, Distribution[StateId]]
Get the transition matrix for a given symbol.
Returns a mapping from source state to distribution over target states. Only states that have transitions for the given symbol are included.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbol
|
A
|
The symbol to get transitions for. |
required |
Returns:
| Type | Description |
|---|---|
dict[StateId, Distribution[StateId]]
|
A dict mapping state IDs to distributions over next states. |
Examples:
>>> from emic.types import EpsilonMachineBuilder
>>> machine = (
... EpsilonMachineBuilder[int]()
... .add_transition("A", 0, "B", 1.0)
... .add_transition("B", 0, "A", 1.0)
... .with_start_state("A")
... .build()
... )
>>> matrix = machine.transition_matrix(0)
>>> matrix["A"]["B"]
1.0
Source code in src/emic/types/machine.py
is_unifilar
¶
Check if machine is unifilar (deterministic given state and symbol).
A machine is unifilar if, for each state, each symbol leads to at most one target state. This is a fundamental property of epsilon-machines.
Returns:
| Type | Description |
|---|---|
bool
|
True if the machine is unifilar, False otherwise. |
Examples:
>>> from emic.types import EpsilonMachineBuilder
>>> machine = (
... EpsilonMachineBuilder[int]()
... .add_transition("A", 0, "A", 1.0)
... .with_start_state("A")
... .build()
... )
>>> machine.is_unifilar()
True
Source code in src/emic/types/machine.py
is_ergodic
¶
Check if machine is ergodic (single recurrent class).
Returns:
| Type | Description |
|---|---|
bool
|
True if the machine is ergodic, False otherwise. |
Raises:
| Type | Description |
|---|---|
NotImplementedError
|
This method is not yet implemented. |
Source code in src/emic/types/machine.py
EpsilonMachineBuilder
¶
Bases: Generic[A]
Mutable builder for constructing EpsilonMachine instances.
Since EpsilonMachine is immutable, this builder provides an ergonomic way to construct machines incrementally.
Examples:
>>> machine = (
... EpsilonMachineBuilder[int]()
... .with_alphabet({0, 1})
... .add_state("A")
... .add_state("B")
... .add_transition("A", 0, "A", 0.5)
... .add_transition("A", 1, "B", 0.5)
... .add_transition("B", 0, "A", 1.0)
... .with_start_state("A")
... .build()
... )
>>> len(machine)
2
Initialize an empty builder.
Source code in src/emic/types/machine.py
with_alphabet
¶
Set the alphabet for the machine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
symbols
|
set[A]
|
The set of symbols to use. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
Self for method chaining. |
add_state
¶
add_state(state_id: StateId) -> Self
Add a state to the machine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
state_id
|
StateId
|
The ID of the state to add. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
Self for method chaining. |
Source code in src/emic/types/machine.py
add_transition
¶
Add a transition to the machine.
This will also add the source and target states if they don't exist, and add the symbol to the alphabet.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
source
|
StateId
|
The source state ID. |
required |
symbol
|
A
|
The symbol emitted on this transition. |
required |
target
|
StateId
|
The target state ID. |
required |
probability
|
float
|
The probability of this transition. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
Self for method chaining. |
Source code in src/emic/types/machine.py
with_start_state
¶
with_start_state(state_id: StateId) -> Self
Set the start state for the machine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
state_id
|
StateId
|
The ID of the start state. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
Self for method chaining. |
Source code in src/emic/types/machine.py
with_stationary_distribution
¶
with_stationary_distribution(
dist: dict[StateId, float],
) -> Self
Set the stationary distribution for the machine.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
dist
|
dict[StateId, float]
|
A mapping from state IDs to probabilities. |
required |
Returns:
| Type | Description |
|---|---|
Self
|
Self for method chaining. |
Source code in src/emic/types/machine.py
build
¶
build() -> EpsilonMachine[A]
Build the EpsilonMachine.
Returns:
| Type | Description |
|---|---|
EpsilonMachine[A]
|
The constructed EpsilonMachine. |
Raises:
| Type | Description |
|---|---|
ValueError
|
If start state is not set. |