`numberlink.env`¶

NumberLink environment implementation.

Provide the numberlink.env.NumberLinkRGBEnv environment, a Gymnasium-compatible RGB NumberLink environment. Observations are RGB images with shape (H, W, 3) and dtype==numpy.uint8. The discrete action space encodes a color, one of two heads for that color, and a direction. When the numberlink.config.VariantConfig.cell_switching_mode variant is enabled, actions encode a (row, col, color) assignment.

See numberlink.level_setup for level construction utilities and numberlink.config for configuration structures referenced by this module.

Added in version 1.0.

Note

When Sphinx renders documentation with autodoc_typehints enabled, type information is taken from function signatures and class attributes.

Module Contents¶

Classes¶

NumberLinkRGBEnv

NumberLink environment.

API¶

class numberlink.env.NumberLinkRGBEnv(grid: collections.abc.Sequence[str] | None = None, *, render_mode: RenderMode | None = None, level_id: str | None = None, variant: numberlink.config.VariantConfig | None = None, bridges: collections.abc.Iterable[Coord] | None = None, generator: numberlink.config.GeneratorConfig | None = None, reward_config: numberlink.config.RewardConfig | None = None, render_config: numberlink.config.RenderConfig | None = None, step_limit: int | None = None, palette: dict[str, RGBInt] | None = None, solution: list[list[Coord]] | None = None)[source]¶

Bases: gymnasium.Env[numberlink.types.ObsType, numberlink.types.ActType]

NumberLink environment.

Expose observations as RGB arrays with shape (H, W, 3) and dtype==numpy.uint8. Action encoding depends on the active variant. In path-building mode actions encode (color, head, direction). In cell-switching mode actions encode a (row, col, color) assignment.

The environment implements bridge cells with independent lanes, optional diagonal movement, and an optional requirement that all cells be filled for a solution to be considered valid.

Interact with the environment via numberlink.env.NumberLinkRGBEnv.reset(), numberlink.env.NumberLinkRGBEnv.step(), and numberlink.env.NumberLinkRGBEnv.render().

Internal NumPy arrays expose state for read-only access. Do not mutate arrays returned by public methods.

Initialization

Initialize the environment.

Load the provided grid or generate a level using the provided generator configuration and apply any configuration overrides passed to this constructor. The environment uses explicit configuration values when corresponding overrides are supplied.

Parameters:

grid (Sequence[str] or None) – Optional sequence of strings, one per grid row of the level.
render_mode (RenderMode or None) – Optional render mode forwarded to numberlink.env.NumberLinkRGBEnv.render().
level_id (str or None) – Optional identifier for a predefined level.
variant (VariantConfig or None) – Optional numberlink.config.VariantConfig to override template variant.
bridges (Iterable[Coord] or None) – Optional iterable of bridge coordinates as (row, col) pairs.
generator (GeneratorConfig or None) – Optional numberlink.config.GeneratorConfig to generate a level.
reward_config (RewardConfig or None) – Optional numberlink.config.RewardConfig override.
render_config (RenderConfig or None) – Optional numberlink.config.RenderConfig override.
step_limit (int or None) – Optional maximum number of steps before truncation. If None, a default is used.
palette (dict[str, RGBInt] or None) – Optional mapping of letter to RGB tuple used for rendering.
solution (list[list[Coord]] or None) – Optional coordinate-based solution for the level, given as lists of (row, col) coordinates. See numberlink.env.NumberLinkRGBEnv.get_solution().

Raises:

ValueError – If configured render dimensions are smaller than the grid dimensions.
RuntimeError – If the generator configuration is invalid for level generation.

See also:

numberlink.level_setup.build_level_template()

metadata: dict[str, list[str] | int] = None[source]¶

_DIRS4: numpy.typing.NDArray[numpy.int8] = 'array(...)'[source]¶

_DIRS8: numpy.typing.NDArray[numpy.int8] = 'array(...)'[source]¶

_apply_template(template: numberlink.level_setup.LevelTemplate) → None[source]¶: Load derived attributes from a precomputed template.

_compute_solution_actions() → list[ActType] | None[source]¶

Convert a coordinate-based solution to a sequence of action indices.

Return None if no coordinate solution is available or when the active variant uses numberlink.config.VariantConfig.cell_switching_mode and conversion is not applicable.

Returns:: List of action indices that solve the puzzle, or None if unavailable.
Return type:: list[ActType] or None

reset(*, seed: int | None = None, options: dict[str, Any] | None = None) → tuple[ObsType, numberlink.env.InfoDict][source]¶

Reset the environment to the initial state.

Parameters:

seed (int or None) – Optional seed used for RNG initialization.
options (dict[str, Any] or None) – Options forwarded to the environment reset. This implementation does not use options.

Returns:

Tuple of the initial observation and an info dictionary.

The info dictionary contains the keys

action_mask (numpy.ndarray): Binary action mask for valid actions

steps (int): Number of steps taken, 0 at reset

connected (numpy.ndarray): Boolean array indicating which colors are connected

level_id (str or None): Identifier of the loaded level when available

solved (bool): False at reset unless the level is trivially solved

deadlocked (bool): Whether the state is deadlocked

Return type:: tuple[numpy.ndarray, dict]

Note

This method calls numberlink.env.NumberLinkRGBEnv.render() when render_mode is set to 'ansi' or 'human'.

step(action: ActType) → tuple[ObsType, float, bool, bool, numberlink.env.InfoDict][source]¶

Apply an action and advance the environment state by one step.

Decode and apply action, update internal state, compute the reward for the step, and return the next observation and step information.

Parameters:

action (ActType) – Encoded action in the discrete action space

Returns:

Tuple (observation, reward, terminated, truncated, info) where the components are:

observation: the new observation frame
reward: the scalar reward for the step
terminated: whether the episode ended due to success or deadlock
truncated: whether the episode ended because the step limit (see the step_limit constructor parameter) was reached
info: a dictionary with keys described in numberlink.env.NumberLinkRGBEnv.reset()

Return type:

tuple[ObsType, float, bool, bool, InfoDict]

Note

terminated is True when the puzzle is solved or when the environment is deadlocked and cannot continue.

Note

truncated is True when the environment has reached its step limit (see the step_limit constructor parameter).
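
Example (illustrative sketch): the loop below shows one way to combine reset(), step(), and the action_mask entry of the info dictionary so that only valid actions are sampled. The names run_masked_episode and _StubEnv are hypothetical and not part of this module. The stub only mimics the documented return shapes so that the sketch runs without the package installed; a real NumberLinkRGBEnv instance could presumably be passed in its place.

```python
import random
from typing import Any


def run_masked_episode(env: Any, rng: random.Random, max_iters: int = 1000) -> tuple[float, dict]:
    """Roll out one episode, sampling only actions whose mask entry is nonzero."""
    _obs, info = env.reset()
    total = 0.0
    for _ in range(max_iters):
        # Works for lists and one-dimensional NumPy arrays alike.
        valid = [i for i, ok in enumerate(info["action_mask"]) if ok]
        if not valid:  # no legal action left
            break
        _obs, reward, terminated, truncated, info = env.step(rng.choice(valid))
        total += reward
        if terminated or truncated:
            break
    return total, info


class _StubEnv:
    """Hypothetical stand-in that mimics the documented return shapes only."""

    def __init__(self, step_limit: int = 5) -> None:
        self.step_limit = step_limit
        self.steps = 0

    def _info(self) -> dict:
        return {"action_mask": [1, 0, 1], "steps": self.steps, "solved": False, "deadlocked": False}

    def reset(self, *, seed=None, options=None):
        self.steps = 0
        return None, self._info()

    def step(self, action):
        assert self._info()["action_mask"][action], "masked action chosen"
        self.steps += 1
        truncated = self.steps >= self.step_limit  # stub never solves the puzzle
        return None, -1.0, False, truncated, self._info()


total_reward, final_info = run_masked_episode(_StubEnv(), random.Random(0))
```

The stub returns a constant reward of -1.0 per step and truncates after five steps, so the rollout ends through the truncated flag rather than through terminated.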

render() → gymnasium.core.RenderFrame | list[gymnasium.core.RenderFrame] | None[source]¶

Render the current environment state according to render_mode.

When render_mode is 'rgb_array' return an image frame of type numpy.ndarray. When render_mode is 'ansi' or 'human' return a textual representation. When render_mode is 'human' and numberlink.config.RenderConfig.print_text_in_human_mode is True, also print the textual representation to standard output.

Returns:: A frame or textual rendering depending on the configured mode, or None when rendering is disabled
Return type:: RenderFrame or list[RenderFrame] or None

close() → None[source]¶

Close the environment and release any acquired resources.

This method is present for API compatibility with Gymnasium. For an environment that keeps all state in memory it may be a no-op.

get_solution() → list[ActType] | None[source]¶

Return the solution action sequence for the current level when available.

The returned list contains action indices that, when applied in sequence via numberlink.env.NumberLinkRGBEnv.step(), will produce a solved state. Return None if no solution is available.

For path-building variants actions encode color, head, and direction. For cell-switching variants actions encode cell assignments.

Returns:: List of action indices that solve the puzzle, or None if unavailable
Return type:: list[ActType] or None
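
Example (illustrative sketch): replaying a stored solution through step(). The helper replay_solution and the class _StubSolvableEnv are hypothetical names used only for this illustration. The stub assumes nothing about the real action encoding; it simply accepts a fixed action list and reports solved once every action has been applied in order.

```python
from typing import Any


def replay_solution(env: Any) -> bool:
    """Apply the environment's own solution and report whether it ends solved."""
    actions = env.get_solution()
    if actions is None:  # no solution available for this level
        return False
    _obs, info = env.reset()
    for action in actions:
        _obs, _reward, terminated, truncated, info = env.step(action)
        if terminated or truncated:
            break
    return bool(info["solved"])


class _StubSolvableEnv:
    """Hypothetical stand-in: solved once the stored actions arrive in order."""

    def __init__(self, solution: list[int]) -> None:
        self._solution = list(solution)
        self._pos = 0

    def get_solution(self):
        return list(self._solution) or None

    def reset(self, *, seed=None, options=None):
        self._pos = 0
        return None, {"solved": False}

    def step(self, action):
        if action == self._solution[self._pos]:
            self._pos += 1
        solved = self._pos == len(self._solution)
        return None, 0.0, solved, False, {"solved": solved}


solved_ok = replay_solution(_StubSolvableEnv([2, 0, 1]))
no_solution = replay_solution(_StubSolvableEnv([]))
```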

regenerate_level(seed: int | None = None) → tuple[ObsType, numberlink.env.InfoDict][source]¶

Generate a new level using the stored generator configuration and reset the environment.

Parameters:: seed (int or None) – Optional seed forwarded to the generator and used by numberlink.env.NumberLinkRGBEnv.reset().
Returns:: Observation and info dictionary produced by numberlink.env.NumberLinkRGBEnv.reset()
Return type:: tuple[ObsType, InfoDict]
Raises:: RuntimeError – If the environment was not created with a generator configuration

_metric(a: Coord, b: Coord) → int[source]¶

Compute the distance between two grid coordinates using the active metric.

Use Chebyshev distance when diagonal moves are allowed, otherwise use Manhattan distance.

Parameters:

a (Coord) – First coordinate as (row, col)
b (Coord) – Second coordinate as (row, col)

Returns:

Distance between the two coordinates

Return type:

int
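
Example (illustrative sketch): the two metrics named above, written as a standalone function. The name grid_distance and the allow_diagonal flag are assumptions for this illustration; the method itself reads the setting from the active variant.

```python
Coord = tuple[int, int]


def grid_distance(a: Coord, b: Coord, *, allow_diagonal: bool) -> int:
    """Chebyshev distance when diagonal moves are allowed, else Manhattan."""
    dr = abs(a[0] - b[0])
    dc = abs(a[1] - b[1])
    # A diagonal step covers one row and one column at once, so the longer
    # axis alone determines the number of moves.
    return max(dr, dc) if allow_diagonal else dr + dc


manhattan = grid_distance((0, 0), (2, 3), allow_diagonal=False)
chebyshev = grid_distance((0, 0), (2, 3), allow_diagonal=True)
```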

_decode_action(a: int) → RGBInt[source]¶

Decode a packed action index into (color_index, head_index, direction).

Return (-1, -1, -1) when the supplied index is out of range.

Parameters:: a (int) – Encoded action index
Returns:: Decoded components as integers where each value is non-negative, or -1 for invalid inputs
Return type:: RGBInt
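
Example (illustrative sketch): one possible packing of (color_index, head_index, direction) into a single index. The layout (color * 2 + head) * num_dirs + direction is an assumption made for this illustration and is not confirmed to match the environment's internal encoding. The function names are hypothetical.

```python
def encode_path_action(color: int, head: int, direction: int, num_dirs: int) -> int:
    """Pack a triple into one index (assumed layout, two heads per color)."""
    return (color * 2 + head) * num_dirs + direction


def decode_path_action(a: int, num_colors: int, num_dirs: int) -> tuple[int, int, int]:
    """Invert encode_path_action; return (-1, -1, -1) for out-of-range input."""
    if not 0 <= a < num_colors * 2 * num_dirs:
        return (-1, -1, -1)
    arm, direction = divmod(a, num_dirs)
    color, head = divmod(arm, 2)
    return (color, head, direction)


packed = encode_path_action(2, 1, 3, num_dirs=4)
unpacked = decode_path_action(packed, num_colors=3, num_dirs=4)
```

With three colors and four directions this assumed layout yields 3 * 2 * 4 = 24 actions, so index 24 and any negative index decode to the sentinel triple.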

_decode_cell_switching_action(a: int) → RGBInt[source]¶

Decode a cell-switching action into (row, col, color_value).

Return (-1, -1, -1) when the action index is invalid. Color value 0 means clear the cell. Values 1..N map to color indices 0..N-1.

Parameters:: a (int) – Encoded cell-switching action index
Returns:: Decoded (row, col, color_value), or (-1, -1, -1) if invalid
Return type:: RGBInt

encode_cell_switching_action(row: int, col: int, color_value: int) → int[source]¶

Encode a (row, col, color_value) triple into a discrete action index.

Color value 0 clears the cell. Color values 1..N map to color indices 0..N-1.

Parameters:

row (int) – Row index of the target cell
col (int) – Column index of the target cell
color_value (int) – Color value to assign (0 means clear, 1..N are colors)

Returns:

Encoded action index suitable for numberlink.env.NumberLinkRGBEnv.step()

Return type:

int

Raises:

ValueError – If any component is out of the valid ranges
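
Example (illustrative sketch): a round trip between a (row, col, color_value) triple and a flat index. The layout (row * width + col) * (num_colors + 1) + color_value is an assumption for this illustration, not necessarily the one used by the environment, and the standalone function names and keyword parameters are hypothetical.

```python
def encode_cell_action(row: int, col: int, color_value: int, *,
                       height: int, width: int, num_colors: int) -> int:
    """Pack a cell assignment into one index; 0 clears, 1..N are colors."""
    if not (0 <= row < height and 0 <= col < width and 0 <= color_value <= num_colors):
        raise ValueError("row, col, or color_value out of range")
    return (row * width + col) * (num_colors + 1) + color_value


def decode_cell_action(a: int, *, height: int, width: int,
                       num_colors: int) -> tuple[int, int, int]:
    """Invert encode_cell_action; return (-1, -1, -1) for invalid indices."""
    if not 0 <= a < height * width * (num_colors + 1):
        return (-1, -1, -1)
    cell, color_value = divmod(a, num_colors + 1)
    row, col = divmod(cell, width)
    return (row, col, color_value)


dims = {"height": 3, "width": 4, "num_colors": 3}
index = encode_cell_action(1, 2, 3, **dims)
triple = decode_cell_action(index, **dims)
```

Each cell contributes num_colors + 1 actions because the clear action (color value 0) sits alongside the N color assignments.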

_apply_action(a: int) → bool[source]¶

Apply an encoded action according to the active environment mode.

Dispatch to numberlink.env.NumberLinkRGBEnv._apply_cell_switching_action() when the environment is in cell-switching mode. Otherwise dispatch to numberlink.env.NumberLinkRGBEnv._apply_path_action().

Parameters:: a (int) – Encoded action index
Returns:: True if the action modified the grid, False for invalid or no-op actions
Return type:: bool

_apply_cell_switching_action(a: int) → bool[source]¶

Apply a cell-switching action that assigns or clears a non-endpoint cell.

Decode a into a target cell and a color value where 0 means clear. Prevent changes to endpoint cells and update per-color connectivity flags for any affected colors.

Parameters:: a (int) – Encoded cell switching action index
Returns:: True when the action modified the grid, otherwise False for invalid actions
Return type:: bool

_refresh_cell_switch_connections(colors: collections.abc.Iterable[int]) → None[source]¶

Recompute the _closed connection flags for a subset of colors.

Only valid color indices within the range 0..num_colors-1 are considered. Use this method to avoid recomputing global connectivity after a local cell change.

Parameters:: colors (Iterable[int]) – Iterable of zero-based color indices to refresh

_is_color_connected(ci: int) → bool[source]¶: Return whether both endpoints of color ci are connected by its current path.

_cell_switch_connected(ci: int) → bool[source]¶: Return whether a color’s endpoints are connected in cell-switching mode.

_apply_path_action(a: int) → bool[source]¶

Apply a path-building action for a color arm head.

Decode the action to (color_index, head_index, direction). Valid moves include stepping into an empty cell, backtracking onto the previous cell of the same arm, joining the other head, or occupying an untouched endpoint. Bridge lane semantics are enforced when moving onto bridge cells.

Parameters:: a (int) – Encoded action index
Returns:: True if the action changed the board, otherwise False for invalid or blocked moves
Return type:: bool

_is_solved() → bool[source]¶

Return whether the current environment state is a solved puzzle.

In cell-switching mode this method delegates to numberlink.env.NumberLinkRGBEnv._is_valid_solution(). In standard path-building mode all colors must be connected. When the active variant requires full coverage all cells must also be filled.

Returns:: True if the puzzle is solved, otherwise False
Return type:: bool

static _is_deadlocked(action_mask: numpy.typing.NDArray[numpy.uint8], solved: bool) → bool[source]¶

Determine whether the environment is deadlocked.

The environment is deadlocked when there are no available actions and the puzzle is not already solved.

Parameters:

action_mask (NDArray[np.uint8]) – Binary action mask where nonzero entries are valid
solved (bool) – Whether the puzzle is already solved

Returns:

True when deadlocked, otherwise False

Return type:

bool
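
Example (illustrative sketch): the deadlock rule stated above as a standalone function. The name is_deadlocked is used here without the leading underscore to mark it as an illustration rather than the method itself.

```python
from collections.abc import Iterable


def is_deadlocked(action_mask: Iterable[int], solved: bool) -> bool:
    """Deadlocked means: not solved, and no mask entry is nonzero."""
    return (not solved) and not any(action_mask)


stuck = is_deadlocked([0, 0, 0], solved=False)
still_playable = is_deadlocked([0, 1, 0], solved=False)
finished = is_deadlocked([0, 0, 0], solved=True)
```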

_all_filled() → bool[source]¶

Return whether all cells are occupied according to environment rules.

For non-bridge cells the grid code must be nonzero. For bridge cells at least one lane must be occupied.

Returns:: True when every cell is occupied, otherwise False
Return type:: bool

_is_valid_solution() → bool[source]¶

Return whether the current state is a valid NumberLink solution.

Validation rules

Endpoints of each color must be connected by a continuous path of that color
Paths must not branch or contain loops
If the active variant requires full coverage, every cell must be filled

Returns:: True when the current state meets all validity rules, otherwise False
Return type:: bool
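
Example (illustrative sketch): the three validation rules applied to a simplified text grid. This is not the environment's implementation. It assumes no bridge cells, orthogonal moves only, uppercase letters for endpoints, lowercase letters for path segments, and . for empty cells. It also treats any two orthogonally adjacent cells of the same color as connected, so a path that touches itself is rejected as branching.

```python
_STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def is_valid_solution(grid: list[str], *, must_fill: bool = False) -> bool:
    """Check connectivity, no branches or loops, and optional full coverage."""
    h, w = len(grid), len(grid[0])
    cells: dict[str, list[tuple[int, int]]] = {}
    for r in range(h):
        for c in range(w):
            ch = grid[r][c]
            if ch == ".":
                if must_fill:
                    return False  # full coverage required but cell is empty
                continue
            cells.setdefault(ch.lower(), []).append((r, c))

    def same_color_neighbors(r: int, c: int) -> list[tuple[int, int]]:
        color = grid[r][c].lower()
        return [
            (r + dr, c + dc)
            for dr, dc in _STEPS
            if 0 <= r + dr < h and 0 <= c + dc < w
            and grid[r + dr][c + dc].lower() == color
        ]

    for coords in cells.values():
        endpoints = [(r, c) for r, c in coords if grid[r][c].isupper()]
        if len(endpoints) != 2:
            return False
        # Endpoints need exactly one neighbor, interior cells exactly two;
        # anything else is a gap, a branch, or a self-touching path.
        for r, c in coords:
            expected = 1 if grid[r][c].isupper() else 2
            if len(same_color_neighbors(r, c)) != expected:
                return False
        # Flood fill from one endpoint; detached loops stay unreached.
        seen = {endpoints[0]}
        stack = [endpoints[0]]
        while stack:
            for nxt in same_color_neighbors(*stack.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        if len(seen) != len(coords):
            return False
    return True


solved_grid = ["Aa", "Ba", "BA"]
unsolved_grid = ["Aa.", "..A"]
```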

_cell_has_color(cell: Coord, color_code: int) → bool[source]¶

Return whether the given grid cell currently contains color_code.

For bridge cells check both vertical and horizontal lanes for the color.

Parameters:

cell (Coord) – Coordinate as (row, col)
color_code (int) – One-based color code stored in the grid arrays

Returns:

True when the cell contains the color, otherwise False

Return type:

bool

_can_occupy(cell: Coord, color_code: int, *, lane: Lane) → bool[source]¶

Test whether a color can occupy a target cell for a specific lane.

For non-bridge cells the cell must be empty. For bridge cells the requested lane must be empty or already occupied by the same color.

Parameters:

cell (Coord) – Target coordinate as (row, col)
color_code (int) – One-based color code
lane (Lane) – Lane specifier, one of 'v', 'h', or 'n' for non-bridge

Returns:

True if the occupancy is allowed, otherwise False

Return type:

bool
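
Example (illustrative sketch): the occupancy rule for regular and bridge cells. The data layout is an assumption for this illustration: a regular cell is described by a single grid code, a bridge cell by a hypothetical mapping from lane name ('v' or 'h') to the code stored in that lane, and code 0 means empty. The environment's own storage is not shown in this reference and may differ.

```python
from typing import Optional


def can_occupy(grid_code: int, lane_codes: Optional[dict[str, int]],
               color_code: int, lane: str) -> bool:
    """Return whether color_code may enter the cell through the given lane."""
    if lane_codes is None:
        # Regular (non-bridge) cell: it must be empty.
        return grid_code == 0
    # Bridge cell: the requested lane must be empty or already hold this color.
    return lane_codes[lane] in (0, color_code)


empty_regular = can_occupy(0, None, color_code=2, lane="n")
taken_regular = can_occupy(1, None, color_code=2, lane="n")
bridge = {"v": 1, "h": 0}
free_lane = can_occupy(0, bridge, color_code=2, lane="h")
blocked_lane = can_occupy(0, bridge, color_code=2, lane="v")
same_color_lane = can_occupy(0, bridge, color_code=1, lane="v")
```

Because each lane is checked on its own, two different colors can cross on one bridge cell as long as they use different lanes.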

_occupy_cell(cell: Coord, color_code: int, *, lane: Lane) → None[source]¶

Occupy a cell with the provided color code for the specified lane.

For endpoints on bridge cells the lane argument may be 'n' to indicate both lanes should be marked.

Parameters:

cell (Coord) – Target coordinate as (row, col)
color_code (int) – One-based color code to write into the grid
lane (Lane) – Lane specifier indicating which lane to set

_mark_presence(ci: int, hi: int, cell: Coord) → None[source]¶: Mark a coordinate as occupied in the per-arm presence grid.

_clear_presence(ci: int, hi: int, cell: Coord) → None[source]¶: Clear a coordinate from the per-arm presence grid.

_erase_last(ci: int, hi: int) → None[source]¶

Erase the last placed segment for a color arm.

Pop the last entry from the specified arm stack and clear the corresponding grid or lane value. Protect the starting endpoint for the specified head so it is not removed.

Parameters:

ci (int) – Zero-based color index
hi (int) – Head index for the arm, typically 0 or 1

clear_color_path(ci: int) → None[source]¶

Clear all non-endpoint segments for color ci and reset its connection flag.

This utility is used by the interactive viewer when switching the active drawing endpoint. Remove every non-endpoint segment for both arms while leaving the original endpoints intact.

Parameters:

ci (int) – Zero-based color index

Raises:

ValueError – If ci is outside the valid range
RuntimeError – If invoked while the environment uses cell-switching mode

_update_connection_status(ci: int) → None[source]¶

Update the per-color connection flag by inspecting both arm heads.

A color is connected when both heads share the same position or when one head has reached the opposite endpoint and that endpoint’s stack length is greater than one.

Parameters:: ci (int) – Zero-based color index

_compute_action_mask() → numpy.typing.NDArray[numpy.uint8][source]¶

Compute a binary action mask of valid actions for the current state.

Delegate to numberlink.env.NumberLinkRGBEnv._compute_cell_switching_mask() or numberlink.env.NumberLinkRGBEnv._compute_path_mask() depending on the active variant.

Returns:: One-dimensional array where nonzero entries mark valid actions
Return type:: NDArray[np.uint8]

_compute_cell_switching_mask() → numpy.typing.NDArray[numpy.uint8][source]¶

Compute the action mask for cell-switching mode.

If _cell_switch_mask is None return a zero mask covering the full action space. Otherwise return a copy of the configured mask.

Returns:: Action mask for the cell-switching action space
Return type:: NDArray[np.uint8]

_compute_path_mask() → numpy.typing.NDArray[numpy.uint8][source]¶

Compute the action mask for standard path-building mode.

The mask marks legal head moves for every color and arm head. Legal moves include backtracking, meeting the other head, stepping onto an untouched endpoint, and moving into an empty lane or cell while respecting bridge semantics.

Returns:: One-dimensional binary mask for the discrete action space
Return type:: NDArray[np.uint8]

_render_rgb() → ObsType[source]¶

Produce an RGB image representing the current grid state.

Normal cells are filled from the palette based on _grid_codes. Bridge cells are composed by mixing lane colors when both lanes are present. Apply endpoint styling according to the active render configuration.

Returns:: RGB image with shape (H, W, 3) and dtype uint8
Return type:: ObsType

_render_text() → str[source]¶

Produce a compact textual representation of the grid for terminal output.

The header includes the step count and the number of connected colors. The grid uses these symbols:

. for empty regular cells
Uppercase letter for endpoints
Lowercase letter for path segments
For bridge cells, the first character represents the vertical lane and the second the horizontal lane; each uses the same casing rules, or * when the lane is empty

Returns:: Multi-line string representing the grid and status line
Return type:: str

render_mode: str | None = None[source]¶

spec: gymnasium.envs.registration.EnvSpec | None = None[source]¶

action_space: gymnasium.spaces.Space[gymnasium.core.ActType] = None[source]¶

observation_space: gymnasium.spaces.Space[gymnasium.core.ObsType] = None[source]¶

_np_random: numpy.random.Generator | None = None[source]¶

_np_random_seed: int | None = None[source]¶

property unwrapped: gymnasium.core.Env[gymnasium.core.ObsType, gymnasium.core.ActType][source]¶

property np_random_seed: int[source]¶

property np_random: numpy.random.Generator[source]¶

__str__()[source]¶

__enter__()[source]¶

__exit__(*args: Any)[source]¶

has_wrapper_attr(name: str) → bool[source]¶

get_wrapper_attr(name: str) → Any[source]¶

set_wrapper_attr(name: str, value: Any, *, force: bool = True) → bool[source]¶

classmethod __class_getitem__(item)[source]¶
