numberlink.env¶
NumberLink environment implementation.
Provide the numberlink.env.NumberLinkRGBEnv environment, a Gymnasium-compatible RGB NumberLink
environment. Observations are RGB images with shape (H, W, 3) and dtype==numpy.uint8. The discrete action
space encodes a color, one of two heads for that color, and a direction. When the
numberlink.config.VariantConfig.cell_switching_mode variant is enabled, actions encode a (row, col, color)
assignment.
See numberlink.level_setup for level construction utilities and numberlink.config for configuration
structures referenced by this module.
Added in version 1.0.
Note
When Sphinx renders documentation with autodoc_typehints enabled, type information is taken from function
signatures and class attributes.
Module Contents¶
Classes¶
NumberLinkRGBEnv | NumberLink environment.
API¶
- class numberlink.env.NumberLinkRGBEnv(grid: collections.abc.Sequence[str] | None = None, *, render_mode: RenderMode | None = None, level_id: str | None = None, variant: numberlink.config.VariantConfig | None = None, bridges: collections.abc.Iterable[Coord] | None = None, generator: numberlink.config.GeneratorConfig | None = None, reward_config: numberlink.config.RewardConfig | None = None, render_config: numberlink.config.RenderConfig | None = None, step_limit: int | None = None, palette: dict[str, RGBInt] | None = None, solution: list[list[Coord]] | None = None)[source]¶
Bases: gymnasium.Env[numberlink.types.ObsType, numberlink.types.ActType]
NumberLink environment.
Expose observations as RGB arrays with shape (H, W, 3) and dtype==numpy.uint8. Action encoding depends on the active variant. In path-building mode, actions encode (color, head, direction). In cell-switching mode, actions encode a (row, col, color) assignment.
The environment implements bridge cells with independent lanes, optional diagonal movement, and an optional requirement that all cells be filled for a solution to be considered valid.
Interact with the environment via numberlink.env.NumberLinkRGBEnv.reset(), numberlink.env.NumberLinkRGBEnv.step(), and numberlink.env.NumberLinkRGBEnv.render().
Internal NumPy arrays expose state for read-only access. Do not mutate arrays returned by public methods.
Initialization
Initialize the environment.
Load the provided grid or generate a level using the provided generator configuration, then apply any configuration overrides passed to this constructor. The environment uses explicit configuration values when corresponding overrides are supplied.
- Parameters:
grid (Sequence[str] or None) – Optional iterable of strings representing the level grid rows.
render_mode (RenderMode or None) – Optional render mode forwarded to numberlink.env.NumberLinkRGBEnv.render().
level_id (str or None) – Optional identifier for a predefined level.
variant (VariantConfig or None) – Optional numberlink.config.VariantConfig to override template variant.
bridges (Iterable[Coord] or None) – Optional iterable of bridge coordinates as (row, col) pairs.
generator (GeneratorConfig or None) – Optional numberlink.config.GeneratorConfig to generate a level.
reward_config (RewardConfig or None) – Optional numberlink.config.RewardConfig override.
render_config (RenderConfig or None) – Optional numberlink.config.RenderConfig override.
step_limit (int or None) – Optional maximum number of steps before truncation. If None, a default is used.
palette (dict[str, RGBInt] or None) – Optional mapping of letter to RGB tuple used for rendering.
- Raises:
ValueError – If configured render dimensions are smaller than the grid dimensions.
RuntimeError – If the generator configuration is invalid for level generation.
- _apply_template(template: numberlink.level_setup.LevelTemplate) None[source]¶
Load derived attributes from a precomputed template.
- _compute_solution_actions() list[ActType] | None[source]¶
Convert a coordinate-based solution to a sequence of action indices.
Return None if no coordinate solution is available or when the active variant uses numberlink.config.VariantConfig.cell_switching_mode and conversion is not applicable.
- Returns:
List of action indices that solve the puzzle, or None if unavailable.
- Return type:
list[ActType] or None
- reset(*, seed: int | None = None, options: dict[str, Any] | None = None) tuple[ObsType, numberlink.env.InfoDict][source]¶
Reset the environment to the initial state.
- Parameters:
seed (int or None) – Optional random seed, as defined by gymnasium.Env.reset()
options (dict[str, Any] or None) – Optional reset options, as defined by gymnasium.Env.reset()
- Returns:
Tuple of the initial observation and an info dictionary.
The info dictionary contains the keys:
action_mask (numpy.ndarray): Binary action mask for valid actions
steps (int): Number of steps taken, 0 at reset
connected (numpy.ndarray): Boolean array indicating which colors are connected
level_id (str or None): Identifier of the loaded level when available
solved (bool): False at reset unless the level is trivially solved
deadlocked (bool): Whether the state is deadlocked
- Return type:
tuple[numpy.ndarray, dict]
Note
This method calls numberlink.env.NumberLinkRGBEnv.render() when render_mode is set to 'ansi' or 'human'.
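The action_mask entry can be used to restrict a policy to valid actions. The following is a minimal sketch, assuming only that the mask is a one-dimensional sequence whose nonzero entries mark valid actions. The helper name sample_valid_action is hypothetical and not part of numberlink.

```python
from __future__ import annotations

import random
from collections.abc import Sequence


def sample_valid_action(action_mask: Sequence[int], rng: random.Random) -> int | None:
    """Pick a random index whose mask entry is nonzero, or None if none is valid."""
    # Collect every action index that the mask marks as valid.
    valid = [index for index, flag in enumerate(action_mask) if flag]
    if not valid:
        # No valid action is left; the caller decides how to handle this.
        return None
    return rng.choice(valid)


# Example with a hand-written mask: only actions 1 and 4 are valid.
example_mask = [0, 1, 0, 0, 1]
chosen = sample_valid_action(example_mask, random.Random(0))
```

Passing an explicit random.Random instance keeps the choice reproducible without touching global random state.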
- step(action: ActType) tuple[ObsType, float, bool, bool, numberlink.env.InfoDict][source]¶
Apply an action and advance the environment state by one step.
Decode and apply action, update internal state, compute the reward for the step, and return the next observation and step information.
- Parameters:
action (ActType) – Encoded action in the discrete action space
- Returns:
Tuple (observation, reward, terminated, truncated, info) where the components are:
observation: the new observation frame
reward: the scalar reward for the step
terminated: whether the episode ended due to success or deadlock
truncated: whether the episode ended due to reaching max_steps (see the step_limit constructor parameter)
info: a dictionary with keys described in numberlink.env.NumberLinkRGBEnv.reset()
- Return type:
tuple[ObsType, float, bool, bool, InfoDict]
Note
terminated is True when the puzzle is solved or when the environment is deadlocked and cannot continue.
Note
truncated is True when the environment reaches max_steps.
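The five-tuple returned by step() follows the usual Gymnasium loop. The sketch below shows that loop against a stand-in object rather than the real environment. run_episode and CountdownEnv are hypothetical names introduced only for illustration, and the stub's reward values are arbitrary.

```python
from __future__ import annotations

from typing import Any, Callable


def run_episode(env: Any, choose: Callable[[Any, dict], int], max_iterations: int = 1000) -> tuple[float, dict]:
    """Run one episode and return (total_reward, last_info)."""
    obs, info = env.reset()
    total_reward = 0.0
    for _ in range(max_iterations):
        action = choose(obs, info)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += float(reward)
        # terminated: solved or deadlocked; truncated: step limit reached.
        if terminated or truncated:
            break
    return total_reward, info


class CountdownEnv:
    """Hypothetical stand-in that reports success after a fixed number of steps."""

    def __init__(self, steps_to_solve: int) -> None:
        self.steps_to_solve = steps_to_solve
        self.steps = 0

    def reset(self) -> tuple[None, dict]:
        self.steps = 0
        return None, {"steps": 0, "solved": False}

    def step(self, action: int) -> tuple[None, float, bool, bool, dict]:
        self.steps += 1
        solved = self.steps >= self.steps_to_solve
        info = {"steps": self.steps, "solved": solved}
        return None, (1.0 if solved else 0.0), solved, False, info


total, final_info = run_episode(CountdownEnv(steps_to_solve=3), lambda obs, info: 0)
```

Any object with the same reset() and step() return shapes can be passed as env.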
- render() gymnasium.core.RenderFrame | list[gymnasium.core.RenderFrame] | None[source]¶
Render the current environment state according to render_mode.
When render_mode is 'rgb_array', return an image frame of type numpy.ndarray. When render_mode is 'ansi' or 'human', return a textual representation. When render_mode is 'human' and numberlink.config.RenderConfig.print_text_in_human_mode is True, also print the textual representation to standard output.
- Returns:
A frame or textual rendering depending on the configured mode, or None when rendering is disabled
- Return type:
RenderFrame or list[RenderFrame] or None
- close() None[source]¶
Close the environment and release any acquired resources.
This method is present for API compatibility with Gymnasium. It may be a no-op for in-memory stateful environments.
- get_solution() list[ActType] | None[source]¶
Return the solution action sequence for the current level when available.
The returned list contains action indices that, when applied in sequence via numberlink.env.NumberLinkRGBEnv.step(), will produce a solved state. Return None if no solution is available.
For path-building variants, actions encode color, head, and direction. For cell-switching variants, actions encode cell assignments.
- Returns:
List of action indices that solve the puzzle, or None if unavailable
- Return type:
list[ActType] or None
- regenerate_level(seed: int | None = None) tuple[ObsType, numberlink.env.InfoDict][source]¶
Generate a new level using the stored generator configuration and reset the environment.
- Parameters:
seed (int or None) – Optional seed forwarded to the generator and used by numberlink.env.NumberLinkRGBEnv.reset().
- Returns:
Observation and info dictionary produced by numberlink.env.NumberLinkRGBEnv.reset()
- Return type:
tuple[ObsType, InfoDict]
- Raises:
RuntimeError – If the environment was not created with a generator configuration
- _metric(a: Coord, b: Coord) int[source]¶
Compute the distance between two grid coordinates using the active metric.
Use Chebyshev distance when diagonal moves are allowed, otherwise use Manhattan distance.
- Parameters:
a (Coord) – First coordinate as (row, col)
b (Coord) – Second coordinate as (row, col)
- Returns:
Distance between the two coordinates
- Return type:
int
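The choice between the two metrics can be sketched as follows. grid_distance and its allow_diagonal flag are hypothetical names; the library reads the setting from the active variant instead of taking it as an argument.

```python
from __future__ import annotations


def grid_distance(a: tuple[int, int], b: tuple[int, int], allow_diagonal: bool) -> int:
    """Distance between two (row, col) cells under the selected metric."""
    d_row = abs(a[0] - b[0])
    d_col = abs(a[1] - b[1])
    if allow_diagonal:
        # Chebyshev: a diagonal step covers one row and one column at once.
        return max(d_row, d_col)
    # Manhattan: only axis-aligned steps are possible.
    return d_row + d_col
```

For the cells (0, 0) and (2, 3), this gives 3 with diagonal moves and 5 without.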
- _decode_action(a: int) RGBInt[source]¶
Decode a packed action index into (color_index, head_index, direction).
Return (-1, -1, -1) when the supplied index is out of range.
- Parameters:
a (int) – Encoded action index
- Returns:
Decoded components as integers where each value is non-negative, or -1 for invalid inputs
- Return type:
RGBInt
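One common way to pack such a triple into a single index is mixed-radix encoding. The sketch below assumes a color-major layout with two heads per color. The actual digit order used by the library is not stated here, so the layout and the function names are assumptions made for illustration.

```python
from __future__ import annotations

NUM_HEADS = 2  # two heads per color, as stated in the module description


def encode_path_action(color: int, head: int, direction: int, num_dirs: int) -> int:
    """Pack (color, head, direction) into one index (assumed color-major layout)."""
    return (color * NUM_HEADS + head) * num_dirs + direction


def decode_path_action(a: int, num_colors: int, num_dirs: int) -> tuple[int, int, int]:
    """Unpack an index, returning (-1, -1, -1) when it is out of range."""
    if not 0 <= a < num_colors * NUM_HEADS * num_dirs:
        return (-1, -1, -1)
    # Peel off the least significant digit first, then split color and head.
    pair, direction = divmod(a, num_dirs)
    color, head = divmod(pair, NUM_HEADS)
    return (color, head, direction)
```

With 3 colors and 4 directions, the triple (1, 0, 2) maps to index 10 and decodes back to the same triple.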
- _decode_cell_switching_action(a: int) RGBInt[source]¶
Decode a cell-switching action into (row, col, color_value).
Return (-1, -1, -1) when the action index is invalid. Color value 0 means clear the cell. Values 1..N map to color indices 0..N-1.
- Parameters:
a (int) – Encoded cell-switching action index
- Returns:
Decoded (row, col, color_value), or (-1, -1, -1) if invalid
- Return type:
RGBInt
- encode_cell_switching_action(row: int, col: int, color_value: int) int[source]¶
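Encode a (row, col, color_value) triple into a discrete action index.
Color value 0 clears the cell. Color values 1..N map to color indices 0..N-1, matching numberlink.env.NumberLinkRGBEnv._decode_cell_switching_action().
- Parameters:
row (int) – Row index of the target cell
col (int) – Column index of the target cell
color_value (int) – 0 to clear the cell, or 1..N to assign a color
- Returns:
Encoded action index suitable for numberlink.env.NumberLinkRGBEnv.step()
- Return type:
int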
- Raises:
ValueError – If any component is out of the valid ranges
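The range check and the 0-means-clear convention can be sketched as below. The row-major layout, the explicit height, width, and num_colors arguments, and both function names are assumptions for illustration. The real method takes only (row, col, color_value) and reads the dimensions from the environment.

```python
from __future__ import annotations


def encode_cell_action(row: int, col: int, color_value: int, *, height: int, width: int, num_colors: int) -> int:
    """Pack (row, col, color_value) into one index (assumed row-major layout)."""
    in_range = 0 <= row < height and 0 <= col < width and 0 <= color_value <= num_colors
    if not in_range:
        raise ValueError("row, col, or color_value is out of range")
    # Each cell owns num_colors + 1 consecutive indices: one clear action plus one per color.
    return (row * width + col) * (num_colors + 1) + color_value


def color_index_from_value(color_value: int) -> int | None:
    """Map a color value to a zero-based color index; 0 means clear, so it has no index."""
    return None if color_value == 0 else color_value - 1
```

On a 4 by 5 grid with 3 colors, assigning color value 3 to cell (1, 2) gives index 31 under this layout.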
- _apply_action(a: int) bool[source]¶
Apply an encoded action according to the active environment mode.
Dispatch to numberlink.env.NumberLinkRGBEnv._apply_cell_switching_action() when the environment is in cell-switching mode. Otherwise dispatch to numberlink.env.NumberLinkRGBEnv._apply_path_action().
- _apply_cell_switching_action(a: int) bool[source]¶
Apply a cell-switching action that assigns or clears a non-endpoint cell.
Decode a into a target cell and a color value where 0 means clear. Prevent changes to endpoint cells and update per-color connectivity flags for any affected colors.
- _refresh_cell_switch_connections(colors: collections.abc.Iterable[int]) None[source]¶
Recompute the _closed connection flags for a subset of colors.
Only valid color indices within the range 0..num_colors-1 are considered. Use this method to avoid recomputing global connectivity after a local cell change.
- Parameters:
colors (Iterable[int]) – Iterable of zero-based color indices to refresh
- _is_color_connected(ci: int) bool[source]¶
Return whether both endpoints of color ci are connected by its current path.
- _cell_switch_connected(ci: int) bool[source]¶
Return whether a color’s endpoints are connected in cell-switching mode.
- _apply_path_action(a: int) bool[source]¶
Apply a path-building action for a color arm head.
Decode the action to (color_index, head_index, direction). Valid moves include stepping into an empty cell, backtracking onto the previous cell of the same arm, joining the other head, or occupying an untouched endpoint. Bridge lane semantics are enforced when moving onto bridge cells.
- _is_solved() bool[source]¶
Return whether the current environment state is a solved puzzle.
In cell-switching mode this method delegates to numberlink.env.NumberLinkRGBEnv._is_valid_solution(). In standard path-building mode all colors must be connected. When the active variant requires full coverage, all cells must also be filled.
- Returns:
True if the puzzle is solved, otherwise False
- Return type:
bool
- static _is_deadlocked(action_mask: numpy.typing.NDArray[numpy.uint8], solved: bool) bool[source]¶
Determine whether the environment is deadlocked.
The environment is deadlocked when there are no available actions and the puzzle is not already solved.
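This rule reduces to a single boolean expression. The sketch below assumes the mask is a one-dimensional sequence of 0/1 flags. The name is_deadlocked mirrors the private method but is a standalone illustration.

```python
from collections.abc import Iterable


def is_deadlocked(action_mask: Iterable[int], solved: bool) -> bool:
    """True when no action is available and the puzzle is not solved."""
    # any() is False for an all-zero mask, meaning no valid action remains.
    return not solved and not any(action_mask)
```

A solved state with an empty mask is therefore not reported as a deadlock.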
- _all_filled() bool[source]¶
Return whether all cells are occupied according to environment rules.
For non-bridge cells the grid code must be nonzero. For bridge cells at least one lane must be occupied.
- Returns:
True when every cell is occupied, otherwise False
- Return type:
bool
- _is_valid_solution() bool[source]¶
Return whether the current state is a valid NumberLink solution.
Validation rules
Endpoints of each color must be connected by a continuous path of that color
Paths must not branch or contain loops
If the active variant requires full coverage, every cell must be filled
- Returns:
True when the current state meets all validity rules, otherwise False
- Return type:
bool
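One way to check these rules on a plain grid is to walk each color from one endpoint to the other and reject any branch, dead end, loop, or stray cell. The sketch below is an illustration under simplifying assumptions (no bridge cells, no diagonal moves, one-based color codes with 0 for empty) and is not the library's implementation. The function name and argument layout are hypothetical.

```python
from __future__ import annotations

Coord = tuple[int, int]


def is_valid_solution(grid: list[list[int]], endpoints: dict[int, tuple[Coord, Coord]], must_fill: bool) -> bool:
    height, width = len(grid), len(grid[0])

    def same_color_neighbors(cell: Coord) -> list[Coord]:
        r, c = cell
        found = []
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and grid[nr][nc] == grid[r][c]:
                found.append((nr, nc))
        return found

    if must_fill and any(code == 0 for row in grid for code in row):
        return False

    for code, (start, end) in endpoints.items():
        if grid[start[0]][start[1]] != code or grid[end[0]][end[1]] != code:
            return False
        cells = {(r, c) for r in range(height) for c in range(width) if grid[r][c] == code}
        prev, cur = None, start
        visited = {start}
        while cur != end:
            # Exactly one way forward: zero means a dead end, two or more a branch.
            forward = [n for n in same_color_neighbors(cur) if n != prev]
            if len(forward) != 1:
                return False
            prev, cur = cur, forward[0]
            if cur in visited:
                return False  # the path loops back onto itself
            visited.add(cur)
        if len(same_color_neighbors(end)) != 1:
            return False  # the path continues past the endpoint
        if visited != cells:
            return False  # cells of this color lie off the path
    return True
```

Because every non-endpoint cell must have exactly one way forward, a path that touches itself side by side is rejected as a branch under this sketch.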
- _cell_has_color(cell: Coord, color_code: int) bool[source]¶
Return whether the given grid cell currently contains color_code.
For bridge cells, check both vertical and horizontal lanes for the color.
- _can_occupy(cell: Coord, color_code: int, *, lane: Lane) bool[source]¶
Test whether a color can occupy a target cell for a specific lane.
For non-bridge cells the cell must be empty. For bridge cells the requested lane must be empty or already occupied by the same color.
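The occupancy test can be sketched as follows. The lane keys "v" and "h" and the dictionary representation of a bridge cell are hypothetical. The library's Lane type and storage are not described in this section.

```python
from __future__ import annotations


def can_occupy(grid_code: int, lanes: dict[str, int] | None, color_code: int, lane: str) -> bool:
    """Return whether color_code may enter the cell through the given lane."""
    if lanes is None:
        # Regular cell: it must be empty.
        return grid_code == 0
    # Bridge cell: the requested lane must be empty or already hold the same color.
    return lanes[lane] in (0, color_code)
```

The two lanes are checked independently, so a different color in the other lane does not block the move.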
- _occupy_cell(cell: Coord, color_code: int, *, lane: Lane) None[source]¶
Occupy a cell with the provided color code for the specified lane.
For endpoints on bridge cells, the lane argument may be 'n' to indicate that both lanes should be marked.
- Parameters:
cell (Coord) – Target coordinate as (row, col)
color_code (int) – One-based color code to write into the grid
lane (Lane) – Lane specifier indicating which lane to set
- _mark_presence(ci: int, hi: int, cell: Coord) None[source]¶
Mark a coordinate as occupied in the per-arm presence grid.
- _clear_presence(ci: int, hi: int, cell: Coord) None[source]¶
Clear a coordinate from the per-arm presence grid.
- _erase_last(ci: int, hi: int) None[source]¶
Erase the last placed segment for a color arm.
Pop the last entry from the specified arm stack and clear the corresponding grid or lane value. Protect the starting endpoint for the specified head so it is not removed.
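The pop-with-protection behavior can be sketched on a plain list used as a stack. This simplified version ignores bridge lanes and the case where the arm's last cell is the opposite endpoint. The name erase_last and the returned flag are assumptions for illustration.

```python
from __future__ import annotations


def erase_last(stack: list[tuple[int, int]], grid: list[list[int]]) -> bool:
    """Remove the arm's most recent segment; return False if nothing was erased."""
    # The first entry is the arm's starting endpoint and must never be removed.
    if len(stack) <= 1:
        return False
    row, col = stack.pop()
    grid[row][col] = 0  # clear the grid value for the removed segment
    return True
```

Calling it repeatedly shrinks the arm back to its starting endpoint and then stops changing the grid.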
- clear_color_path(ci: int) None[source]¶
Clear all non-endpoint segments for color ci and reset its connection flag.
This utility is used by the interactive viewer when switching the active drawing endpoint. Remove every non-endpoint segment for both arms while leaving the original endpoints intact.
- Parameters:
ci (int) – Zero-based color index
- Raises:
ValueError – If ci is outside the valid range
RuntimeError – If invoked while the environment uses cell-switching mode
- _update_connection_status(ci: int) None[source]¶
Update the per-color connection flag by inspecting both arm heads.
A color is connected when both heads share the same position or when one head has reached the opposite endpoint and that endpoint’s stack length is greater than one.
- Parameters:
ci (int) – Zero-based color index
- _compute_action_mask() numpy.typing.NDArray[numpy.uint8][source]¶
Compute a binary action mask of valid actions for the current state.
Delegate to numberlink.env.NumberLinkRGBEnv._compute_cell_switching_mask() or numberlink.env.NumberLinkRGBEnv._compute_path_mask() depending on the active variant.
- Returns:
One-dimensional array where nonzero entries mark valid actions
- Return type:
NDArray[np.uint8]
- _compute_cell_switching_mask() numpy.typing.NDArray[numpy.uint8][source]¶
Compute the action mask for cell-switching mode.
If _cell_switch_mask is None, return a zero mask covering the full action space. Otherwise return a copy of the configured mask.
- Returns:
Action mask for the cell-switching action space
- Return type:
NDArray[np.uint8]
- _compute_path_mask() numpy.typing.NDArray[numpy.uint8][source]¶
Compute the action mask for standard path-building mode.
The mask marks legal head moves for every color and arm head. Legal moves include backtracking, meeting the other head, stepping onto an untouched endpoint, and moving into an empty lane or cell while respecting bridge semantics.
- Returns:
One-dimensional binary mask for the discrete action space
- Return type:
NDArray[np.uint8]
- _render_rgb() ObsType[source]¶
Produce an RGB image representing the current grid state.
Normal cells are filled from the palette based on _grid_codes. Bridge cells are composed by mixing lane colors when both lanes are present. Apply endpoint styling according to the active render configuration.
- Returns:
RGB image with shape (H, W, 3) and dtype uint8
- Return type:
ObsType
- _render_text() str[source]¶
Produce a compact textual representation of the grid for terminal output.
The header includes the step count and the number of connected colors. The grid uses these symbols:
. for empty regular cells
Uppercase letter for endpoints
Lowercase letter for path segments
For bridge cells, the first character represents the vertical lane and the second the horizontal lane. Each lane uses the same casing rules, or * when the lane is empty
- Returns:
Multi-line string representing the grid and status line
- Return type:
str
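The symbol rules for regular cells can be sketched as below. Bridge cells are left out, and the header format, the letter assignment (code 1 is A, code 2 is B, and so on), and the function name are assumptions made for illustration.

```python
from __future__ import annotations

import string


def render_text(grid_codes: list[list[int]], endpoints: set[tuple[int, int]], steps: int, connected: int) -> str:
    """Render regular cells: '.' empty, uppercase endpoint, lowercase path segment."""
    lines = [f"steps={steps} connected={connected}"]  # hypothetical header format
    for r, row in enumerate(grid_codes):
        chars = []
        for c, code in enumerate(row):
            if code == 0:
                chars.append(".")
                continue
            letter = string.ascii_uppercase[code - 1]  # one-based color code
            chars.append(letter if (r, c) in endpoints else letter.lower())
        lines.append("".join(chars))
    return "\n".join(lines)


text = render_text([[1, 1, 0], [0, 1, 0]], {(0, 0), (1, 1)}, steps=3, connected=1)
```

Here text consists of the header line followed by the rows "Aa." and ".A.".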
- spec: gymnasium.envs.registration.EnvSpec | None = None[source]¶
- action_space: gymnasium.spaces.Space[gymnasium.core.ActType] = None[source]¶
- observation_space: gymnasium.spaces.Space[gymnasium.core.ObsType] = None[source]¶
- _np_random: numpy.random.Generator | None = None[source]¶
- property unwrapped: gymnasium.core.Env[gymnasium.core.ObsType, gymnasium.core.ActType][source]¶
- property np_random: numpy.random.Generator[source]¶