rewardgym.environments

rewardgym.environments.base_env

class rewardgym.environments.base_env.BaseEnv(environment_graph: Dict, reward_locations: Dict, render_mode: str | None = None, info_dict: Dict | None = None, seed: int | Generator = 1000, name: str | None = None, n_actions: int | None = None, reduced_actions: int | None = None)[source]

Bases: Env

The basic environment class for the rewardGym module.

__init__(environment_graph: Dict, reward_locations: Dict, render_mode: str | None = None, info_dict: Dict | None = None, seed: int | Generator = 1000, name: str | None = None, n_actions: int | None = None, reduced_actions: int | None = None)[source]

The core environment, used for modeling and in part for display purposes. A construction sketch follows the parameter list below.

Parameters:
  • environment_graph (dict) – The main graph showing the association between states and actions.

  • reward_locations (dict) – Which locations in the graph are associated with a reward.

  • render_mode (str, optional) – Whether rendering is used, by default None

  • info_dict (dict, optional) – Additional information that should be associated with a node, by default defaultdict(int)

  • seed (Union[int, np.random.Generator], optional) – The random seed associated with the environment, used to create a generator, by default 1000

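A minimal construction sketch. The structure of environment_graph (node mapped to its successor nodes) and of reward_locations (node mapped to a reward value) shown here is an assumption for illustration only; consult the rewardgym task definitions for the exact formats expected:

    from rewardgym.environments.base_env import BaseEnv

    # Hypothetical two-step graph: node 0 branches into nodes 1 and 2,
    # which are terminal (empty successor lists). Assumed format.
    environment_graph = {0: [1, 2], 1: [], 2: []}

    # Hypothetical reward mapping: terminal nodes carry the reward. Assumed format.
    reward_locations = {1: 1, 2: 0}

    env = BaseEnv(
        environment_graph=environment_graph,
        reward_locations=reward_locations,
        render_mode=None,
        seed=1000,
    )
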
reset(agent_location: int = 0, condition: int | None = None) Tuple[int | array, Dict][source]

Resets the environment, moving everything back to the start. The agent_location and condition arguments can be used to specify task features; see the usage sketch below.

Parameters:
  • agent_location (int, optional) – Where in the graph the agent should be placed, by default 0

  • condition (int, optional) – Setting a potential condition for the trial, by default None

Returns:

The observation at that node in the graph and the associated info.

Return type:

Tuple[Union[int, np.array], dict]

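A hedged usage sketch of reset(), continuing the env object constructed above:

    # Reset to the starting node; returns the initial observation and its info dict.
    obs, info = env.reset(agent_location=0)

    # Optionally fix a trial condition (its meaning depends on the task definition).
    obs, info = env.reset(agent_location=0, condition=None)
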
step(action: int | None = None, step_reward: bool = False) Tuple[int | array, int, bool, bool, dict][source]

Steps through the graph, acquiring a new observation; see the usage sketch below.

Parameters:
  • action (int, optional) – The action made by the agent, by default None

  • step_reward (bool, optional) – Only necessary if rewards are episode sensitive; if True, calls all reward objects (while ignoring their output), not only the selected one, by default False

Returns:

The new observation, the reward associated with the action, whether the episode is terminated, whether the episode has been truncated (always False), and the new observation’s info.

Return type:

Tuple[Union[int, np.array], int, bool, bool, dict]

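A hedged sketch of an episode loop built on step(); the fixed action is a placeholder for an agent's policy and is not part of the rewardgym API:

    obs, info = env.reset(agent_location=0)
    terminated = False
    while not terminated:
        action = 0  # placeholder choice; a real agent would select actions here
        obs, reward, terminated, truncated, info = env.step(action)
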
rewardgym.environments.render_env

class rewardgym.environments.render_env.RenderEnv(environment_graph: dict, reward_locations: dict, render_mode: str | None = None, info_dict: dict = {}, seed: int | Generator = 1000, name: str | None = None)[source]

Bases: BaseEnv

__init__(environment_graph: dict, reward_locations: dict, render_mode: str | None = None, info_dict: dict = {}, seed: int | Generator = 1000, name: str | None = None)[source]

Environment for rendering tasks to the screen using pygame. A construction sketch follows the parameter list below.

Parameters:
  • environment_graph (dict) – The main graph showing the association between states and actions.

  • reward_locations (dict) – Which locations in the graph are associated with a reward.

  • render_mode (str, optional) – Whether rendering is used, by default None

  • info_dict (dict, optional) – Additional information that should be associated with a node, by default defaultdict(int)

  • seed (Union[int, np.random.Generator], optional) – The random seed associated with the environment, used to create a generator, by default 1000

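A hedged construction sketch for RenderEnv; the graph and reward structures are the same hypothetical ones used above, and the "human" render-mode name is an assumption:

    from rewardgym.environments.render_env import RenderEnv

    render_env = RenderEnv(
        environment_graph={0: [1, 2], 1: [], 2: []},  # hypothetical graph format
        reward_locations={1: 1, 2: 0},                # hypothetical reward format
        render_mode="human",                          # assumed render-mode name
        seed=1000,
    )
    # ... run the task with reset()/step() ...
    render_env.close()  # closes the pygame display and quits pygame
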
close() None[source]

Closes the pygame display and quits pygame.