rewardgym.environments
rewardgym.environments.base_env
- class rewardgym.environments.base_env.BaseEnv(environment_graph: Dict, reward_locations: Dict, render_mode: str | None = None, info_dict: Dict | None = None, seed: int | Generator = 1000, name: str | None = None, n_actions: int | None = None, reduced_actions: int | None = None)[source]
Bases:
Env
The basic environment class for the rewardGym module.
- __init__(environment_graph: Dict, reward_locations: Dict, render_mode: str | None = None, info_dict: Dict | None = None, seed: int | Generator = 1000, name: str | None = None, n_actions: int | None = None, reduced_actions: int | None = None)[source]
The core environment used for modeling and in part for displays.
- Parameters:
environment_graph (dict) – The main graph showing the association between states and actions.
reward_locations (dict) – Which locations in the graph are associated with a reward.
render_mode (str, optional) – Whether to use rendering, by default None
info_dict (dict, optional) – Additional information that should be associated with a node, by default defaultdict(int)
seed (Union[int, np.random.Generator], optional) – The random seed associated with the environment, used to create a random number generator, by default 1000
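A minimal construction sketch. The layout of `environment_graph` and `reward_locations` below is an illustrative assumption, not the library's documented format; only the constructor keywords follow the signature above.

```python
from rewardgym.environments.base_env import BaseEnv

# Hypothetical two-step task: node 0 offers two actions leading to the
# terminal nodes 1 and 2. The dict layout is assumed for illustration only.
environment_graph = {
    0: [1, 2],  # transitions available from node 0
    1: [],      # terminal node
    2: [],      # terminal node
}

# Assumed mapping from terminal nodes to reward values; the library may
# expect reward objects instead of plain numbers.
reward_locations = {1: 1, 2: 0}

env = BaseEnv(
    environment_graph=environment_graph,
    reward_locations=reward_locations,
    seed=1000,
)
```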
- reset(agent_location: int = 0, condition: int | None = None) Tuple[int | array, Dict] [source]
Resets the environment, moving everything back to the start. Conditions and agent locations can be used to specify task features.
- Parameters:
agent_location (int, optional) – Where in the graph the agent should be placed, by default 0
condition (int, optional) – Sets a potential condition for the trial, by default None
- Returns:
The observation at that node in the graph and the associated info.
- Return type:
Tuple[Union[int, np.array], dict]
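A usage sketch for `reset()`, assuming an `env` constructed as in the example above; `agent_location=0` places the agent at the starting node.

```python
# Reset returns the observation at the chosen node and its info dict.
obs, info = env.reset(agent_location=0)
```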
- step(action: int | None = None, step_reward: bool = False) Tuple[int | array, int, bool, bool, dict] [source]
Steps through the graph, acquiring a new observation.
- Parameters:
action (int, optional) – The action made by the agent, by default None
step_reward (bool, optional) – Only necessary if rewards are episode sensitive; if True, calls all reward objects (not only the selected one) while ignoring their output, by default False
- Returns:
The new observation, the reward associated with the action, whether the episode has terminated, whether the episode has been truncated (always False), and the new observation’s info.
- Return type:
Tuple[Union[int, np.array], int, bool, bool, dict]
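A minimal interaction loop matching the five-element return signature of `step()`, assuming the `env` from the construction sketch above; the fixed `action = 0` is a placeholder for an agent's choice.

```python
obs, info = env.reset(agent_location=0)
terminated = False
while not terminated:
    action = 0  # placeholder policy; substitute an agent's action selection
    obs, reward, terminated, truncated, info = env.step(action)
```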
rewardgym.environments.render_env
- class rewardgym.environments.render_env.RenderEnv(environment_graph: dict, reward_locations: dict, render_mode: str | None = None, info_dict: dict = {}, seed: int | Generator = 1000, name: str | None = None)[source]
Bases:
BaseEnv
- __init__(environment_graph: dict, reward_locations: dict, render_mode: str | None = None, info_dict: dict = {}, seed: int | Generator = 1000, name: str | None = None)[source]
Environment to render tasks to the screen using pygame.
- Parameters:
environment_graph (dict) – The main graph showing the association between states and actions.
reward_locations (dict) – Which locations in the graph are associated with a reward.
render_mode (str, optional) – Whether to use rendering, by default None
info_dict (dict, optional) – Additional information that should be associated with a node, by default defaultdict(int)
seed (Union[int, np.random.Generator], optional) – The random seed associated with the environment, used to create a random number generator, by default 1000
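A construction sketch for `RenderEnv`, reusing the illustrative graph layout from the `BaseEnv` example; the `"human"` value for `render_mode` is an assumption, check the library for the accepted values.

```python
from rewardgym.environments.render_env import RenderEnv

# Same illustrative graph layout as in the BaseEnv sketch (an assumption).
environment_graph = {0: [1, 2], 1: [], 2: []}
reward_locations = {1: 1, 2: 0}

# "human" as a render_mode is assumed here; pygame draws the task on screen.
render_env = RenderEnv(
    environment_graph=environment_graph,
    reward_locations=reward_locations,
    render_mode="human",
)
```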