Welcome to rewardGym’s documentation!
rewardGym
Ambitiously called rewardGym, this is part of the rewardMap project.
The project’s goal is to provide two things:
A common language for reward tasks used in research.
A common interface to display and collect data for these tasks.
Under the hood, this module uses gymnasium [cit1]. The package has been greatly inspired by neuro-nav [cit2], especially in its use of a graph structure to represent the tasks.
Many thanks also to physiopy, from which I took much of the automation around the repository (such as workflows and PR labels)!
Installation
I recommend creating a new Python environment (using e.g. venv or conda).
Then install the package and all necessary dependencies using:
pip install git+https://github.com/rewardMap/rewardGym
Alternatively, download / clone the repository and install from there:
git clone https://github.com/rewardMap/rewardGym
cd rewardGym
pip install -e .
Usage
The package should then be importable as usual. See the rest of the documentation for further information.
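A quick way to check that the installation worked is to import the package and create one of the environments (a minimal sketch; get_env and the 'hcp' identifier are taken from the usage example below):
import rewardgym
from rewardgym import get_env

env = get_env('hcp')      # gambling task from the human connectome project
obs, info = env.reset()   # gymnasium-style reset
print(env.n_states, env.n_actions, obs)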
Use PsychoPy for data collection
There might be cases where you want to use this package purely for data collection. In the current release, basic logging is supported.
This is also possible using PsychoPy Standalone [cit3] (only tested with version 2023.2.3; early 2024 versions were incompatible due to the GUI structure).
For this, clone or download the repository, e.g.:
git clone https://github.com/rewardMap/rewardGym
IMPORTANT: Afterwards, you can use the PsychoPy Coder to run rewardgym_psychopy.py, which is located in the root directory of the repository.
Outputs of this program will be saved by default in the data directory.
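If you want to inspect these logs in Python afterwards, something like the following can work (a minimal sketch; the .tsv extension and tab delimiter are assumptions, so adjust them to the files you actually find in the data directory):
from pathlib import Path
import csv

data_dir = Path("data")  # default output directory of rewardgym_psychopy.py
for log_file in sorted(data_dir.glob("*.tsv")):  # assumed file extension
    with log_file.open(newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    print(log_file.name, len(rows), "rows")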
Run the environment and train an agent
Running a task and training a simple agent could look like the following:
from rewardgym import get_env
from rewardgym.agents.base_agent import QAgent

# Create the gambling task from the human connectome project.
env = get_env('hcp')

# A tabular Q-learning agent (learning rate and exploration temperature).
agent = QAgent(learning_rate=0.1, temperature=0.2,
               action_space=env.n_actions, state_space=env.n_states)

n_episodes = 1000

for t in range(n_episodes):
    obs, info = env.reset()
    done = False

    while not done:
        # Select an action, take a step, and update the agent's values.
        action = agent.get_action(obs)
        next_obs, reward, terminated, truncated, info = env.step(action)
        agent.update(obs, action, reward, terminated, next_obs)

        done = terminated or truncated
        obs = next_obs
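To get a rough sense of how well the agent has learned, you can afterwards run a few episodes without calling agent.update() and track the collected reward (a small sketch reusing only the calls shown above):
# Evaluation: run episodes with the trained agent, but without learning updates.
n_eval_episodes = 100
total_reward = 0.0

for _ in range(n_eval_episodes):
    obs, info = env.reset()
    done = False
    while not done:
        action = agent.get_action(obs)
        obs, reward, terminated, truncated, info = env.step(action)
        total_reward += reward
        done = terminated or truncated

print("Average reward per episode:", total_reward / n_eval_episodes)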
Contributing
First off, thanks for taking the time to contribute! ❤️
All types of contributions are encouraged and valued! Unfortunately, there is no detailed contribution guide yet, but one is planned!
If you have a question and cannot find an answer in the documentation, or the documentation is unclear, please do not hesitate to open an issue.
The same goes for any kind of bug report.
Before you work on an enhancement, please open an issue first, where we will discuss whether it is in the scope of the toolbox.
Finally, if you want to add a new task, also open an issue, and we will help you implement it in the toolbox.
Play a task (currently out of order)
To play one of the tasks using a simplified pygame implementation, you can e.g. run:
rg_play hcp --window 700 --n 5
This plays the gambling task from the human connectome project in a window of 700 x 700 pixels for 5 trials.
The available tasks are (each identifier can also be passed to get_env, as sketched after this list):
- hcp
Gambling task from the human connectome project. Response buttons are: left + right.
- mid
Monetary incentive delay task. Response button is: space.
- two-step
The classic two-step task. Response buttons are: left + right.
- risk-sensitive
Risk-sensitive decision making task; contains both decisions between two outcomes and singular events. Response buttons are: left + right.
- posner
Posner task. Response buttons are: left + right.
- gonogo
Go/No-Go task; different stimuli indicate e.g. go to win or go to avoid punishment. Response button is: space.
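The same identifiers can be passed to get_env to instantiate the corresponding environments in Python (a small sketch; only 'hcp' appears in the usage example above, the other names are assumed to follow the same pattern):
from rewardgym import get_env

for name in ["hcp", "mid", "two-step", "risk-sensitive", "posner", "gonogo"]:
    env = get_env(name)
    print(name, "-", env.n_states, "states,", env.n_actions, "actions")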
References
Towers, M., Terry, J. K., Kwiatkowski, A., Balis, J. U., Cola, G. de, Deleu, T., Goulão, M., Kallinteris, A., KG, A., Krimmel, M., Perez-Vicente, R., Pierré, A., Schulhoff, S., Tai, J. J., Shen, A. T. J., & Younis, O. G. (2023). Gymnasium. Zenodo. https://doi.org/10.5281/zenodo.8127026
Juliani, A., Barnett, S., Davis, B., Sereno, M., & Momennejad, I. (2022). Neuro-Nav: A Library for Neurally-Plausible Reinforcement Learning (arXiv:2206.03312). arXiv. https://doi.org/10.48550/arXiv.2206.03312
Peirce, J., Gray, J. R., Simpson, S., MacAskill, M., Höchenberger, R., Sogo, H., Kastman, E., & Lindeløv, J. K. (2019). PsychoPy2: Experiments in behavior made easy. Behavior Research Methods, 51(1), 195–203. https://doi.org/10.3758/s13428-018-01193-y
Walkthrough
Deep Dive
Tasks