Status: Maintenance (expect bug fixes and minor updates)
OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms. This is the gym open-source library, which gives you access to a standardized set of environments.
gym makes no assumptions about the structure of your agent, and is compatible with any numerical computation library, such as TensorFlow or Theano. You can use it from Python code, and soon from other languages.
If you're not sure where to start, we recommend beginning with the docs on our site. See also the FAQ.
A whitepaper for OpenAI Gym is available at http://arxiv.org/abs/1606.01540, and here's a BibTeX entry that you can use to cite it in a publication:
@misc{1606.01540,
  Author = {Greg Brockman and Vicki Cheung and Ludwig Pettersson and Jonas Schneider and John Schulman and Jie Tang and Wojciech Zaremba},
  Title = {OpenAI Gym},
  Year = {2016},
  Eprint = {arXiv:1606.01540},
}
There are two basic concepts in reinforcement learning: the environment (namely, the outside world) and the agent (namely, the algorithm you are writing). The agent sends actions to the environment, and the environment replies with observations and rewards (that is, a score).
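The core gym interface is Env, which is the unified environment interface. There is no interface for agents; that part is left to you. The following are the Env methods you should know:
- reset(self): Reset the environment's state. Returns observation.
- step(self, action): Step the environment by one timestep. Returns observation, reward, done, info.
- render(self, mode='human'): Render one frame of the environment. The default mode will do something human friendly, such as pop up a window.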
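The agent-environment loop described above can be sketched without installing anything. The example below is only an illustration: CoinFlipEnv and run_episode are hypothetical names, not part of gym, and the class merely imitates the reset/step/render shape of Env so the loop can be read end to end.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment that imitates the Env method shapes."""

    def __init__(self, episode_length=5, seed=0):
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0
        self.coin = 0

    def reset(self):
        # Start a new episode and return the first observation.
        self.t = 0
        self.coin = self.rng.randint(0, 1)
        return self.coin

    def step(self, action):
        # Reward 1.0 when the action matches the coin that was last observed.
        reward = 1.0 if action == self.coin else 0.0
        self.t += 1
        self.coin = self.rng.randint(0, 1)
        done = self.t >= self.episode_length
        return self.coin, reward, done, {}

    def render(self, mode="human"):
        print(f"t={self.t} coin={self.coin}")


def run_episode(env, policy):
    """Run one episode: the agent sends actions, the env replies."""
    observation = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
    return total_reward


# An "agent" that simply echoes the observation always matches the coin.
total = run_episode(CoinFlipEnv(), policy=lambda obs: obs)
print(total)
```

With a real gym environment the loop body would look the same; only the construction of the environment would differ.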
We currently support Linux and OS X running Python 3.5 to 3.8. Windows support is experimental: algorithmic, toy_text, classic_control and atari should work on Windows (see next section for installation instructions); nevertheless, proceed at your own risk.
You can perform a minimal install of gym with:
git clone https://github.com/openai/gym.git
cd gym
pip install -e .
If you prefer, you can do a minimal install of the packaged version directly from PyPI:
pip install gym
You'll be able to run a few environments right away:
- algorithmic
- toy_text
- classic_control (you'll need pyglet to render though)
We recommend playing with those environments at first, and then later installing the dependencies for the remaining environments.
You can also run gym on gitpod.io to play with the examples online. In the preview window you can click on the mp4 file you want to view. If you want to view another mp4 file, just press the back button and click on another mp4 file.
To install the full set of environments, you'll need to have some system packages installed. We'll build out the list here over time; please let us know what you end up installing on your platform. Also, take a look at the docker files (py.Dockerfile) to see the composition of our CI-tested images.
On Ubuntu 16.04 and 18.04:
apt-get install -y libglu1-mesa-dev libgl1-mesa-dev libosmesa6-dev xvfb ffmpeg curl patchelf libglfw3 libglfw3-dev cmake zlib1g zlib1g-dev swig
MuJoCo has a proprietary dependency we can't set up for you. Follow the instructions in the mujoco-py package for help. Note that we currently do not support MuJoCo 2.0 and above, so you will need to install a version of mujoco-py which is built for a lower version of MuJoCo like MuJoCo 1.5 (for example, mujoco-py-1.50.1.0).
As an alternative to mujoco-py, consider PyBullet, which uses the open source Bullet physics engine and has no license requirement.
Once you're ready to install everything, run pip install -e '.[all]' (or pip install 'gym[all]').
To run pip install -e '.[all]', you'll need a semi-recent pip. Please make sure your pip is at least at version 1.5.0. You can upgrade using the following: pip install --ignore-installed pip. Alternatively, you can open setup.py and install the dependencies by hand.
If you're trying to render video on a server, you'll need to connect a fake display. The easiest way to do this is by running under xvfb-run (on Ubuntu, install the xvfb package):
xvfb-run -s "-screen 0 1400x900x24" bash
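If you are unsure whether a script is running on a machine without a display, a small check like the one below can warn you before rendering fails. This is only a sketch: needs_virtual_display is a hypothetical helper, not part of gym, and it assumes an X11 setup where a missing DISPLAY variable means no display is attached.

```python
import os
import sys


def needs_virtual_display(environ=None, platform=None):
    """Return True on Linux when no X display is configured (assumption: X11)."""
    environ = os.environ if environ is None else environ
    platform = sys.platform if platform is None else platform
    return platform.startswith("linux") and not environ.get("DISPLAY")


if needs_virtual_display():
    print("No DISPLAY found; consider running under xvfb-run.")
```

The arguments exist only so the check can be exercised with a fake environment; in normal use you would call it with no arguments.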
If you'd like to install the dependencies for only specific environments, see setup.py. We maintain the lists of dependencies on a per-environment group basis.
See List of Environments and the gym site.
For information on creating your own environments, see Creating your own Environments.
See the examples directory.
We are using pytest for tests. You can run them via:
pytest
2018-02-28: Release of a set of new robotics environments.
2018-01-25: Made some aesthetic improvements and removed unmaintained parts of gym. This may seem like a downgrade in functionality, but it is actually a long-needed cleanup in preparation for some great new things that will be released in the next month.
- Now your Env and Wrapper subclasses should define step, reset, render, close, seed rather than underscored method names.
- Removed the board_game, debugging, safety, parameter_tuning environments since they're not being maintained by us at OpenAI. We encourage authors and users to create new repositories for these environments.
- Changed MultiDiscrete action space to range from [0, ..., n-1] rather than [a, ..., b-1].
- No more render(close=True), use env-specific methods to close the rendering.
- Removed the scoreboard directory, since the site no longer exists.
- Moved gym/monitoring to gym/wrappers/monitoring.
- Added dtype to Space.
- Stopped using Python's built-in logging module; gym.logger is used instead.
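To make the MultiDiscrete change in the list above concrete: each component i of an action is now an integer in [0, n_i - 1]. The sketch below reproduces that convention in plain Python. The functions sample_multi_discrete and contains are hypothetical stand-ins written for this illustration, not gym's implementation.

```python
import random


def sample_multi_discrete(nvec, rng=None):
    """Draw one action: component i is an integer in [0, nvec[i] - 1]."""
    rng = rng or random.Random()
    return [rng.randrange(n) for n in nvec]


def contains(nvec, action):
    """Check that an action has the right length and every component is in range."""
    return len(action) == len(nvec) and all(
        isinstance(a, int) and 0 <= a < n for a, n in zip(action, nvec)
    )


nvec = [5, 2, 2]  # three components with 5, 2 and 2 possible values
action = sample_multi_discrete(nvec, random.Random(0))
print(action, contains(nvec, action))
```

Under the old [a, ..., b-1] convention a lower bound had to be carried alongside each component; with the zero-based range only the sizes are needed.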
2018-01-24: All continuous control environments now use mujoco_py >= 1.50. Versions have been updated accordingly to -v2, e.g. HalfCheetah-v2. Performance should be similar (see https://github.com/openai/gym/pull/834) but there are likely some differences due to changes in MuJoCo.
2017-06-16: Make env.spec into a property to fix a bug that occurs when you try to print out an unregistered Env.
2017-05-13: BACKWARDS INCOMPATIBILITY: The Atari environments are now at v4. To keep using the old v3 environments, keep gym <= 0.8.2 and atari-py <= 0.0.21. Note that the v4 environments will not give identical results to existing v3 results, although differences are minor. The v4 environments incorporate the latest Arcade Learning Environment (ALE), including several ROM fixes, and now handle loading and saving of the emulator state. While seeds still ensure determinism, the effect of any given seed is not preserved across this upgrade because the random number generator in ALE has changed. The *NoFrameSkip-v4 environments should be considered the canonical Atari environments from now on.
2017-03-05: BACKWARDS INCOMPATIBILITY: The configure method has been removed from Env. configure was not used by gym, but was used by some dependent libraries including universe. These libraries will migrate away from the configure method by using wrappers instead. This change is on master and will be released with 0.8.0.
2016-12-27: BACKWARDS INCOMPATIBILITY: The gym monitor is now a wrapper. Rather than starting monitoring as env.monitor.start(directory), envs are now wrapped as follows: env = wrappers.Monitor(env, directory). This change is on master and will be released with 0.7.0.
2016-11-01: Several experimental changes to how a running monitor interacts with environments. The monitor will now raise an error if reset() is called when the env has not returned done=True. The monitor will only record complete episodes where done=True. Finally, the monitor no longer calls seed() on the underlying env, nor does it record or upload seed information.
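The reset and recording rules in this entry can be pictured with a small wrapper. The sketch below is not the gym monitor: StrictResetWrapper and TwoStepEnv are hypothetical classes that only illustrate the behaviour described (error on an early reset(), and recording only episodes that reach done=True).

```python
class StrictResetWrapper:
    """Hypothetical wrapper that refuses reset() in the middle of an episode."""

    def __init__(self, env):
        self.env = env
        self.episode_open = False
        self.completed_returns = []
        self._current_return = 0.0

    def reset(self):
        if self.episode_open:
            raise RuntimeError("reset() called before the episode returned done=True")
        self.episode_open = True
        self._current_return = 0.0
        return self.env.reset()

    def step(self, action):
        observation, reward, done, info = self.env.step(action)
        self._current_return += reward
        if done:
            # Only complete episodes are recorded.
            self.completed_returns.append(self._current_return)
            self.episode_open = False
        return observation, reward, done, info


class TwoStepEnv:
    """Hypothetical environment whose episodes always last two steps."""

    def reset(self):
        self.t = 0
        return 0

    def step(self, action):
        self.t += 1
        return self.t, 1.0, self.t >= 2, {}


env = StrictResetWrapper(TwoStepEnv())
env.reset()
env.step(0)
try:
    env.reset()  # the episode is still running, so this raises
except RuntimeError as err:
    print("error:", err)
env.step(0)  # finishes the episode
env.reset()  # allowed now
print(env.completed_returns)
```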
2016-10-31: We're experimentally expanding the environment ID format to include an optional username.
2016-09-21: Switch the Gym automated logger setup to configure the root logger rather than just the 'gym' logger.
2016-08-17: Calling close on an env will also close the monitor and any rendering windows.
2016-08-17: The monitor will no longer write manifest files in real-time, unless write_upon_reset=True is passed.
2016-05-28: For controlled reproducibility, envs now support seeding (cf #91 and #135). The monitor records which seeds are used. We will soon add seed information to the display on the scoreboard.
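As an illustration of what seeding buys you, the sketch below uses a hypothetical RandomWalkEnv (not a gym environment) that owns its random number generator. Seeding it with the same value before a rollout gives the same trajectory. The seed method here only imitates the idea; its exact signature and return value are assumptions made for this example.

```python
import random


class RandomWalkEnv:
    """Hypothetical environment with its own random number generator."""

    def __init__(self):
        self.rng = random.Random()
        self.position = 0

    def seed(self, seed=None):
        # Replace the generator so later noise is determined by the seed.
        self.rng = random.Random(seed)
        return [seed]

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        # The environment adds noise drawn from its own generator.
        self.position += action + self.rng.choice([-1, 0, 1])
        return self.position, float(-abs(self.position)), False, {}


def rollout(seed, steps=10):
    """Seed a fresh environment, then record the observations of a fixed policy."""
    env = RandomWalkEnv()
    env.seed(seed)
    env.reset()
    return [env.step(1)[0] for _ in range(steps)]


print(rollout(42) == rollout(42))
```

Keeping the generator inside the environment, rather than relying on the global random state, is what lets two rollouts with the same seed agree regardless of what other code draws random numbers.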