
Play with OpenAI Gym in Ubuntu 16.04: Hello World

2017-07-12 20:13
OpenAI Gym is a toolkit for developing and comparing reinforcement learning algorithms.


git clone https://github.com/openai/gym
cd gym
sudo pip install -e .

That is the minimal install. You can also try the full install, which pulls in the dependencies for every environment.

A Demo

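import gym
env = gym.make('CartPole-v0')
env.reset()  # must be called before the first step
for _ in range(1000):
    env.render()  # draw the current frame
    env.step(env.action_space.sample())  # take a random action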

This is just a demo to verify that your Gym installation works.


import gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
    observation = env.reset()  # start a new episode
    for t in range(100):
        action = env.action_space.sample()  # pick a random action
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break  # stop stepping once the episode has ended

The environment’s step function returns the feedback an agent needs in order to do better than random actions. It returns four values:

observation (object): An environment-specific object representing your observation of the environment, i.e. the state.

reward (float): The amount of reward achieved by the previous action.

done (boolean): Whether the episode has terminated. Once done is True, call reset() before stepping again.

info (dict): Diagnostic information useful for debugging.
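
To make the four return values concrete, here is a minimal sketch of the agent-environment loop run against a toy stand-in. CountdownEnv and run_episode are hypothetical names and are not part of Gym; the class only mimics the reset/step contract described above, so the example runs without Gym installed.

```python
import random


class CountdownEnv:
    """Hypothetical toy environment (not part of Gym) mimicking reset()/step()."""

    def __init__(self, horizon=5):
        self.horizon = horizon  # number of steps before the episode ends
        self.remaining = horizon

    def reset(self):
        # Start a new episode and return the initial observation.
        self.remaining = self.horizon
        return self.remaining

    def step(self, action):
        # Advance one timestep; reward is 1.0 for action 1, otherwise 0.0.
        self.remaining -= 1
        observation = self.remaining
        reward = 1.0 if action == 1 else 0.0
        done = self.remaining == 0
        info = {"last_action": action}
        return observation, reward, done, info


def run_episode(env, policy):
    """Run one episode and return (total_reward, timesteps)."""
    observation = env.reset()
    total_reward, timesteps, done = 0.0, 0, False
    while not done:
        action = policy(observation)
        observation, reward, done, info = env.step(action)
        total_reward += reward
        timesteps += 1
    return total_reward, timesteps


if __name__ == "__main__":
    # A random policy, analogous to env.action_space.sample().
    total, steps = run_episode(CountdownEnv(horizon=5),
                               lambda obs: random.choice([0, 1]))
    print("Episode finished after {} timesteps, total reward {}".format(steps, total))
```

The loop stops as soon as done is True, which is the same role the done flag plays in the CartPole example.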

Every environment comes with first-class Space objects that describe the valid actions and observations.

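print(env.action_space)
#> Discrete(2)
print(env.observation_space)
#> Box(4,)
print(env.observation_space.high)
#> array([ 2.4       ,         inf,  0.20943951,         inf])
print(env.observation_space.low)
#> array([-2.4       ,        -inf, -0.20943951,        -inf])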

The Discrete space allows a fixed range of non-negative numbers, so in this case valid actions are either 0 or 1. The Box space represents an n-dimensional box, so valid observations will be an array of 4 numbers. Box and Discrete are the most common Spaces. You can sample from a Space or check that something belongs to it:

from gym import spaces
space = spaces.Discrete(8) # Set with 8 elements {0, 1, 2, ..., 7}
x = space.sample()
assert space.contains(x)
assert space.n == 8
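
As a rough illustration of what "belongs to a space" means for the two space types, the sketch below re-implements the membership checks in plain Python, using the CartPole-v0 bounds printed above. box_contains and discrete_contains are hypothetical helpers and not Gym's API; Gym's own contains methods may perform additional checks (for example on dtype).

```python
import math

# Bounds copied from the CartPole-v0 observation space printed above.
HIGH = [2.4, math.inf, 0.20943951, math.inf]
LOW = [-2.4, -math.inf, -0.20943951, -math.inf]


def box_contains(x, low=LOW, high=HIGH):
    """Hypothetical helper (not Gym's API): True if x lies inside the box."""
    if len(x) != len(low):
        return False  # wrong number of dimensions
    return all(lo <= xi <= hi for xi, lo, hi in zip(x, low, high))


def discrete_contains(x, n):
    """Hypothetical helper (not Gym's API): True if x is an integer in {0, ..., n-1}."""
    return isinstance(x, int) and 0 <= x < n


if __name__ == "__main__":
    print(box_contains([0.0, 10.0, 0.1, -5.0]))  # within all four bounds
    print(box_contains([3.0, 0.0, 0.0, 0.0]))    # first component exceeds 2.4
    print(discrete_contains(1, 2))               # 1 is a valid CartPole action
```

The second and fourth components are unbounded, so any finite value is accepted there; only the first and third components can push an observation outside the box.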


from gym import envs
print(envs.registry.all())

This will give you a list of EnvSpecs, one for each registered environment.

Record & Upload

Wrap your environment with a Monitor Wrapper as follows:

import gym
from gym import wrappers
env = gym.make('CartPole-v0')
# Record results under the given directory
env = wrappers.Monitor(env, '/tmp/cartpole-experiment-1')
for i_episode in range(20):
    observation = env.reset()  # start a new episode
    for t in range(100):
        action = env.action_space.sample()  # pick a random action
        observation, reward, done, info = env.step(action)
        if done:
            print("Episode finished after {} timesteps".format(t+1))
            break  # stop stepping once the episode has ended

You may need to install ffmpeg first, since the Monitor uses it to encode videos:

sudo apt-get install ffmpeg

You can then upload your results to OpenAI Gym:

import gym
gym.upload('/tmp/cartpole-experiment-1', api_key='YOUR_API_KEY')