
Stable baselines

I need a baseline to compare results against.

To install,

https://stable-baselines.readthedocs.io/en/master/guide/install.html

pip install git+https://github.com/hill-a/stable-baselines

Successfully installed absl-py-0.9.0 astor-0.8.1 gast-0.2.2 google-pasta-0.2.0 grpcio-1.30.0 h5py-2.10.0 keras-applications-1.0.8 keras-preprocessing-1.1.2 opt-einsum-3.2.1 stable-baselines-2.10.1a1 tensorboard-1.15.0 tensorflow-1.15.3 tensorflow-estimator-1.15.1 termcolor-1.1.0 wrapt-1.12.1

I’d originally done

pip install stable-baselines[mpi]

but the GitHub install pulls in the dependencies too.

OK, so pybullet comes with an ‘enjoy’ program, which lives in:

~/.local/lib/python3.6/site-packages/pybullet_envs/stable_baselines

You can run it using:

python3 -m pybullet_envs.stable_baselines.enjoy --algo td3 --env HalfCheetahBulletEnv-v0

Ok I set up ppo2 and tried to run python3 ppo2.py

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 106, in spec
    importlib.import_module(mod_name)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'gym-robotable'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ppo2.py", line 29, in <module>
    env = gym.make(hp.env_name)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 142, in make
    return registry.make(id, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 86, in make
    spec = self.spec(path)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 109, in spec
    raise error.Error('A module ({}) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`'.format(mod_name))
gym.error.Error: A module (gym-robotable) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`

Registration… hmm ARS.py doesn’t complain. We had this problem before.

pip3 install -e .
python3 setup.py install

nope… https://stackoverflow.com/questions/14295680/unable-to-import-a-module-that-is-definitely-installed it’s presumably here somewhere…

root@chrx:/opt/gym-robotable# pip show gym-robotable
Name: gym-robotable
Version: 0.0.1
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /opt/gym-robotable
Requires: gym
Required-by: 

https://github.com/openai/gym/issues/1818 says:

“You need to either import <name of your package> or do gym.make("<name of your package>:tic_tac_toe-v1"), see the creating environment guide for more information: https://github.com/openai/gym/blob/master/docs/creating-environments.md”

Is it some fuckin gym-robotable vs gym_robotable thing?

Yes. Yes it is.


self.env_name = 'gym_robotable:RobotableEnv-v0'
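
The hyphen/underscore mix-up can be caught before calling gym.make(). As the traceback above shows, gym hands whatever sits before the colon to importlib.import_module(), so that part has to be an importable module name (gym_robotable), not the pip distribution name (gym-robotable). A minimal sketch of such a check; check_env_id is a hypothetical helper of mine, not part of gym:

```python
def check_env_id(env_id):
    """Split '<module>:<EnvName>' and sanity-check the module part.

    Hypothetical helper (not part of gym). Returns (module_name, env_name);
    module_name is None when the id has no '<module>:' prefix.
    """
    module_name, sep, env_name = env_id.partition(":")
    if not sep:
        # No module prefix: the environment must already be registered.
        return None, env_id
    # Every dotted component must be a valid Python identifier to be
    # importable, which rules out hyphens as in 'gym-robotable'.
    if not all(part.isidentifier() for part in module_name.split(".")):
        suggestion = module_name.replace("-", "_")
        raise ValueError(
            "{!r} is not an importable module name; did you mean {!r}?".format(
                module_name, suggestion
            )
        )
    return module_name, env_name


if __name__ == "__main__":
    print(check_env_id("gym_robotable:RobotableEnv-v0"))
```

Calling it on 'gym-robotable:RobotableEnv-v0' raises a ValueError that suggests gym_robotable, which is quicker to read than the double traceback.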

OK, so now it’s almost working. But it falls down sometimes, and then the algorithm stops. Ah, I needed to define ‘is_fallen’ correctly…

  def is_fallen(self):
    orientation = self.robotable.GetBaseOrientation()
    # getMatrixFromQuaternion returns the 3x3 rotation matrix as 9 flat values;
    # the last three are its third row.
    rot_mat = self._pybullet_client.getMatrixFromQuaternion(orientation)
    local_up = rot_mat[6:]
    pos = self.robotable.GetBasePosition()
    # Earlier tilt-or-height check, disabled for now:
    # return (np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)) < 0.85 or pos[2] < -0.25)
    #print("POS", pos)
    #print("DOT", np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)))

    # Changing the fallen definition for now, to the height of the table.
    return pos[2] < -0.28

  def _termination(self):
    position = self.robotable.GetBasePosition()
    distance = math.sqrt(position[0]**2 + position[1]**2)
    return self.is_fallen() or distance > self._distance_limit
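
To play with the thresholds without starting the simulator, here is a standalone sketch of the same check. It assumes a unit quaternion in pybullet's (x, y, z, w) order and recombines the tilt test from the commented-out line with the current height test, so it is not exactly what the environment does right now. The function names and default arguments are mine; the 0.85 and -0.28 values are the ones from the code above.

```python
def up_alignment(quat_xyzw):
    """Dot product of the world up axis [0, 0, 1] with the third row of the
    rotation matrix for a unit quaternion given as (x, y, z, w)."""
    x, y, z, w = quat_xyzw
    # Third row of the rotation matrix (what rot_mat[6:] picks out).
    local_up = (2 * (x * z - w * y), 2 * (y * z + w * x), 1 - 2 * (x * x + y * y))
    # Dotting with [0, 0, 1] keeps only the last component.
    return local_up[2]


def is_fallen(quat_xyzw, base_height, min_alignment=0.85, min_height=-0.28):
    """Fallen if the base is tilted too far OR has dropped too low."""
    return up_alignment(quat_xyzw) < min_alignment or base_height < min_height


if __name__ == "__main__":
    print(is_fallen((0.0, 0.0, 0.0, 1.0), 0.0))   # upright, at nominal height
    print(is_fallen((1.0, 0.0, 0.0, 0.0), 0.0))   # rolled 180 degrees about x
```

An alignment of 1 means perfectly upright, 0 means lying on its side, and -1 means upside down, so 0.85 allows roughly 30 degrees of tilt.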

OK, so now the main block of ppo2.py:


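import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

# Hp (the hyperparameter class holding env_name) is defined elsewhere.

if __name__ == "__main__":

  hp = Hp()
  env = gym.make(hp.env_name)

  # Train a PPO2 agent with the default MLP policy.
  model = PPO2(MlpPolicy, env, verbose=1)
  model.learn(total_timesteps=10000)

  # Watch the trained policy for up to 100 episodes of at most 1000 steps.
  for episode in range(100):
    obs = env.reset()
    for i in range(1000):
      action, _states = model.predict(obs)
      obs, reward, done, info = env.step(action)
      if done:
        print("Episode finished after {} timesteps".format(i + 1))
        break
      env.render(mode="human")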
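
Since the point is to have numbers to compare, the same loop can return averages instead of just printing. A rough sketch, assuming the old gym API used above (reset() returns an observation, step() returns a 4-tuple); evaluate and policy_fn are hypothetical names of mine:

```python
def evaluate(policy_fn, env, episodes=10, max_steps=1000):
    """Run `episodes` rollouts and return (mean_return, mean_length).

    policy_fn maps an observation to an action, e.g.
    lambda obs: model.predict(obs)[0] for a stable-baselines model.
    """
    returns, lengths = [], []
    for _ in range(episodes):
        obs = env.reset()
        total, steps = 0.0, 0
        for _ in range(max_steps):
            obs, reward, done, _info = env.step(policy_fn(obs))
            total += reward
            steps += 1
            if done:
                break
        returns.append(total)
        lengths.append(steps)
    return sum(returns) / len(returns), sum(lengths) / len(lengths)
```

The same function works for any other agent that can be wrapped as an observation-to-action callable, which keeps the comparison on equal footing.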

Now to continue training… https://github.com/hill-a/stable-baselines/issues/599
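
I haven't copied anything from that issue here; this is just a sketch of one way to continue training, assuming the stable-baselines 2.x methods save(), load(path, env=...) and learn(..., reset_num_timesteps=False). The helper name continue_training and the checkpoint name are hypothetical.

```python
def continue_training(algo_cls, checkpoint, env, extra_timesteps):
    """Load a saved model, train it some more, and save it back.

    algo_cls is the algorithm class, e.g. PPO2 from stable_baselines
    (passed in so this helper has no import of its own).
    """
    # Re-attach the environment, since it is not stored in the checkpoint.
    model = algo_cls.load(checkpoint, env=env)
    # reset_num_timesteps=False keeps the step counter running on
    # instead of starting again from zero.
    model.learn(total_timesteps=extra_timesteps, reset_num_timesteps=False)
    model.save(checkpoint)
    return model


# Assumed usage in ppo2.py, after a first model.save("ppo2_robotable"):
#   from stable_baselines import PPO2
#   model = continue_training(PPO2, "ppo2_robotable", env, 10000)
```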