I need something to compare results to, so let's set up stable-baselines as a baseline.
To install, following https://stable-baselines.readthedocs.io/en/master/guide/install.html:

pip install git+https://github.com/hill-a/stable-baselines

Successfully installed absl-py-0.9.0 astor-0.8.1 gast-0.2.2 google-pasta-0.2.0 grpcio-1.30.0 h5py-2.10.0 keras-applications-1.0.8 keras-preprocessing-1.1.2 opt-einsum-3.2.1 stable-baselines-2.10.1a1 tensorboard-1.15.0 tensorflow-1.15.3 tensorflow-estimator-1.15.1 termcolor-1.1.0 wrapt-1.12.1

I'd originally done

pip install stable-baselines[mpi]

but the GitHub install pulls in the dependencies too.
OK, so pybullet comes with an 'enjoy' program, which lives in:

~/.local/lib/python3.6/site-packages/pybullet_envs/stable_baselines

You can run it using:

python3 -m pybullet_envs.stable_baselines.enjoy --algo td3 --env HalfCheetahBulletEnv-v0
OK, I set up PPO2 and tried to run python3 ppo2.py:
Traceback (most recent call last):
File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 106, in spec
importlib.import_module(mod_name)
File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 994, in _gcd_import
File "<frozen importlib._bootstrap>", line 971, in _find_and_load
File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'gym-robotable'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "ppo2.py", line 29, in <module>
env = gym.make(hp.env_name)
File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 142, in make
return registry.make(id, **kwargs)
File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 86, in make
spec = self.spec(path)
File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 109, in spec
raise error.Error('A module ({}) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`'.format(mod_name))
gym.error.Error: A module (gym-robotable) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`
Registration… hmm, ARS.py doesn't complain. We've had this problem before.
I tried

pip3 install -e .
python3 setup.py install

Nope… https://stackoverflow.com/questions/14295680/unable-to-import-a-module-that-is-definitely-installed suggests the package is presumably installed here somewhere…
root@chrx:/opt/gym-robotable# pip show gym-robotable
Name: gym-robotable
Version: 0.0.1
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /opt/gym-robotable
Requires: gym
Required-by:
https://github.com/openai/gym/issues/1818 says you need to either import <name of your package>, or do gym.make("<name of your package>:tic_tac_toe-v1"); see the creating-environments guide for more information: https://github.com/openai/gym/blob/master/docs/creating-environments.md
Is it some fuckin gym-robotable vs gym_robotable thing?
Yes. Yes it is.
self.env_name = 'gym_robotable:RobotableEnv-v0'
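Worth remembering what that string actually means to gym: everything before the colon is treated as an importable Python module name (so it must use underscores, matching the package directory), and everything after it is the registered env id. A minimal sketch of the split gym does internally:

```python
# gym splits the env name on ":". The prefix must be an importable module
# name (underscores); only the pip package name may use hyphens.
env_name = 'gym_robotable:RobotableEnv-v0'
mod_name, _, env_id = env_name.partition(':')
print(mod_name)  # gym_robotable   (passed to importlib.import_module)
print(env_id)    # RobotableEnv-v0 (looked up in gym's registry)
```

With the hyphenated name, importlib.import_module('gym-robotable') can never succeed, which is exactly the ModuleNotFoundError in the traceback above.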
OK, so now it's almost working, but the robot falls down sometimes and then the algorithm stops. Ah, I needed to define 'is_fallen' correctly…
def is_fallen(self):
    orientation = self.robotable.GetBaseOrientation()
    rot_mat = self._pybullet_client.getMatrixFromQuaternion(orientation)
    local_up = rot_mat[6:]
    pos = self.robotable.GetBasePosition()
    # return (np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)) < 0.85 or pos[2] < -0.25)
    # print("POS", pos)
    # print("DOT", np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)))
    return pos[2] < -0.28  # changing fallen definition for now, to height of table
    # return False

def _termination(self):
    position = self.robotable.GetBasePosition()
    distance = math.sqrt(position[0]**2 + position[1]**2)
    return self.is_fallen() or distance > self._distance_limit
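The height-only check is simple enough to factor out into pure functions and sanity-check without spinning up pybullet. A sketch, assuming the -0.28 table-height threshold from above and a made-up distance limit (the real one lives in self._distance_limit):

```python
import math

TABLE_HEIGHT = -0.28   # fallen threshold from the env above
DISTANCE_LIMIT = 5.0   # hypothetical; the real value is self._distance_limit

def is_fallen(pos, height_threshold=TABLE_HEIGHT):
    # Fallen when the base z-coordinate drops below the table height.
    return pos[2] < height_threshold

def should_terminate(pos, distance_limit=DISTANCE_LIMIT):
    # Terminate on a fall, or when the base wanders too far from the origin.
    distance = math.sqrt(pos[0] ** 2 + pos[1] ** 2)
    return is_fallen(pos) or distance > distance_limit

print(should_terminate((0.0, 0.0, -0.30)))  # True: below the table
print(should_terminate((6.0, 0.0, 0.10)))   # True: past the distance limit
print(should_terminate((1.0, 1.0, 0.10)))   # False: upright and in range
```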
OK, so now:
if __name__ == "__main__":
    hp = Hp()
    env = gym.make(hp.env_name)
    model = PPO2(MlpPolicy, env, verbose=1)
    model.learn(total_timesteps=10000)

    for episode in range(100):
        obs = env.reset()
        for i in range(1000):
            action, _states = model.predict(obs)
            obs, rewards, dones, info = env.step(action)
            # env.render()
            if dones:
                print("Episode finished after {} timesteps".format(i + 1))
                break
            env.render(mode="human")
Now to continue training… https://github.com/hill-a/stable-baselines/issues/599