An oldie but a goodie. I was thinking of just implementing the locomotion using good old genetic programming.
You could probably generate a walking robot using a genetic algorithm that repeated the tree’s actions a number of times. Bet it would be faster than the whole neural network policy training replay buffer spiel.
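Something like this minimal sketch, assuming a gym-style env, where each genome is a short action sequence that gets looped; the population size, mutation scale, and sequence length are arbitrary guesses:

import numpy as np

def evaluate(env, genome, repeats=20):
    # fitness = total reward from looping the genome's action sequence
    obs = env.reset()
    total_reward = 0.0
    for _ in range(repeats):
        for action in genome:
            obs, reward, done, _ = env.step(action)
            total_reward += reward
            if done:
                return total_reward
    return total_reward

def evolve(env, action_dim, pop_size=32, seq_len=10, generations=50):
    pop = [np.random.uniform(-1, 1, (seq_len, action_dim)) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [evaluate(env, g) for g in pop]
        # keep the top quarter, refill with mutated copies of them
        elite = [pop[i] for i in np.argsort(scores)[-pop_size // 4:]]
        pop = [e + np.random.normal(0, 0.1, e.shape) for e in elite for _ in range(4)]
    return max(pop, key=lambda g: evaluate(env, g))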
I got the table walking with ARS, but saving just the perceptron weights didn't seem to be enough to reload training progress in pybullet.
So I switched to PPO, which is a bit more complicated. Stable Baselines' PPO1 and PPO2 both converged almost immediately, to a degenerate policy: the table opting to fall over every time.
So I started experimenting with the reward function weights, changing them from weighting X-axis movement by 1 and Z-axis movement by 0.5 to the opposite. Standing up is more important now. I also penalised falling over by a constant value. It's not looking particularly smart after 11 rounds, but at least it's not falling over forward anymore. Yet.
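Roughly what I mean, as a sketch; the accessor names here (get_base_position, is_fallen) are my guesses at the env's internals, not the actual code:

def _reward(self):
    # X axis displacement since the last step, and current Z height
    current_pos = self.robotable.get_base_position()   # hypothetical accessor
    forward = current_pos[0] - self._last_base_position[0]
    height = current_pos[2]
    self._last_base_position = current_pos
    # Z (staying tall) now weighted above X (moving forward)
    reward = 0.5 * forward + 1.0 * height
    # constant penalty for falling over
    if self.is_fallen():                               # hypothetical check
        reward -= 10.0
    return reward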
Basically, more exploration than before: allowing more variation in policy updates, and bumping up the entropy bonus (increasing some sort of entropy can't hurt, right?). I'm also giving it more time to evaluate per batch, since maybe falling over was as good as you could hope for within a smaller batch.
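In stable-baselines PPO2 terms, the knobs I mean are roughly these (values illustrative, not tuned):

model = PPO2(MlpPolicy, env, verbose=1,
             ent_coef=0.01,    # entropy bonus: reward the policy for staying random
             cliprange=0.3,    # allow larger policy changes per update
             n_steps=2048)     # collect more experience per batch before updating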
I was thinking that once I get the RL or EA working on the robot for locomotion, I can just put things in its way, to train it to climb over obstacles.
It seems that swapping the 2D plane for noise-generated terrain is a common first step towards training a more resilient robot in simulation.
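A minimal sketch of what that looks like in pybullet, using its heightfield collision shape with plain uniform noise (Perlin or similar would give smoother terrain):

import random
import pybullet as p

p.connect(p.DIRECT)
rows, cols = 256, 256
height_data = [random.uniform(0, 0.05) for _ in range(rows * cols)]
terrain_shape = p.createCollisionShape(
    shapeType=p.GEOM_HEIGHTFIELD,
    meshScale=[0.05, 0.05, 1.0],    # x/y spacing and height scale
    heightfieldData=height_data,
    numHeightfieldRows=rows,
    numHeightfieldColumns=cols)
terrain = p.createMultiBody(0, terrain_shape)
p.resetBasePositionAndOrientation(terrain, [0, 0, 0], [0, 0, 0, 1])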
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 106, in spec
    importlib.import_module(mod_name)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'gym-robotable'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ppo2.py", line 29, in <module>
    env = gym.make(hp.env_name)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 142, in make
    return registry.make(id, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 86, in make
    spec = self.spec(path)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 109, in spec
    raise error.Error('A module ({}) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`'.format(mod_name))
gym.error.Error: A module (gym-robotable) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`
Registration… hmm, ARS.py doesn't complain. We had this problem before. My guess: the registry spec points at a module called 'gym-robotable', but Python module names can't contain hyphens, so the import fails; it has to go through the underscored package name, gym_robotable.
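A sketch of what gym_robotable/__init__.py should contain, assuming that guess is right (the env id here is assumed, not taken from my actual code):

from gym.envs.registration import register

register(
    id='RobotableEnv-v0',                            # assumed id
    entry_point='gym_robotable.envs:RobotableEnv',   # underscores, not 'gym-robotable.envs'
)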
import gym
import gym_robotable  # importing this registers the custom env

from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

from ars import Hp  # hyperparameter class (holds env_name) — assumed importable from the ARS script

if __name__ == "__main__":
    hp = Hp()
    env = gym.make(hp.env_name)
    model = PPO2(MlpPolicy, env, verbose=1)
    model.learn(total_timesteps=10000)

    for episode in range(100):
        obs = env.reset()
        for i in range(1000):
            action, _states = model.predict(obs)
            obs, rewards, dones, info = env.step(action)
            # env.render()
            if dones:
                print("Episode finished after {} timesteps".format(i + 1))
                break
        env.render(mode="human")
This is the real ticket. Basically motion capture to speed up training. But when a robot can do this, we don't need human workers anymore. (Except to provide examples of the actions to perform, and to build the first robot-building machine, or robot-building-building machines, etc.)
Imitation is the ability to recognize and reproduce others’ actions – By extension, imitation learning is a means of learning and developing new skills from observing these skills performed by another agent. Imitation learning (IL) as applied to robots is a technique to reduce the complexity of search spaces for learning. When observing either good or bad examples, one can reduce the search for a possible solution, by either starting the search from the observed good solution (local optima), or conversely, by eliminating from the search space what is known as a bad solution. Imitation learning offers an implicit means of training a machine, such that explicit and tedious programming of a task by a human user can be minimized or eliminated. Imitation learning is thus a “natural” means of training a machine, meant to be accessible to lay people. – (https://link.springer.com/referenceworkentry/10.1007%2F978-1-4419-1428-6_758)
“We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.”
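The simplest instance of that idea is behavioral cloning: fit a policy directly to recorded (observation, action) pairs as a supervised regression problem. A sketch, assuming demonstration data already saved as numpy arrays (the file names are hypothetical):

import numpy as np
from sklearn.neural_network import MLPRegressor

demo_obs = np.load('demo_observations.npy')   # hypothetical mocap recordings
demo_actions = np.load('demo_actions.npy')

# supervised regression from observations to actions
policy = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500)
policy.fit(demo_obs, demo_actions)

def act(obs):
    return policy.predict(obs.reshape(1, -1))[0]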
def ConvertFromLegModel(self, actions):
  """Convert the actions that use leg model to the real motor actions.

  Args:
    actions: The theta, phi of the leg model.
  Returns:
    The eight desired motor angles that can be used in ApplyActions().
  """
  # COPY THE ACTIONS.
  motor_angle = copy.deepcopy(actions)

  # DEFINE SOME THINGS
  scale_for_singularity = 1
  offset_for_singularity = 1.5
  half_num_motors = int(self.num_motors / 2)
  quarter_pi = math.pi / 4

  # FOR EVERY MOTOR
  for i in range(self.num_motors):
    # THE ACTION INDEX IS THE FLOOR OF HALF: 0,0,1,1,2,2,3,3.
    action_idx = int(i // 2)

    # THE FORWARD/BACKWARD COMPONENT IS: negative scale times 45 degrees
    # times (the action at (index + half the motors) plus the offset).
    forward_backward_component = (
        -scale_for_singularity * quarter_pi *
        (actions[action_idx + half_num_motors] + offset_for_singularity))

    # THE EXTENSION COMPONENT IS + OR - 45 DEGREES TIMES THE ACTION.
    extension_component = (-1)**i * quarter_pi * actions[action_idx]

    # IF 4, 5, 6, 7: MAKE THAT THING NEGATIVE.
    if i >= half_num_motors:
      extension_component = -extension_component

    # THE ANGLE IS PI + THINGY 1 + THINGY 2.
    motor_angle[i] = (math.pi + forward_backward_component + extension_component)

  return motor_angle
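To sanity-check that reading, here's a throwaway standalone re-implementation (not the class method): with eight motors and all-zero actions, every angle should come out at pi - 1.5 * pi/4, about 1.96 rad.

import math

num_motors, half = 8, 4
actions = [0.0] * num_motors
quarter_pi = math.pi / 4
for i in range(num_motors):
    idx = i // 2
    fb = -quarter_pi * (actions[idx + half] + 1.5)
    ext = (-1) ** i * quarter_pi * actions[idx]
    if i >= half:
        ext = -ext
    print(round(math.pi + fb + ext, 3))   # prints 1.963 eight times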
Ok my error is,
File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 350, in step
action = self._transform_action_to_motor_command(action)
File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 313, in _transform_action_to_motor_command
action = self.robotable.ConvertFromLegModel(action)
AttributeError: 'Robotable' object has no attribute 'ConvertFromLegModel'
Ok anyway, I debugged for an hour and now it's doing something. It's saving numpy files now.
AttributeError: 'Robotable' object has no attribute 'Reset'
Ok so I've got robot.urdf looking like a table, and loading. But there are various differences between the minitaur gym env and my own minitaur, the robotable. Gotta compare, etc. Ok, it wants reset(), not Reset(…).
Just a warning:

No inertial data for link, using mass=1, localinertiadiagonal = 1,1,1, identity local inertial frame
b3Warning[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,126]:
torso
b3Warning[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,126]:
I'm happy with the mass=1 and identity-inertia defaults for now. (Giving each link a proper <inertial> element in the URDF would make the warning go away.)
Ok, so apparently I'm just working my way through the bugs one by one. That's probably as good a method as any:
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 493, in _get_observation
    observation.extend(self.robotable.GetMotorAngles().tolist())
AttributeError: 'Robotable' object has no attribute 'GetMotorAngles'
K did a :%s/Get/get/g in vi. NEXT.
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 301, in reset
    return self._get_observation()
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 493, in _get_observation
    observation.extend(self.robotable.getMotorAngles().tolist())
  File "/opt/gym-robotable/gym_robotable/envs/robotable.py", line 88, in getMotorAngles
    jointState = p.getJointState(self.quadruped, self.motorIdList[i])
IndexError: list index out of range
Ok: self.nMotors = 4. The env assumes the minitaur's eight motors, so indexing motorIdList past the table's four goes out of range.
File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 527, in _get_observation_upper_bound
num_motors = self.robotable.num_motors
AttributeError: 'Robotable' object has no attribute 'num_motors'
K, let's change nMotors to num_motors. NEXT:
  File "ars.py", line 77, in ExploreWorker
    state, reward, done, _ = env.step(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 350, in step
    action = self._transform_action_to_motor_command(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 313, in _transform_action_to_motor_command
    action = self.robotable.ConvertFromLegModel(action)
AttributeError: 'Robotable' object has no attribute 'ConvertFromLegModel'