Categories
dev evolution Locomotion

DEAP

Distributed Evolutionary Algorithms in Python

deap.readthedocs.org/

An oldie but a goodie. I was thinking of just implementing the locomotion using good old genetic programming.

You could probably generate a walking robot with genetic programming, evolving a tree of actions and repeating the tree's output a number of times. I bet it would be faster than the whole neural-network policy / training / replay-buffer spiel.

https://deap.readthedocs.io/en/master/tutorials/advanced/gp.html
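Something like this is what I have in mind – a minimal DEAP GP sketch. The primitive set and the simulate() stub are made-up placeholders, not a real pybullet rollout:

import operator
import random

from deap import algorithms, base, creator, gp, tools

# Hypothetical primitive set: each evolved tree maps one input (a phase variable)
# to a motor command built from add/mul/neg and random constants.
pset = gp.PrimitiveSet("MAIN", arity=1)
pset.addPrimitive(operator.add, 2)
pset.addPrimitive(operator.mul, 2)
pset.addPrimitive(operator.neg, 1)
pset.addEphemeralConstant("rand", lambda: random.uniform(-1, 1))

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", gp.PrimitiveTree, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("expr", gp.genHalfAndHalf, pset=pset, min_=1, max_=3)
toolbox.register("individual", tools.initIterate, creator.Individual, toolbox.expr)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)
toolbox.register("compile", gp.compile, pset=pset)

def simulate(controller):
    # Placeholder for a pybullet rollout: repeat the tree's output as actions
    # and return distance travelled. Here it's just a dummy number.
    return sum(abs(controller(t / 10.0)) for t in range(10))

def eval_gait(individual):
    controller = toolbox.compile(expr=individual)
    return (simulate(controller),)

toolbox.register("evaluate", eval_gait)
toolbox.register("select", tools.selTournament, tournsize=3)
toolbox.register("mate", gp.cxOnePoint)
toolbox.register("expr_mut", gp.genFull, min_=0, max_=2)
toolbox.register("mutate", gp.mutUniform, expr=toolbox.expr_mut, pset=pset)

pop = toolbox.population(n=50)
algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2, ngen=20, verbose=False)

The fitness function is the only part that really matters: swap simulate() for a rollout that repeats the compiled tree's actions in the sim and returns distance walked.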

Categories
AI/ML deep dev Locomotion

Ray

https://docs.ray.io/en/master/rllib-algorithms.html Seems like this might be the most up-to-date repo of baseline algorithm implementations.

Ray is a fast and simple framework for building and running distributed applications.

Ray is packaged with the following libraries for accelerating machine learning workloads:

  • Tune: Scalable Hyperparameter Tuning
  • RLlib: Scalable Reinforcement Learning
  • RaySGD: Distributed Training Wrappers

ARS implementation: https://github.com/ray-project/ray/blob/master/rllib/agents/ars/ars.py
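For reference, kicking off an RLlib run goes roughly like this through the tune API. This is a generic sketch against a stock gym env (Pendulum-v0 as a stand-in continuous-control task), with arbitrary config values, not my robot env:

import ray
from ray import tune

ray.init()

# "ARS" is one of RLlib's registered algorithms; any registered gym env id works.
tune.run(
    "ARS",
    stop={"timesteps_total": 1000000},
    config={
        "env": "Pendulum-v0",
        "num_workers": 2,
    },
)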

Categories
The Sentient Table

Hyperparameters and Rewards

I got the table walking with ARS, but saving just the perceptron weights didn't seem to be enough to reload progress.
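My guess is the missing piece was the observation normaliser: ARS keeps a running mean/std of observations, so saving only the weight matrix throws that state away. A sketch of saving both – the attribute names here are hypothetical, not necessarily what my ARS script calls them:

import numpy as np

def save_policy(path, policy, normalizer):
    # path should end in .npz
    np.savez(path, theta=policy.theta,
             mean=normalizer.mean, var=normalizer.var, n=normalizer.n)

def load_policy(path, policy, normalizer):
    data = np.load(path)
    policy.theta = data["theta"]
    normalizer.mean = data["mean"]
    normalizer.var = data["var"]
    normalizer.n = data["n"]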

So I switched to PPO, which is a bit more complicated. Stable baselines PPO1 and PPO2 converged too easily, with the table opting to fall over all the time.

So I started fiddling with the reward function weights, changing them from weighting X-axis movement by 1 and Z-axis movement by 0.5, to the opposite. So standing up is more important now. I also penalised falling over by a constant value. It's not looking particularly smart after 11 rounds, but at least it's not falling over forwards anymore. Yet.
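In pseudo-ish form, the new weighting looks something like this (the exact penalty constant in my env differs, this is just the shape of it):

def compute_reward(dx, dz, fallen):
    forward_reward = 0.5 * dx       # X-axis movement now weighted 0.5
    height_reward = 1.0 * dz        # Z-axis (standing up) now weighted 1.0
    fall_penalty = -10.0 if fallen else 0.0   # constant penalty for falling over
    return forward_reward + height_reward + fall_penalty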

I also changed some PPO hyperparams:

clip_param=0.4, entcoeff=0.2, timesteps_per_actorbatch=1024, 

Basically, more exploration than before: allowing more variation in policy changes (a bigger clip range), bumping up the entropy coefficient (some extra randomness can't hurt, right?), and giving it more timesteps to evaluate per batch, since maybe falling over was as good as it could hope for within a smaller batch.
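For the record, those values plug into stable-baselines PPO1 like so (MlpPolicy and the timestep count here are just illustrative):

import gym
from stable_baselines import PPO1
from stable_baselines.common.policies import MlpPolicy

env = gym.make('gym_robotable:RobotableEnv-v0')

model = PPO1(MlpPolicy, env,
             clip_param=0.4,                  # bigger clip range -> larger policy updates allowed
             entcoeff=0.2,                    # stronger entropy bonus -> more exploration
             timesteps_per_actorbatch=1024,   # longer rollouts per update
             verbose=1)
model.learn(total_timesteps=100000)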

This is a good summary of the state of the art in hyperparam tuning. I’ll probably need to do this soon. https://medium.com/criteo-labs/hyper-parameter-optimization-algorithms-2fe447525903

Combine PPO with NES to Improve Exploration https://arxiv.org/pdf/1905.09492.pdf

PBT https://arxiv.org/abs/1711.09846

https://deepmind.com/blog/article/population-based-training-neural-networks

Policy Optimization with Model-based Explorations https://arxiv.org/pdf/1811.07350.pdf

It seems like my fiddling with hyperparams caused the logged 'kl' value to go to NaN.

That 'kl' is the (approximate) Kullback-Leibler divergence between the old and new policy distributions – a measure of how far each update moves the policy, which PPO uses to keep updates in check. If it's NaN, something numerical has blown up.
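For intuition, one common way to estimate it is the mean difference of action log-probabilities under the old and new policies (not necessarily the exact estimator stable-baselines logs):

import numpy as np

def approx_kl(old_logprobs, new_logprobs):
    # mean of log pi_old(a|s) - log pi_new(a|s) over the batch
    return np.mean(old_logprobs - new_logprobs)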

So I need to tune hyperparams.

The Stable baselines allow you to change params https://stable-baselines.readthedocs.io/en/master/guide/examples.html#accessing-and-modifying-model-parameters

So kl becoming NaN could mean I'm returning a zero somewhere from the model and then dividing by it.

“In my case, adding 1e-8 to the divisor made the trick… ” – https://github.com/modestyachts/ARS/issues/1
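The same trick, sketched for the usual suspect – observation normalisation – where a zero standard deviation would otherwise divide by zero:

def normalize(obs, mean, std, eps=1e-8):
    # eps keeps the division finite when std is zero for some observation dimension
    return (obs - mean) / (std + eps)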

https://stable-baselines.readthedocs.io/en/master/guide/checking_nan.html

Categories
Locomotion The Sentient Table

RL Toolboxes

stable baselines – https://stable-baselines.readthedocs.io/en/master/

Ray / RLLib – https://docs.ray.io/en/master/

OpenAI Spinning Up – https://spinningup.openai.com/en/latest/spinningup/keypapers.html

RLKit – https://github.com/vitchyr/rlkit

Garage – https://github.com/rlworkgroup/garage

Categories
Locomotion simulation

Obstacle course

I was thinking that once I get the RL or EA working on the robot for locomotion, I can just put things in its way, to train it to climb over obstacles.

It seems that swapping the 2D plane for noise generated terrain is a common first step towards training a more resilient robot in simulation.
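Something like pybullet's heightfield example should do it. A rough sketch – the grid size, mesh scale and noise amplitude are guesses:

import random
import pybullet as p

p.connect(p.DIRECT)

rows, cols = 64, 64
# random noise heights; swap in Perlin noise or similar for smoother terrain
height_data = [0.05 * random.uniform(0, 1) for _ in range(rows * cols)]

terrain_shape = p.createCollisionShape(
    shapeType=p.GEOM_HEIGHTFIELD,
    meshScale=[0.1, 0.1, 1],
    heightfieldData=height_data,
    numHeightfieldRows=rows,
    numHeightfieldColumns=cols)
terrain = p.createMultiBody(0, terrain_shape)
p.resetBasePositionAndOrientation(terrain, [0, 0, 0], [0, 0, 0, 1])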

Categories
dev Locomotion The Sentient Table

Stable baselines

Need something to compare results to.

To install,

https://stable-baselines.readthedocs.io/en/master/guide/install.html

pip install git+https://github.com/hill-a/stable-baselines

Successfully installed absl-py-0.9.0 astor-0.8.1 gast-0.2.2 google-pasta-0.2.0 grpcio-1.30.0 h5py-2.10.0 keras-applications-1.0.8 keras-preprocessing-1.1.2 opt-einsum-3.2.1 stable-baselines-2.10.1a1 tensorboard-1.15.0 tensorflow-1.15.3 tensorflow-estimator-1.15.1 termcolor-1.1.0 wrapt-1.12.1

I'd originally done:

pip install stable-baselines[mpi]

but the GitHub install pulls in the dependencies too.

OK, so pybullet comes with an 'enjoy' program, which lives at:

~/.local/lib/python3.6/site-packages/pybullet_envs/stable_baselines

You can run it using:

python3 -m pybullet_envs.stable_baselines.enjoy --algo td3 --env HalfCheetahBulletEnv-v0

OK, I set up ppo2 and tried to run python3 ppo2.py:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 106, in spec
    importlib.import_module(mod_name)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 953, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'gym-robotable'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "ppo2.py", line 29, in <module>
    env = gym.make(hp.env_name)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 142, in make
    return registry.make(id, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 86, in make
    spec = self.spec(path)
  File "/usr/local/lib/python3.6/dist-packages/gym/envs/registration.py", line 109, in spec
    raise error.Error('A module ({}) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`'.format(mod_name))
gym.error.Error: A module (gym-robotable) was specified for the environment but was not found, make sure the package is installed with `pip install` before calling `gym.make()`

Registration… hmm ARS.py doesn’t complain. We had this problem before.

pip3 install -e .
python3 setup.py install

nope… https://stackoverflow.com/questions/14295680/unable-to-import-a-module-that-is-definitely-installed it’s presumably here somewhere…

root@chrx:/opt/gym-robotable# pip show gym-robotable
Name: gym-robotable
Version: 0.0.1
Summary: UNKNOWN
Home-page: UNKNOWN
Author: UNKNOWN
Author-email: UNKNOWN
License: UNKNOWN
Location: /opt/gym-robotable
Requires: gym
Required-by: 

https://github.com/openai/gym/issues/1818 says you need to either import <name of your package> or do gym.make("<name of your package>:tic_tac_toe-v1") – see the creating-environments guide for more information: https://github.com/openai/gym/blob/master/docs/creating-environments.md

Is it some fuckin gym-robotable vs gym_robotable thing?

Yes. Yes it is.


self.env_name = 'gym_robotable:RobotableEnv-v0'
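For reference, the matching registration lives in the package's __init__.py, roughly like this (the entry point path is assumed from my layout):

# gym_robotable/__init__.py
from gym.envs.registration import register

register(
    id='RobotableEnv-v0',
    entry_point='gym_robotable.envs:RobotableEnv',
)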

OK, so now it's almost working. But it falls down sometimes and then the algorithm stops. Ah, I needed to define 'is_fallen' correctly…

  def is_fallen(self):
    orientation = self.robotable.GetBaseOrientation()
    rot_mat = self._pybullet_client.getMatrixFromQuaternion(orientation)
    local_up = rot_mat[6:]
    pos = self.robotable.GetBasePosition()
    # return (np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)) < 0.85 or pos[2] < -0.25)
    #print("POS", pos)
    #print("DOT", np.dot(np.asarray([0, 0, 1]), np.asarray(local_up)))

    return (pos[2] < -0.28)  #changing fallen definition for now, to height of table
    #return False

  def _termination(self):
    position = self.robotable.GetBasePosition()
    distance = math.sqrt(position[0]**2 + position[1]**2)
    return self.is_fallen() or distance > self._distance_limit

OK, so now:


import gym
from stable_baselines import PPO2
from stable_baselines.common.policies import MlpPolicy

# Hp is my hyperparameters class, defined earlier in ppo2.py;
# hp.env_name is 'gym_robotable:RobotableEnv-v0'.

if __name__ == "__main__":

  hp = Hp()
  env = gym.make(hp.env_name)

  model = PPO2(MlpPolicy, env, verbose=1)
  model.learn(total_timesteps=10000)

  for episode in range(100):
      obs = env.reset()
      for i in range(1000):
        action, _states = model.predict(obs)
        obs, rewards, dones, info = env.step(action)
        #env.render()
        if dones:
            print("Episode finished after {} timesteps".format(i + 1))
            break
        env.render(mode="human")

Now to continue training… https://github.com/hill-a/stable-baselines/issues/599
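Continuing training in stable-baselines is basically save, reload with the env attached, and learn() again. A sketch, following on from the script above (the file name is arbitrary):

model.save("ppo2_robotable")

# later, to pick up where it left off:
model = PPO2.load("ppo2_robotable", env=env)
model.learn(total_timesteps=10000, reset_num_timesteps=False)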

Categories
AI/ML Locomotion robots sim2real simulation

Imitation Learning

This is the real ticket. Basically motion capture to speed up training. But when a robot can do this, we don't need human workers anymore. (Except to provide examples of the actions to perform, and to build the first robot-building machine, or robot-building-building machines, etc.)

videos: https://sites.google.com/view/nips2017-one-shot-imitation/home

arxiv: https://arxiv.org/pdf/1703.07326.pdf

abstract: https://arxiv.org/abs/1703.07326

Learning Agile Robotic Locomotion Skills by
Imitating Animals: https://xbpeng.github.io/projects/Robotic_Imitation/2020_Robotic_Imitation.pdf

Imitation is the ability to recognize and reproduce others’ actions – By extension, imitation learning is a means of learning and developing new skills from observing these skills performed by another agent. Imitation learning (IL) as applied to robots is a technique to reduce the complexity of search spaces for learning. When observing either good or bad examples, one can reduce the search for a possible solution, by either starting the search from the observed good solution (local optima), or conversely, by eliminating from the search space what is known as a bad solution. Imitation learning offers an implicit means of training a machine, such that explicit and tedious programming of a task by a human user can be minimized or eliminated. Imitation learning is thus a “natural” means of training a machine, meant to be accessible to lay people. – (https://link.springer.com/referenceworkentry/10.1007%2F978-1-4419-1428-6_758)

OpenAI’s https://openai.com/blog/robots-that-learn/

“We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.”

Categories
dev Locomotion The Sentient Table

ConvertFromLegModel

This is a confusing bit of code


  def ConvertFromLegModel(self, actions):
    """Convert the actions that use leg model to the real motor actions.
    Args:
      actions: The theta, phi of the leg model.
    Returns:
      The eight desired motor angles that can be used in ApplyActions().
    """
  

 COPY THE ACTIONS.

    motor_angle = copy.deepcopy(actions)

DEFINE SOME THINGS

    scale_for_singularity = 1
    offset_for_singularity = 1.5
    half_num_motors = int(self.num_motors / 2)
    quarter_pi = math.pi / 4

FOR EVERY MOTOR

    for i in range(self.num_motors):

THE ACTION INDEX IS THE FLOOR OF HALF. 00112233
      action_idx = int(i // 2)

WELL, SO, THE FORWARD BACKWARD COMPONENT is 
negative thingy times 45 degrees times (the action of the index plus half the motors.... plus the offset thingy)

      forward_backward_component = (
          -scale_for_singularity * quarter_pi *
          (actions[action_idx + half_num_motors] + offset_for_singularity))

AND SO THE EXTENSION COMPONENT IS either + or - 45 degrees times the action.

      extension_component = (-1)**i * quarter_pi * actions[action_idx]

IF 4,5,6,7 MAKE THAT THING NEGATIVE.

      if i >= half_num_motors:
        extension_component = -extension_component

THE ANGLE IS... PI + thingy 1 + thingy 2.

      motor_angle[i] = (math.pi + forward_backward_component + extension_component)



    return motor_angle

Ok my error is,

  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 350, in step
    action = self._transform_action_to_motor_command(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 313, in _transform_action_to_motor_command
    action = self.robotable.ConvertFromLegModel(action)
AttributeError: 'Robotable' object has no attribute 'ConvertFromLegModel'

OK anyway, I debugged for an hour and now it's doing something. It's saving numpy files now.

policy_RobotableEnv-v0_20200516-192435.npy contains:

{'descr': '<f8', 'fortran_order': False, 'shape': (4, 16)} – the .npy header, i.e. a 4×16 array of float64.

Cool.

But yeah, I had to comment out a lot of stuff. Seems like the actions it's generating are mostly 0.

Since I simplified to a table, turns out I don’t need any of that ConvertFromLegModel code.
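So for the table, the action-to-motor transform can basically be a pass-through, give or take clamping to a sane range. Something like this hypothetical replacement for the method:

import math
import numpy as np

def _transform_action_to_motor_command(self, action):
    # no leg-model remapping for the table: just clamp to the motor range
    return np.clip(action, -math.pi, math.pi)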


OK anyway, I started over with minitaur. lol. Why are there two tables? Changing the motorDirections gave me this. Good progress.

Categories
Locomotion robots

Soft Tensegrity Robots

Soft Tensegrity Robots, jiggling around:

https://www.youtube.com/watch?v=SuLQDhrk9tQ

“Neat” – Youtube comment

Categories
dev The Sentient Table

Gym Env

Starting from pybullet/gym/pybullet_envs/bullet/minitaur_gym_env.py

and going by these instructions https://github.com/openai/gym/blob/master/docs/creating-environments.md

First had to remember the protocol buffer stuff https://developers.google.com/protocol-buffers/docs/reference/python-generated

protoc --proto_path=src --python_out=build/gen src/foo.proto src/bar/baz.proto

https://stackoverflow.com/questions/45068568/how-to-create-a-new-gym-environment-in-openai
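For reference, a minimal setup.py that makes the editable install below work – basically what the gym custom-env guide suggests:

from setuptools import setup

setup(name='gym_robotable',
      version='0.0.1',
      install_requires=['gym'])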

/opt/gym-robotable# pip install -e .
Obtaining file:///opt/gym-robotable
Requirement already satisfied: gym>=0.2.3 in /usr/local/lib/python3.6/dist-packages (from gym-robotable==0.0.1) (0.17.1)
Requirement already satisfied: cloudpickle<1.4.0,>=1.2.0 in /usr/local/lib/python3.6/dist-packages (from gym>=0.2.3->gym-robotable==0.0.1) (1.3.0)
Requirement already satisfied: numpy>=1.10.4 in /usr/local/lib/python3.6/dist-packages (from gym>=0.2.3->gym-robotable==0.0.1) (1.18.1)
Requirement already satisfied: six in /usr/lib/python3/dist-packages (from gym>=0.2.3->gym-robotable==0.0.1) (1.11.0)
Requirement already satisfied: scipy in /usr/local/lib/python3.6/dist-packages (from gym>=0.2.3->gym-robotable==0.0.1) (1.4.1)
Requirement already satisfied: pyglet<=1.5.0,>=1.4.0 in /usr/local/lib/python3.6/dist-packages (from gym>=0.2.3->gym-robotable==0.0.1) (1.5.0)
Requirement already satisfied: future in /usr/local/lib/python3.6/dist-packages (from pyglet<=1.5.0,>=1.4.0->gym>=0.2.3->gym-robotable==0.0.1) (0.18.2)
Installing collected packages: gym-robotable
Running setup.py develop for gym-robotable
Successfully installed gym-robotable

When I try run the gym env, I get:

raise error.UnregisteredEnv('No registered env with id: {}'.format(id))

Ok it was some typo in the __init__.py

Fixed that up in the imports and __inits__ and now got this issue: https://answers.ros.org/question/326226/importerror-dynamic-module-does-not-define-module-export-function-pyinit__tf2/

OK, I think that was some file that was importing TensorFlow 2. I don't want to use TensorFlow – it's like 400MB.

URDF file '/root/.local/lib/python3.6/site-packages/pybullet_data/robot.urdf' not found

so ok, let’s put it there.

/opt/gym-robotable/gym_robotable/envs# cp robot.urdf /root/.local/lib/python3.6/site-packages/pybullet_data/

AttributeError: 'Robotable' object has no attribute 'Reset'

OK, so I've got robot.urdf looking like a table, and loading. But there are various differences between the minitaur gym env and my own minitaur, the robotable. Gotta compare, etc. OK, it wants reset() not Reset(…)

just a warning:

No inertial data for link, using mass=1, localinertiadiagonal = 1,1,1, identity local inertial frame
b3Warning[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,126]:
torso
b3Warning[examples/Importers/ImportURDFDemo/BulletUrdfImporter.cpp,126]:

I'm happy with the default of 1 for the mass and inertia settings, for now.

OK, so apparently I'm just working my way through the bugs one by one. That's probably as good a method as any:

  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 493, in _get_observation
    observation.extend(self.robotable.GetMotorAngles().tolist())
AttributeError: 'Robotable' object has no attribute 'GetMotorAngles'

K did a :%s/Get/get/g in vi. NEXT.

  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 301, in reset
    return self._get_observation()
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 493, in _get_observation
    observation.extend(self.robotable.getMotorAngles().tolist())
  File "/opt/gym-robotable/gym_robotable/envs/robotable.py", line 88, in getMotorAngles
    jointState = p.getJointState(self.quadruped, self.motorIdList[i])
IndexError: list index out of range

ok self.nMotors = 4

  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 527, in _get_observation_upper_bound
    num_motors = self.robotable.num_motors
AttributeError: 'Robotable' object has no attribute 'num_motors'

OK, let's change nMotors to num_motors. NEXT

  File "ars.py", line 77, in ExploreWorker
    state, reward, done, _ = env.step(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 350, in step
    action = self._transform_action_to_motor_command(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 313, in _transform_action_to_motor_command
    action = self.robotable.ConvertFromLegModel(action)
AttributeError: 'Robotable' object has no attribute 'ConvertFromLegModel'