Categories
AI/ML meta

Meta-learning & MAML

For having fall-back plans when things go wrong, or for phasing between policies, so you don’t “drop the ball”.

https://arxiv.org/abs/1703.03400

Reminds me of MAP-Elites, in that it collects behaviours.

“We propose an algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning. The goal of meta-learning is to train a model on a variety of learning tasks, such that it can solve new learning tasks using only a small number of training samples. In our approach, the parameters of the model are explicitly trained such that a small number of gradient steps with a small amount of training data from a new task will produce good generalization performance on that task. In effect, our method trains the model to be easy to fine-tune. We demonstrate that this approach leads to state-of-the-art performance on two few-shot image classification benchmarks, produces good results on few-shot regression, and accelerates fine-tuning for policy gradient reinforcement learning with neural network policies.”
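The core trick, as I understand it: take a gradient step on a task, then nudge the meta-parameters based on how well the adapted parameters generalise. A toy first-order sketch (FOMAML-style, on made-up 1-D regression tasks y = a*x with a single scalar parameter; the full method also backpropagates through the inner update):

import numpy as np

def mse_grad(theta, x, y):
    # d/dtheta of mean((theta*x - y)^2), for the model f(x) = theta * x
    return np.mean(2 * (theta * x - y) * x)

theta = 0.0                      # meta-parameter
inner_lr, outer_lr = 0.05, 0.01  # made-up learning rates

for step in range(1000):
    meta_grad = 0.0
    task_slopes = np.random.uniform(-2, 2, size=5)  # sample a batch of tasks
    for a in task_slopes:
        # inner loop: one gradient step on a small support set from this task
        x_s = np.random.uniform(-1, 1, 10)
        theta_adapted = theta - inner_lr * mse_grad(theta, x_s, a * x_s)
        # outer objective: how well the adapted parameter does on fresh data
        x_q = np.random.uniform(-1, 1, 10)
        meta_grad += mse_grad(theta_adapted, x_q, a * x_q)
    # first-order approximation: use the adapted gradient as the meta-gradient
    theta -= outer_lr * meta_grad / len(task_slopes)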

Their approach is mostly based around the ES algorithm; they got a robot to walk straight again soon after hobbling it. https://ai.googleblog.com/2020/04/exploring-evolutionary-meta-learning-in.html

https://arxiv.org/pdf/2003.01239.pdf

“we present an evolutionary meta-learning algorithm that enables locomotion policies to quickly adapt in noisy real world scenarios. The core idea is to develop an efficient and noise-tolerant adaptation operator, and integrate it into meta-learning frameworks. We have shown that this Batch Hill-Climbing operator works better in handling noise than simply averaging rewards over multiple runs. Our algorithm has achieved greater adaptation performance than the state-of-the-art MAML algorithms that are based on policy gradient. Finally, we validate our method on a real quadruped robot. Trained in simulation, the locomotion policies can successfully adapt to two real-world robot environments, whose dynamics have been drastically changed.

In the future, we plan to extend our method in several ways. First, we believe that we can replace the Gaussian perturbations in the evolutionary algorithm with non-isotropic samples to further improve the sample efficiency during adaptation. With less robot data required for adaptation, we plan to develop a lifelong learning system, in which the robot can continuously collect data and quickly adjust its policy to learn new skills and to operate optimally in new environments.”
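My rough reading of that Batch Hill-Climbing operator (a guess at the structure, not their code): instead of averaging noisy returns for a single candidate, evaluate a whole batch of Gaussian perturbations and climb to the best one. Here `evaluate` is a stand-in for one noisy rollout of a policy parameter vector:

import numpy as np

def batch_hill_climb(params, evaluate, sigma=0.1, batch_size=8, iterations=10):
    best, best_reward = params, evaluate(params)
    for _ in range(iterations):
        # a batch of Gaussian perturbations around the current best
        candidates = [best + sigma * np.random.randn(*best.shape)
                      for _ in range(batch_size)]
        rewards = [evaluate(c) for c in candidates]
        i = int(np.argmax(rewards))
        # climb only if a perturbation actually beat the incumbent
        if rewards[i] > best_reward:
            best, best_reward = candidates[i], rewards[i]
    return best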

Categories
AI/ML Locomotion robots sim2real simulation

Imitation Learning

This is the real ticket. Basically motion capture to speed up training. But when a robot can do this, we don’t need human workers anymore. (Except to provide examples of the actions to perform, and to build the first robot-building machine, or robot-building-building machines, etc.)

videos: https://sites.google.com/view/nips2017-one-shot-imitation/home

arxiv: https://arxiv.org/pdf/1703.07326.pdf

abstract: https://arxiv.org/abs/1703.07326

Learning Agile Robotic Locomotion Skills by Imitating Animals: https://xbpeng.github.io/projects/Robotic_Imitation/2020_Robotic_Imitation.pdf

Imitation is the ability to recognize and reproduce others’ actions – By extension, imitation learning is a means of learning and developing new skills from observing these skills performed by another agent. Imitation learning (IL) as applied to robots is a technique to reduce the complexity of search spaces for learning. When observing either good or bad examples, one can reduce the search for a possible solution, by either starting the search from the observed good solution (local optima), or conversely, by eliminating from the search space what is known as a bad solution. Imitation learning offers an implicit means of training a machine, such that explicit and tedious programming of a task by a human user can be minimized or eliminated. Imitation learning is thus a “natural” means of training a machine, meant to be accessible to lay people. – (https://link.springer.com/referenceworkentry/10.1007%2F978-1-4419-1428-6_758)

OpenAI’s https://openai.com/blog/robots-that-learn/

“We’ve created a robotics system, trained entirely in simulation and deployed on a physical robot, which can learn a new task after seeing it done once.”

Categories
AI/ML arxiv GANs

GANs in Keras

Came across this guy’s project:

https://github.com/germain-hug/GANs-Keras

It mentions some papers on GANs; interesting as an overview of related algorithms.

https://arxiv.org/abs/1511.06434 – Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks

https://arxiv.org/abs/1701.07875 – Wasserstein GAN

https://arxiv.org/abs/1411.1784 – Conditional Generative Adversarial Nets

https://arxiv.org/abs/1606.03657 – InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets
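For reference, the basic pattern all of these build on looks roughly like this in Keras (my own minimal sketch, not code from the repo; layer sizes are arbitrary):

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

latent_dim = 100

# generator: noise vector -> 28x28 image
generator = keras.Sequential([
    layers.Dense(128, activation="relu", input_dim=latent_dim),
    layers.Dense(28 * 28, activation="tanh"),
    layers.Reshape((28, 28)),
])

# discriminator: image -> probability of being real
discriminator = keras.Sequential([
    layers.Flatten(input_shape=(28, 28)),
    layers.Dense(128, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
discriminator.compile(optimizer="adam", loss="binary_crossentropy")

# stacked model: freeze the discriminator while training the generator
discriminator.trainable = False
gan = keras.Sequential([generator, discriminator])
gan.compile(optimizer="adam", loss="binary_crossentropy")

def train_step(real_images, batch_size=32):
    noise = np.random.normal(size=(batch_size, latent_dim))
    fake_images = generator.predict(noise, verbose=0)
    # discriminator learns real -> 1, fake -> 0
    discriminator.train_on_batch(real_images, np.ones((batch_size, 1)))
    discriminator.train_on_batch(fake_images, np.zeros((batch_size, 1)))
    # generator learns to make the discriminator say 1
    gan.train_on_batch(noise, np.ones((batch_size, 1)))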

Categories
dev Locomotion The Sentient Table

ConvertFromLegModel

This is a confusing bit of code:


  def ConvertFromLegModel(self, actions):
    """Convert the actions that use leg model to the real motor actions.
    Args:
      actions: The theta, phi of the leg model.
    Returns:
      The eight desired motor angles that can be used in ApplyActions().
    """
  

 COPY THE ACTIONS.

    motor_angle = copy.deepcopy(actions)

DEFINE SOME THINGS

    scale_for_singularity = 1
    offset_for_singularity = 1.5
    half_num_motors = int(self.num_motors / 2)
    quarter_pi = math.pi / 4

FOR EVERY MOTOR

    for i in range(self.num_motors):

THE ACTION INDEX IS THE FLOOR OF HALF. 00112233
      action_idx = int(i // 2)

WELL, SO, THE FORWARD BACKWARD COMPONENT is negative thingy times 45 degrees times (the action of the index plus half the motors…. plus the offset thingy)

      forward_backward_component = (
          -scale_for_singularity * quarter_pi *
          (actions[action_idx + half_num_motors] + offset_for_singularity))

AND SO THE EXTENSION COMPONENT IS either + or - 45 degrees times the action.

      extension_component = (-1)**i * quarter_pi * actions[action_idx]

IF 4,5,6,7 MAKE THAT THING NEGATIVE.

      if i >= half_num_motors:
        extension_component = -extension_component

THE ANGLE IS... PI + thingy 1 + thingy 2.

      motor_angle[i] = (math.pi + forward_backward_component + extension_component)



    return motor_angle
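So, pieced together (assuming num_motors = 8, minitaur-style, where the first half of the action vector drives the extension component and the second half the forward/backward one), the whole thing is roughly:

import copy
import math

def convert_from_leg_model(actions, num_motors=8):
    motor_angle = copy.deepcopy(actions)
    scale_for_singularity = 1
    offset_for_singularity = 1.5
    half_num_motors = num_motors // 2
    quarter_pi = math.pi / 4
    for i in range(num_motors):
        action_idx = i // 2  # pairs of motors share one leg action: 00112233
        # forward/backward: -1 * 45 degrees * (swing action + offset)
        forward_backward_component = (
            -scale_for_singularity * quarter_pi *
            (actions[action_idx + half_num_motors] + offset_for_singularity))
        # extension: +/- 45 degrees * extension action, alternating per motor...
        extension_component = (-1) ** i * quarter_pi * actions[action_idx]
        # ...and flipped again for the second half of the motors
        if i >= half_num_motors:
            extension_component = -extension_component
        # final angle is centred around pi
        motor_angle[i] = math.pi + forward_backward_component + extension_component
    return motor_angle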

Ok my error is,

  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 350, in step
    action = self._transform_action_to_motor_command(action)
  File "/opt/gym-robotable/gym_robotable/envs/robotable_gym_env.py", line 313, in _transform_action_to_motor_command
    action = self.robotable.ConvertFromLegModel(action)
AttributeError: 'Robotable' object has no attribute 'ConvertFromLegModel'

Ok anyway, I debugged for an hour and now it’s doing something. It’s saving numpy files now.

policy_RobotableEnv-v0_20200516-192435.npy contains:

\x93NUMPYv{'descr': '<f8', 'fortran_order': False, 'shape': (4, 16), }
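That’s just the .npy header; reading it back confirms a (4, 16) array of float64:

import numpy as np

policy = np.load("policy_RobotableEnv-v0_20200516-192435.npy")
print(policy.dtype, policy.shape)  # float64 (4, 16)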

Cool.

But yeah, I had to comment out a lot of stuff. Seems like the actions it’s generating are mostly 0.

Since I simplified to a table, turns out I don’t need any of that ConvertFromLegModel code.


Ok anyway, I started over with minitaur. lol. Why are there two tables? Changing the motorDirections gave me this. Good progress.

Categories
Vision

Early ConvNet visualisations

https://link.springer.com/article/10.1186/s40648-019-0141-2

https://imgur.com/a/Hqolp

AxCell: Automatic Extraction of Results from Machine Learning Papers

https://arxiv.org/abs/2004.14356

Categories
Locomotion robots

Soft Tensegrity Robots

Soft Tensegrity Robots, jiggling around:

https://www.youtube.com/watch?v=SuLQDhrk9tQ

“Neat” – Youtube comment

Categories
AI/ML evolution robots

resibots

Another PhD collab thing, on a European Research Council grant (2015-2020): https://www.resibots.eu/videos.html. Nice. They’re the ones who developed MAP-Elites: https://arxiv.org/abs/1504.04909

They had a paper published in Nature (https://members.loria.fr/JBMouret/nature_press.html), for their bots that fix themselves.

MAP-Elites is interesting. It categorises behaviours and keeps the local optima it finds across user-chosen dimensions of variation. Haven’t read the paper yet. It is windy.

“It creates a map of high-performing solutions at each point in a space defined by dimensions of variation that a user gets to choose. This Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm illuminates search spaces, allowing researchers to understand how interesting attributes of solutions combine to affect performance, either positively or, equally of interest, negatively.”
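From that description, the loop seems simple enough to sketch (my guess at the structure, not the authors’ code; assumes behaviour descriptors normalised to [0, 1], and `evaluate` is a stand-in returning (fitness, behaviour) for a genome):

import random
import numpy as np

def map_elites(evaluate, genome_dim=8, cells=10, iterations=10000):
    archive = {}  # behaviour cell (tuple) -> (fitness, genome)
    for _ in range(iterations):
        if archive and random.random() < 0.9:
            # mutate a random elite from the archive
            _, parent = random.choice(list(archive.values()))
            genome = parent + 0.1 * np.random.randn(genome_dim)
        else:
            # otherwise start from a random genome
            genome = np.random.uniform(-1, 1, genome_dim)
        fitness, behaviour = evaluate(genome)
        # discretise the behaviour descriptor into a grid cell
        cell = tuple(np.clip((np.asarray(behaviour) * cells).astype(int),
                             0, cells - 1))
        # keep only the best genome seen per cell: the "elite"
        if cell not in archive or fitness > archive[cell][0]:
            archive[cell] = (fitness, genome)
    return archive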

Categories
dev envs

Google Football Env

https://github.com/google-research/football
Google made an OpenAI Gym environment for playing football.

It looks better than FIFA.
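Kicking the tyres looks something like this, going by their README (the env name is one of their example “academy” scenarios):

import gfootball.env as football_env

env = football_env.create_environment(
    env_name="academy_empty_goal_close", render=True)
obs = env.reset()
done = False
while not done:
    # random agent, just to watch the environment run
    obs, reward, done, info = env.step(env.action_space.sample())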

Categories
AI/ML deep highly_speculative

Cat Papers

Someone collected all the cat-related AI papers: https://github.com/junyanz/CatPapers http://people.csail.mit.edu/junyanz/cat/cat_papers.html

Categories
CNNs deep Vision

MeshCNN

Currently we have LSD-SLAM working, and that’s cool for us humans to see stuff, but having an object mesh to work with makes more sense. I don’t know if there’s really any difference, but at least in terms of simulator integration, it makes sense. There’s object detection, semantic segmentation, etc., and in the end I want the robot to have a relative coordinate system, in a way. But robots will probably get by with just pixels and stochastic magic.

But the big idea for me, here, is to transform monocular camera images into mesh objects. Those .obj files, or whatever, could be imported into the physics engine, for training in simulation.
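In pybullet, that import step is about this simple (“mesh.obj” being a hypothetical output file from the reconstruction step):

import pybullet as p

p.connect(p.DIRECT)
collision = p.createCollisionShape(p.GEOM_MESH, fileName="mesh.obj")
visual = p.createVisualShape(p.GEOM_MESH, fileName="mesh.obj")
body = p.createMultiBody(baseMass=1.0,
                         baseCollisionShapeIndex=collision,
                         baseVisualShapeIndex=visual,
                         basePosition=[0, 0, 1])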

arxiv: https://arxiv.org/pdf/1809.05910v2.pdf

github: https://ranahanocka.github.io/MeshCNN/

The PhD candidate: https://www.cs.tau.ac.il/~hanocka/ – In the Q&A at the end, she mentions AtlasNet https://arxiv.org/abs/1802.05384 as only being able to address local structures. Latest research looks interesting too https://arxiv.org/pdf/2003.13326.pdf

ShapeNET https://arxiv.org/abs/1512.03012 seems to be a common resource, and https://arxiv.org/pdf/2004.15004v2.pdf and these obj files might be interesting https://www.dropbox.com/s/w16st84r6wc57u7/shrec_16.tar.gz