Somehow didn't find this until now, but it divides papers into categories, within machine learning topics/tasks


Hope that works. It’s that guy on YouTube who says ‘dear scholars’ and ‘what a time to be alive’.

looks like a nice UI for stuff

Sim2Real links

Using Simulation and Domain Adaptation to Improve Efficiency of Deep Robotic Grasping:

Optimizing Simulations with Noise-Tolerant Structured Exploration:

The ‘magic’ underlying PyTorch

That is true. As I wrote earlier, PyTorch is a jacobian-vector product engine. In the process it never explicitly constructs the whole Jacobian. It’s usually simpler and more efficient to compute the JVP directly.


jacobian-vector products?
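To answer my own question: a Jacobian-vector product is J·v, the directional derivative of a function along v, and it can be computed without ever building J. Below is a minimal sketch in plain Python using dual numbers (forward mode). It is not how PyTorch does it internally; strictly speaking, PyTorch's backward() runs in reverse mode and computes the transposed version (a vector-Jacobian product, vᵀ·J). The `Dual` class, the `jvp` helper and the test function `f` are all made up for illustration.

```python
import math


class Dual:
    """A number val + tan*eps with eps**2 == 0; tan carries the derivative."""

    def __init__(self, val, tan=0.0):
        self.val = val
        self.tan = tan

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule: (uv)' = u'v + uv'
        return Dual(self.val * other.val,
                    self.tan * other.val + self.val * other.tan)

    __rmul__ = __mul__


def _lift(x):
    """Wrap plain numbers as constants (zero tangent)."""
    return x if isinstance(x, Dual) else Dual(float(x))


def sin(x):
    # Chain rule: d/dt sin(u) = cos(u) * u'
    return Dual(math.sin(x.val), math.cos(x.val) * x.tan)


def f(x0, x1):
    """Toy function R^2 -> R^2 (an assumption, just something to differentiate)."""
    return [x0 * x1, sin(x0) + x1 * x1]


def jvp(func, point, v):
    """Return J(point) @ v in one pass, without constructing J."""
    duals = [Dual(p, t) for p, t in zip(point, v)]
    return [out.tan for out in func(*duals)]


point = [2.0, 3.0]
v = [1.0, 0.5]
result = jvp(f, point, v)

# Cross-check against the Jacobian written out by hand.
J = [[3.0, 2.0],
     [math.cos(2.0), 6.0]]
expected = [sum(J[i][j] * v[j] for j in range(2)) for i in range(2)]
print(result, expected)
```

The point is that the cost is one evaluation of f with slightly fatter numbers, rather than one evaluation per column of J.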


OpenAI Jukebox

This is trippy shit.

Gym Env

Starting from pybullet/gym/pybullet_envs/bullet/

and going by these instructions

First had to remember the protocol buffer stuff

protoc --proto_path=src --python_out=build/gen src/foo.proto src/bar/baz.proto

When I try to run the gym env, I get:

Ok it was some typo in the

Fixed that up in the imports and the __init__.py files, and now got this issue:

Ok I think that was some file that was importing tensorflow 2. I don’t want to use tensorflow. It’s like 400MB.

so ok, let’s put it there.

Ok so I’ve got robot.urdf looking like a table, and loading. But there’s various differences between the minitaur gym env and my own minitaur, the robotable. Gotta compare, etc. ok it wants reset() not Reset(…)

just a warning:

happy with 1 defaults for mass and inertia settings, for now.

Ok so apparently just working my way through the bugs one by one. That’s probably as good a method as any :

K did a :%s/Get/get/g in vi. NEXT.

ok self.nMotors = 4

k lets change nMotors to num_motors. NEXT
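For reference, here is roughly the shape the renames above are heading towards: lowercase reset() and step(), and snake_case attributes like num_motors. This is only a sketch under assumptions. The class name `RobotableEnvSketch` is hypothetical, it deliberately does not import gym or pybullet so it runs on its own, and it uses the classic 4-tuple return from step() (newer Gymnasium versions return five values).

```python
class RobotableEnvSketch:
    """Hypothetical stand-in for the real env, which would subclass gym.Env
    and step the pybullet simulation instead of echoing the action."""

    def __init__(self, num_motors=4, max_steps=10):
        self.num_motors = num_motors
        self.max_steps = max_steps
        self.steps = 0

    def reset(self):
        """Lowercase reset(): start a new episode, return the first observation."""
        self.steps = 0
        return [0.0] * self.num_motors

    def step(self, action):
        """Apply one action, return (observation, reward, done, info)."""
        assert len(action) == self.num_motors
        self.steps += 1
        observation = list(action)  # placeholder, not real motor state
        reward = 0.0                # placeholder reward
        done = self.steps >= self.max_steps
        info = {}
        return observation, reward, done, info


env = RobotableEnvSketch()
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step([0.0] * env.num_motors)
print("episode finished after", env.steps, "steps")
```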

Hindsight Experience Replay (HER)

Hindsight Experience Replay

(Instead of getting a crap result and saying, wow what a crap result, you say, ah well if that’s what I had wanted to do, then that would have worked.)

It’s a clever idea, OpenAI.

One ability humans have, unlike the current generation of model-free RL algorithms, is to learn almost as much from achieving an undesired outcome as from the desired one.

“So why not just pretend that we wanted to achieve this goal to begin with, instead of the one that we set out to achieve originally?”

The HER algorithm achieves this by using what is called “sparse and binary” rewards, which only provide an indication to the agent that either it has failed or succeeded. In contrast, the “dense,” “shaped” rewards used in conventional reinforcement learning tip agents off as to whether they are getting “close,” “closer,” “much closer,” or “very close” to hitting their goal. Such so-called dense rewards can speed up the learning process, but the drawback is that these dense rewards often don’t contain much of a learning signal for the agent to learn from, and can be difficult to design and implement for real-world applications.
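The relabelling trick is easier to see in code. Below is a toy sketch, not OpenAI's implementation: a random walk on the integer line that is asked to reach a goal it cannot reach in time, followed by the "final" relabelling strategy, where the state the episode actually ended in is treated as if it had been the goal all along. The function names, the 1D environment and the -1/0 reward values are assumptions chosen for illustration.

```python
import random


def sparse_reward(achieved, goal):
    """Binary reward: 0 on success, -1 otherwise. No 'getting closer' hints."""
    return 0.0 if achieved == goal else -1.0


def run_episode(start, goal, horizon, rng):
    """Random walk on the integers; returns a list of transition dicts."""
    transitions = []
    state = start
    for _ in range(horizon):
        action = rng.choice([-1, 1])
        next_state = state + action
        transitions.append({
            "state": state,
            "action": action,
            "next_state": next_state,
            "goal": goal,
            "reward": sparse_reward(next_state, goal),
        })
        state = next_state
    return transitions


def her_relabel_final(transitions):
    """Pretend the state we ended up in was the goal we wanted all along."""
    new_goal = transitions[-1]["next_state"]
    return [
        dict(t, goal=new_goal, reward=sparse_reward(t["next_state"], new_goal))
        for t in transitions
    ]


rng = random.Random(0)
episode = run_episode(start=0, goal=100, horizon=5, rng=rng)  # goal is out of reach
relabelled = her_relabel_final(episode)

# Both versions go into the replay buffer; the relabelled copy now contains
# at least one successful transition to learn from.
replay_buffer = episode + relabelled
print([t["reward"] for t in episode])
print([t["reward"] for t in relabelled])
```

The original episode is all failures, so it carries almost no signal. The relabelled copy ends in a success by construction, which is the whole point.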


Hindsight experience replay (HER) based on universal value functions shows promising results in such multi-goal settings by substituting achieved goals for the original goal, frequently giving the agent rewards. However, the achieved goals are limited to the current policy level and lack guidance for learning. We propose a novel guided goal-generation model for multi-goal RL named G-HER. Our method uses a conditional generative recurrent neural network (RNN) to explicitly model the relationship between policy level and goals, enabling the generation of various goals conditioned on the different policy levels.


Robot “Forward Kinematics”

This doesn’t seem necessary any more, because we have simulators. But interesting to see what people had to do back in the day:

Denavit-Hartenberg (DH) parameters
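To get a feel for what people did by hand, here is a small sketch of forward kinematics using the classic (standard) DH convention, where each joint contributes one 4x4 transform built from (theta, d, a, alpha). The two-link planar arm with unit link lengths is an assumption picked because the answer is easy to check by eye. It has nothing to do with the robotable.

```python
import math


def dh_transform(theta, d, a, alpha):
    """Homogeneous transform for one joint, classic DH convention."""
    ct, st = math.cos(theta), math.sin(theta)
    ca, sa = math.cos(alpha), math.sin(alpha)
    return [
        [ct, -st * ca, st * sa, a * ct],
        [st, ct * ca, -ct * sa, a * st],
        [0.0, sa, ca, d],
        [0.0, 0.0, 0.0, 1.0],
    ]


def matmul(A, B):
    """Plain 4x4 matrix product, to avoid needing numpy."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def forward_kinematics(dh_rows):
    """Chain the per-joint transforms from the base to the end effector."""
    T = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    for theta, d, a, alpha in dh_rows:
        T = matmul(T, dh_transform(theta, d, a, alpha))
    return T


# Two-link planar arm: both links 1 unit long, no offsets, no twist.
theta1 = math.radians(90)   # first link points straight up
theta2 = math.radians(-90)  # second link bends back to horizontal
pose = forward_kinematics([
    (theta1, 0.0, 1.0, 0.0),
    (theta2, 0.0, 1.0, 0.0),
])
x, y = pose[0][3], pose[1][3]
print(round(x, 6), round(y, 6))
```

The end effector position is the last column of the final matrix. For this arm it should match the textbook formula x = cos(θ1) + cos(θ1 + θ2), y = sin(θ1) + sin(θ1 + θ2).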


ALAE – Adversarial Latent Autoencoders

Similar to GANs, but a bit cleaner. Image training encodes a latent representation of the ur-Celebrity, and new images are generated from it using another image as an input.

Latent: (of a quality or state) existing but not yet developed or manifest; hidden or concealed.

ROS Camera Topic

What is a ros topic?
ROS can publish the webcam stream to a “topic”, and any part of the robot can subscribe to it, by name, if it is interested in that data. ROS is almost like a program where everything is a global variable.
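The publish/subscribe-by-name idea can be mimicked in a few lines of plain Python. To be clear, this is a toy and not the ROS API: real nodes would use rospy.Publisher and rospy.Subscriber and talk across processes via the ROS master. The `ToyTopicBus` class and the message dict are made up for illustration.

```python
from collections import defaultdict


class ToyTopicBus:
    """In-process stand-in for topics: names mapped to lists of callbacks."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register interest in a topic by name."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver a message to everyone subscribed to that name."""
        for callback in self._subscribers[topic]:
            callback(message)


bus = ToyTopicBus()
received = []

# Two independent "nodes" listening to the same camera topic.
bus.subscribe("/image_raw", received.append)
bus.subscribe("/image_raw", lambda msg: print("got frame", msg["seq"]))

# The "camera node" publishes without knowing who is listening.
for seq in range(3):
    bus.publish("/image_raw", {"seq": seq, "width": 640, "height": 480})

# Nobody subscribed to this one, so the message just goes nowhere.
bus.publish("/unused_topic", {"seq": 99})
```

The publisher never references its subscribers directly, only the topic name, which is where the "everything is a global variable" feeling comes from.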

I made this file for the laptop webcam, but then didn’t end up using it.

<launch>
  <group ns="camera">
    <node pkg="libuvc_camera" type="camera_node" name="mycam">
      <!-- Parameters used to find the camera -->
      <param name="vendor" value="0x2232"/>
      <param name="product" value="0x1082"/>
      <param name="serial" value=""/>
      <!-- If the above parameters aren't unique, choose the first match: -->
      <param name="index" value="0"/>

      <!-- Image size and type -->
      <param name="width" value="640"/>
      <param name="height" value="480"/>
      <!-- choose whichever uncompressed format the camera supports: -->
      <param name="video_mode" value="uncompressed"/> <!-- or yuyv/nv12/mjpeg -->
      <param name="frame_rate" value="15"/>

      <param name="timestamp_method" value="start"/> <!-- start of frame -->
      <param name="camera_info_url" value="file:///tmp/cam.yaml"/>

      <param name="auto_exposure" value="3"/> <!-- use aperture_priority auto exposure -->
      <param name="auto_white_balance" value="false"/>
    </node>
  </group>
</launch>


apt install ros-melodic-uvc-camera

rospack list-names

rosrun uvc_camera uvc_camera_node _device:=/dev/video0

rostopic list

(should show /image_raw now…)

rosrun dso_ros dso_live calib=/opt/catkin_ws/src/dso_ros/camera.txt image:=/image_raw/