“They observed that many quadrupedal, mammalian animals feature a distinguished functional three-segment front leg and hind leg design, and proposed a “pantograph” leg abstraction for robotic research.”
1 DOF (degree of freedom). 1 motor. Miranda wants jointed legs, and I don’t want to work out inverse kinematics, so this looks ideal. Maybe a bit complicated still.
The simpler force diagram:
Compliance is a feature, typically made possible by springs.
A homemade attempt here with the Mojo robot of the Totally Not Evil Robot Army. Their robot only uses 9g servos, and can’t quite pick itself up.
I did an initial design with what I had around, and it turns out compliance is a delicate balance. Too much spring, and it just mangles itself up. Too little spring and it can’t lift off the ground.
Further iterations removed the springs, which were far too tight, and used cable ties to straighten the legs, but the weight of the robot is a bit too much for the knee joints.
I will likely leave it until I have a 3D printer and some better springs, and give it another try with more tools and materials available. Maybe even hydraulics, some day.
This post follows the ‘Finding where we left off’ post, focused on locomotion sim2real. In that post I tried to generalise and smooth the leg angle servo movements in their -PI/2 to PI/2 range.
I will likely try extracting kMPs (kinematic motion primitives) before this is all over. From a skim read, and a look at the pictures, they are like taking a single slice of the wave data and repeating it, or taking consecutive periodic waves and extracting the average / normalised movement from them.
They just used PCA (Principal Component Analysis), which is a common ML toolkit thing.
See now this is where they lose me: “The covariance matrix of the normalized dataset”. Come on guys. Throw us a bone.
I found this picture, which is worth 1000 words, in the discussion on stackexchange about PCA and SVD:
So, I’m not quite ready for PCA. That picture is two dimensions, anyway. Oh right, so I need to add a ‘time’ dimension. numpy’s expand_dims?
I played around with Codex to assist with finding the peaks and the period length.
And I separated them out into different plots… and got the peaks matching once I passed in distance=80.
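Roughly like this, I think (scipy’s find_peaks; front_right is standing in for one leg’s angle series):

import numpy as np
from scipy.signal import find_peaks

# front_right: hypothetical 1D array of one leg's servo angles over time
peaks, _ = find_peaks(front_right, distance=80)   # ignore peaks closer than 80 steps apart
periods = np.diff(peaks)                          # step counts between consecutive peaks
print(periods, periods.mean())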
I had to install these and restart the Jupyter kernel (and I think close and reopen the Chrome tab) in order to get some matplotlib widgets.
Error message:
Jupyter Lab: Error displaying widget: model not found
!pip3 install --upgrade jupyterlab ipympl
%matplotlib widget
I started on a slider widget to draw a vertical line on top of the leg data, but I need to fix the refresh issue. Anyhow, it’s not quite what I want. What do I want?
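The slider cell looked something like this sketch (assuming the ipympl backend is active and leg_data is one leg’s logged angles):

import matplotlib.pyplot as plt
from ipywidgets import interact

fig, ax = plt.subplots()
ax.plot(leg_data)                 # leg_data: hypothetical 1D array of angles
marker = ax.axvline(0, color='red')

@interact(t=(0, len(leg_data) - 1, 1))
def move_marker(t=0):
    # drag the slider to move the vertical line along the time axis
    marker.set_xdata([t, t])
    fig.canvas.draw_idle()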
So, I want the kMPs. A kMP is like a gif of a basic action, e.g. the robot taking a full step forward on all legs, which we can run once, twice, etc.
We can ‘average’ or ‘normalise’ or ‘phase’ the waves, and assume that gives us a decent average step forward.
I think there’s enough variation in this silly simulation walk that we should start with just the simplest, best single wave.
But since they ran PCA, let’s run it and see what it does for the data. Each leg is a single value over time, which is 1D. To make the dataset 2D, so we can run PCA on it… we add a time dimension?
I also measured the period a few programs up, to be:
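One way to get a 2D dataset is to stack the four legs’ angle series as columns and run PCA on that; a sketch, with the array names assumed:

import numpy as np
from sklearn.decomposition import PCA

# front_right, front_left, back_right, back_left: hypothetical 1D angle arrays
X = np.stack([front_right, front_left, back_right, back_left], axis=1)  # shape (T, 4)
X = (X - X.mean(axis=0)) / X.std(axis=0)   # normalise, so PCA sees the covariance of z-scores

pca = PCA(n_components=2)
components = pca.fit_transform(X)          # (T, 2): the dominant shared motion patterns
print(pca.explained_variance_ratio_)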
67 steps (front right),
40 steps (front left),
59 steps (back right),
42 steps (back left).
So, as a starting point, it would be nice to be as close to servos at 90 degrees as possible. If I iterate over the values and track the lowest sum of differences, yeah… is that it? I’m looking at this link on SO.
Ideally I could visualise the options…
Repeating a slice. Averaging the slices.
Ok, so I need a start index and an end index to slice a range.
After some investigation, the index where the legs are closest to 90 degrees is 1739.
Computer Enhance
So that’s kinda close to our ideal kMP, from about 1739 to about 1870 maybe, but clearly the data is messy. Could be tweaked. It’s a wavetable editor, basically.
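The search and the slice-and-repeat idea, as a rough sketch (angles is assumed to be a (T, 4) array of the four servo angles in degrees):

import numpy as np

deviation = np.abs(angles - 90).sum(axis=1)   # how far all legs are from 90°, per timestep
start = int(np.argmin(deviation))             # came out around 1739 for my data
end = start + 131                             # roughly one full step, ending near 1870

kmp = angles[start:end]                       # the candidate kinematic motion primitive
looped = np.tile(kmp, (10, 1))                # repeat the slice to replay the step ten times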
Alright, let’s make an app. We can try running a Flask server on the Pi, with a Javascript front end using chart.js.
pip3 install flask
Save the test web app, kmpapp.py
from flask import Flask

app = Flask(__name__)

@app.route('/')
def index():
    return 'Hello world'

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')
python3 kmpapp.py
Ok, good start. We need to get the x and y data into JSON so Javascript can plot it in chart.js.
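Something along these lines (kmp_data.npy is a hypothetical dump of the logged leg angles):

from flask import Flask, jsonify
import numpy as np

app = Flask(__name__)

angles = np.load('kmp_data.npy')   # assumed shape (T, 4)

@app.route('/data')
def data():
    return jsonify({
        'x': list(range(len(angles))),   # timestep index for the chart.js x-axis
        'y': angles.tolist(),            # four angle series for the datasets
    })

if __name__ == '__main__':
    app.run(debug=True, host='0.0.0.0')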
That’s looking good. Maybe too many points. Ok, so I want to edit, save, and run the KMPs on the robot.
Well it took a day but it’s working, and is pretty cool. Used smooth.js to allow smoother transitions. Took another day to add save and load features.
I’ve been scouring for existing code to help with developing the gripper in simulation. I was looking for a way to implement ‘eye-in-hand’ visual servoing, and came across a good resource, created for a master’s thesis, which shows a ‘robot vision’ window and compares depth-sensing algorithms. My approach was going to be, essentially, segmentation: detect and localise chickens and eggs in the field of vision, get their shapes into X-Y coordinate positions, and, above a certain size, initiate interaction.
That one uses an SDF model of a KUKA industrial 6 DOF robot with a two-finger gripper, but it has specific rotational movement that seems different from a simpler robot arm. So it’s maybe a bit overkill, and I might just want to see his camera code.
Miranda’s gripper prototype isn’t a $50k KUKA industrial robot arm. It’s just v0.1 and has an 11 kg·cm MG945, some 5 kg·cm MG5010s, and a 1.3 kg·cm SG90, plus a sucker contraption I found on DFRobot that can suck eggs.
So, regarding the simulation, this will be on top of the robot, as its head.
So we need a URDF file. Or an SDF file. There are a couple of ways to go with this.
The other resource I’ve found that looks like just what I need is ur5pybullet.
Regarding the ‘visual servoing’, the state of the art appears to be QT-Opt, perhaps. Or maybe RCAN, built on top of it. But we’re not there just yet. Another project specifically uses pybullet. Some extra notes here from Sergey Levine and co., who are associated with most of these projects.
Another good one is Retina-GAN, where they convert both simulation and reality into a canonical format. I’ve also come across Dex-Net before, from UCB.
Everything is very complicated though.
I’ve managed to make a URDF that looks good enough to start with, though. I’ll put everything in a github. We want to put two servos on the ‘head’ for animatronic emotional aesthetics, but there’s a sucker contraption there for the egg, so I think this is good enough for simulation, for now, anyway. I just need to put a camera on its head, put some eggs in the scene, and maybe reward stable contact with the tip. Of course it’s going to be a lot of work.
We also want to add extra leg parts, but I don’t want to use 4 more motors on it.
So I’m playing around with some aluminium and timing belts and pulleys to get 8 leg parts on 4 motors. Something like this, with springs if we can find some.
So, simulator camera vision. I can enable the GUI. Turns out I just need to press ‘g’ to toggle.
Ok, I’ve got the visuals now, but I shouldn’t be seeing that shadow.
The camera is like 90 degrees off maybe. Could be an issue with the camera setup, or maybe the URDF setup? Ok…
Changing the initial camera vector fixed the view somewhat:
init_camera_vector = (0, 0, 1) # z-axis
Except that we’re looking backwards now.
init_camera_vector = (0, 0, -1) # negative z-axis
Ok well it’s correct now, but heh, hmm. Might need to translate the camera just a bit higher.
I found a cool free chicken obj file with Creative commons usage. And an egg.
Heh need to resize obj files. Collision physics is fun.
Ok I worked out how to move the camera a bit higher.
pos = list(pos)
pos[2] += 0.3
pos = tuple(pos)
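Putting the pieces together, roughly how the head camera ends up getting rendered in pybullet (robot_id and head_link_index are placeholders for however the robot is loaded; the vectors are the ones settled on above):

import numpy as np
import pybullet as p

pos, orn = p.getLinkState(robot_id, head_link_index)[:2]
pos = list(pos)
pos[2] += 0.3                        # lift the camera a bit above the head link
pos = tuple(pos)

rot = np.array(p.getMatrixFromQuaternion(orn)).reshape(3, 3)
camera_vector = rot @ np.array((0, 0, -1))   # forward, rotated into world frame
up_vector = rot @ np.array((0, 1, 0))        # up, rotated into world frame

view_matrix = p.computeViewMatrix(pos, (np.array(pos) + camera_vector).tolist(), up_vector.tolist())
proj_matrix = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.02, farVal=5.0)
width, height, rgb, depth, seg = p.getCameraImage(128, 128, view_matrix, proj_matrix)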
Alright! Getting somewhere.
So, next, I add resized eggs and some chickens for good measure, to the scene.
Then we need to train it to stick its shnoz on the eggs.
Ok… gonna have to train this sucker now.
First, the table is falling from the sky, so I might need to stabilize it first. I also need to randomize the egg location a bit.
And I want to minimize the distance between the gripper attachment and the egg.
The smart way is probably to have it walk until some condition and then grasp, but in the spirit of letting the robot learn things by itself, I will probably ignore heuristics. If I do decide to use heuristics, it will probably be a finite state machine with ‘walking’ mode and ‘gripping’ mode. But we’ll come back to this when it’s necessary. Most of the time there won’t be any eggs in sight. So it will just need to walk around until it is sure there is an egg somewhere in sight.
And the end effector’s position should be something like the original camera position before we moved it up a bit, plus the length of the end effector in the URDF (0.618). I ended up doing this:
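The exact snippet aside, the idea is roughly the following sketch (head_pos, camera_vector and egg_id are placeholders; head_pos is the head link position before the +0.3 camera lift):

import numpy as np
import pybullet as p

END_EFFECTOR_LENGTH = 0.618   # from the URDF

# tip position: head link position plus the effector length along the camera direction
end_effector_pos = np.array(head_pos) + END_EFFECTOR_LENGTH * camera_vector

# reward-shaping idea: negative distance from the effector tip to the egg
egg_pos, _ = p.getBasePositionAndOrientation(egg_id)
reward = -np.linalg.norm(end_effector_pos - np.array(egg_pos))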
After checking out the speed of image segmentation on the Raspberry Pi (like one frame every 10 seconds maybe?), and my i3 laptop not being much better, I realised I needed more computing power, at least to train the neural networks. I can probably still ultimately run the neural network on the Pi, but we’ll see.
Looking at computing options, I ultimately went with the $399 NVIDIA Jetson Xavier NX.
Developer Kit Technical Specifications:
GPU: NVIDIA Volta™ architecture with 384 NVIDIA® CUDA® cores and 48 Tensor cores
Networking: Gigabit Ethernet, M.2 Key E (WiFi/BT included), M.2 Key M (NVMe)
Display: HDMI and DP
USB: 4x USB 3.1, USB 2.0 Micro-B
Others: GPIOs, I2C, I2S, SPI, UART
Mechanical: 103 mm x 90.5 mm x 34 mm
Also, did you know they made a dystopic reboot retcon of The Jetsons, that 70s retro-futuristic Hanna-Barbera cartoon, in comic form? An ice meteor destroyed Earth. They were lucky to have had a place in space to go, working for Spacely Space Sprockets, Incorporated. (YT link)
It took an hour to set up, and was mostly straightforward, though I had to get a ‘clover plug’ cable, and an SD card.
I used Etcher to load the latest 6GB Jetson Developer Kit SD image, and had a keyboard, mouse, and hdmi monitor that worked. So I was able to enter the wifi SSID and password while setting it up.
I learned that one option for headless installation is to use a USB cable from your computer to the micro-USB input of the Jetson. But ultimately this wasn’t necessary. I ran ifconfig on the Jetson, got an IP address, and connected with ssh.
After needing to change the wifi details, I used the USB cable, then connected with:
Coming back to this later, I attempted the same, but with the Jetson Nano, instead of the Jetson Xavier, and it didn’t work. I learned that the Nano doesn’t come with a Wifi adapter.
I think with the Nano (“B01”) you need a monitor to install. I tried multiple tutorials, ssh’ing 192.168.55.1, and I tried using screen to connect to /dev/ttyACM0 at 115200 baud. Nope. Looked at the forums, and it’s complicated. I didn’t try the USB UART because my USB-TTL converter’s cable colours are different.
Another method that worked with the Nano is plugging the ethernet cable from the Nano directly into the wifi router. It then shows up on the router’s network.
Later, when trying to install a D-Link ‘Wireless N Nano USB Adaptor’ (for the love of God, just get an Edimax – they work out of the box), I connected over ssh with the ethernet cable from the Jetson to the router, downloaded the driver, unzipped and untarred it, and ran `make` and `make install` as per the instructions, but I had to run export ARCH=arm64 first, because it was looking in aarch64. Then rebooted. Then:
chicken@chicken:~$ sudo nmcli device wifi connect 'ssid' password 'password'
[sudo] password for chicken:
Device 'wlan0' successfully activated with '3a7997e6-c6b1-40f7-bf93-fba5b110282c'.
A lot of research will have to happen again now, though, because NVIDIA has its own software ecosystem. I’ll need a vision solution that is portable to the Pi, with the hope that a model or neural net trained on the Jetson will still be able to run on the Raspberry Pi, since it’s 40X cheaper.
The lingo takes some time to get used to, but I believe JetPack is the name for this OS image of preinstalled NVIDIA docs and libraries.
Since last year, there’s been an algorithm called… the Transformer… which has just recently produced a hell of a chat bot with GPT-3, and which underlies Google search as BERT (Bidirectional Encoder Representations from Transformers).
And there are hybrid convolutional nets and transformers, e.g. DETR, and there is last year’s SOTA, EfficientNet. And then, for some instances, or most, YOLOv4 is meant to be the new hot algorithm. It’s bigger than YOLOv3. Wait, so it’s more frames per second, and the accuracy (AP) is kinda so-so, at 5% less. I realised that YOLOv5, which I had seen, is a PyTorch implementation and is faster, though it’s technically just someone being a bit of a douchebag and calling his implementation of the authors’ peer-reviewed work the next version, YOLOv5. So what now?
YOLOv4 vs. YOLOv5
So, PyTorch vs. Tensorflow ( vs. TensorRT),
NVIDIA has this 3D simulator environment in Unreal Engine: Isaac. It’s something like an API for robots, by NVIDIA. They got this robot working with it, apparently.
It’s actually pretty good. I wonder if this https://developer.nvidia.com/deepstream-sdk is as cool as it sounds. Ah, closed source. Of course. But I can apply to join. Eh maybe.
So, I want to get these chickens into a convolutional neural network, or a transformer and output a pretty picture. I want the colour masks, not the bounding boxes.
I don’t want to get too caught up in a proprietary NVIDIA-specific API, even if they have an Unreal Engine simulator. But it might be worth checking out. DeepStream is built on top of GStreamer, which is open source, so maybe it’s back on the menu.
But it’s a whole integrated thing. “The DeepStream SDK can be used to build end-to-end AI-powered applications to analyze video and sensor data. Some popular use cases are: retail analytics, parking management, managing logistics, robotics, optical inspection and managing operations.”
Nice. DeepStream supports several popular networks out of the box such as YOLO, FasterRCNN, SSD, RetinaNet and MaskRCNN.
I get the sense that Gems, in the Isaac world, are like ROS nodes, offering services on a port. ORB is a Gem. Ultimately, a prediction, or reconstruction in 3D, of the shape of objects in the world would be ideal. I’m only doing the colour map stuff because the colours are nice and it looks more impressive. But ultimately I will need to pick the best tool for the job.
NVIDIA also has DIGITS, the Deep Learning GPU Training System, which “puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train the highly accurate deep neural network (DNNs) for image classification, segmentation and object detection tasks.”
So as you can see, there’s more to find out. But ultimately I will probably have to repeat the task of getting labelled data into folders and having the labels in the right format, then generating the TFRecords, or doing whatever you do in PyTorch. I’m still biased towards the TensorFlow Object Detection API 2 implementation, because Google’s got the best dataset of chickens.
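For reference, the bare-bones shape of writing one labelled image into a TFRecord looks something like this (the feature keys here are just the simple classification ones; the Object Detection API wants a richer schema with boxes and masks):

import tensorflow as tf

def image_example(image_path, label):
    # pack one labelled image into a tf.train.Example
    image_bytes = tf.io.read_file(image_path).numpy()
    feature = {
        'image/encoded': tf.train.Feature(bytes_list=tf.train.BytesList(value=[image_bytes])),
        'image/class/label': tf.train.Feature(int64_list=tf.train.Int64List(value=[label])),
    }
    return tf.train.Example(features=tf.train.Features(feature=feature))

with tf.io.TFRecordWriter('chickens.tfrecord') as writer:
    writer.write(image_example('chicken_001.jpg', 1).SerializeToString())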
“Meta-World is an open-source simulated benchmark for meta-reinforcement learning and multi-task learning consisting of 50 distinct robotic manipulation tasks. We aim to provide task distributions that are sufficiently broad to evaluate meta-RL algorithms’ generalization ability to new behaviors.”
A couple of advanced data science algorithms. I implemented both for the walking table. ARS (Augmented Random Search) is great. We hear a lot about deep learning; this one is shallow learning, and does very well on simpler tasks. Just inputs and outputs, no hidden layers.
It’s similar to the Evolution Strategies algorithm: generally trying some random stuff out, and slowly changing the model based on what gets you closer to the goal.
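A stripped-down sketch of the idea (the full ARS also normalises states and scales by the reward standard deviation; env is assumed to be a gym-style environment with reset()/step()):

import numpy as np

def rollout(env, weights, horizon=1000):
    state, total = env.reset(), 0.0
    for _ in range(horizon):
        action = weights @ state                 # linear policy: action = W * state
        state, reward, done, _ = env.step(action)
        total += reward
        if done:
            break
    return total

def ars(env, obs_dim, act_dim, steps=100, n_deltas=8, step_size=0.02, noise=0.03):
    W = np.zeros((act_dim, obs_dim))
    for _ in range(steps):
        deltas = [np.random.randn(*W.shape) for _ in range(n_deltas)]
        rewards = [(rollout(env, W + noise * d), rollout(env, W - noise * d)) for d in deltas]
        # move the weights toward the perturbations that scored better
        update = sum((rp - rm) * d for (rp, rm), d in zip(rewards, deltas))
        W += step_size / n_deltas * update
    return W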
Since these are the meaty bits where the robot is configured, we need to understand links and joints first. Here is the bit from the docs:
Ok, so that’s the basics, though it’s pretty confusingly worded there. But basically, because the base body isn’t counted as a link, you have the same number of links as joints, since you add one joint and one link each time you add a new body part.
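A quick way to see that bookkeeping in pybullet, using the bundled r2d2.urdf just as an example:

import pybullet as p
import pybullet_data

p.connect(p.DIRECT)
p.setAdditionalSearchPath(pybullet_data.getDataPath())
robot = p.loadURDF("r2d2.urdf")

# one joint per non-base link: joint i connects some parent link to child link i
for i in range(p.getNumJoints(robot)):
    info = p.getJointInfo(robot, i)
    print(i, info[1].decode(), "-> child link:", info[12].decode())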
Maybe good to recall the 6 degrees of freedom:
So one function is localCreateRigidBody; you can follow along in the code (we’re looking at TestRig) here:
If mass is not 0, the body is ‘dynamic’. If it’s dynamic, calculate the local inertia of the shape from the mass.
The default motion state uses the startTransform.
The rigid body construction info takes the mass, motion state, shape and local inertia; body then points to the rigid body created using that construction info.
The rigid body is added to the world, and returned.
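In pybullet terms, a rough analogue of localCreateRigidBody might look like this (pybullet computes the inertia from the shape and mass itself, so there is no explicit local-inertia step):

import pybullet as p

def local_create_rigid_body(mass, position, orientation, collision_shape):
    # mass == 0 makes a static (fixed) body; otherwise it is dynamic
    return p.createMultiBody(baseMass=mass,
                             baseCollisionShapeIndex=collision_shape,
                             basePosition=position,
                             baseOrientation=orientation)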
So the TestRig takes the world, a vector position offset, and a boolean of whether it is fixed or not.
The comment says // Setup geometry
Body size is 0.25, leg length is 0.45, and foreleg length is 0.75.
shapes[0] is a capsule shape of ‘body size’ radius and 0.1 scalar height. Then it iterates through the number of legs, adding a leg capsule with 0.1 radius and the leg length, and a foreleg capsule with 0.08 radius and the foreleg length, for each leg.
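In pybullet, the equivalent shape setup would look roughly like this (NUM_LEGS is 6 in the Bullet demo; ours has 4):

import pybullet as p

NUM_LEGS = 6
BODY_SIZE, LEG_LENGTH, FORELEG_LENGTH = 0.25, 0.45, 0.75

p.connect(p.DIRECT)
# shapes[0] is the body capsule; then alternating thigh / foreleg capsules per leg
shapes = [p.createCollisionShape(p.GEOM_CAPSULE, radius=BODY_SIZE, height=0.10)]
for i in range(NUM_LEGS):
    shapes.append(p.createCollisionShape(p.GEOM_CAPSULE, radius=0.10, height=LEG_LENGTH))
    shapes.append(p.createCollisionShape(p.GEOM_CAPSULE, radius=0.08, height=FORELEG_LENGTH))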
Next is // Setup rigid bodies
It just sets up the offset origin.
Next is // root
The root’s vector is set at 0.5 height, and a transform is set up with its origin at root. If the ‘fixed’ variable is set, the root body is created with (mass, startTransform, shape) of 0, offset*transform, and the capsule shape set up just now. If the ‘fixed’ variable is not set, the mass is 1, same place, same shape.
Next is // legs
This bit is more involved. The angle is 2*PI*i/NUM_LEGS, to evenly distribute the legs around the root body. fSin is the sin of that angle (opposite over hypotenuse), and fCos is the cos (adjacent over hypotenuse).
So, set the transform to identity. That just means no rotation and no translation: multiply a vector by the identity and you get the same vector back, I think.
vBoneOrigin is a vector: X: fCos*(bodySize + 0.5*legLength), Y: 0.5, Z: fSin*(bodySize + 0.5*legLength).
Then set the transform’s origin to this vector. So this finds where the joint is between the body and the thigh: the same starting height for all legs, but the X and Z go around in a circle.
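The same placement math, sketched in Python:

import math

NUM_LEGS, BODY_SIZE, LEG_LENGTH = 6, 0.25, 0.45   # the Bullet demo's values

for i in range(NUM_LEGS):
    angle = 2 * math.pi * i / NUM_LEGS
    f_sin, f_cos = math.sin(angle), math.cos(angle)
    # body-to-thigh joint: same height for every leg, X/Z on a circle of
    # radius (body size + half the leg length) around the root
    bone_origin = (f_cos * (BODY_SIZE + 0.5 * LEG_LENGTH),
                   0.5,
                   f_sin * (BODY_SIZE + 0.5 * LEG_LENGTH))
    print(i, bone_origin)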
// thigh
vToBone = (vBoneOrigin - vRoot).normalize()
Normalizing makes a unit vector (changes the length to 1). So Bone Origin minus Root, normalized, is a unit vector pointing from the root towards the new thigh joint.
vAxis = vToBone.cross(vUp) where vUp is a unit vector on the y axis.
Ok, dunno. So… the cross product gives a vector perpendicular to both vToBone and vUp, using the right-hand rule. Both are unit vectors, so it’s a sideways, tangential vector. Ok, but that is called vAxis. AH, ok, so the axis is the freaking axis to rotate around. Duh. And it rotates on that axis by PI/2, by 90 degrees.
The quaternion is just the mathematical machinery for rotating it by 90 degrees about that axis. It must have rotated the transform to point downwards.
The foreleg origin is X: fCos*(bodySize + legLength), Y: height - 0.5*forelegLength, Z: fSin*(bodySize + legLength).
Ok, so we already rotated 90 degrees, and now we have to move the transform to the position for the foreleg joint. The Y is the same for all of them, and the cos and sin trace a circle around the Y axis at the (body + leg) distance. That might just make sense.
// Setup some damping on the m_bodies
setDamping, setDeactivationTime, setSleepingThresholds must be so the robot chills for a bit and acts a bit more earth-like.
Ok, so A is rotated around the Y axis by the current angle, and A’s origin is moved outwards to bodySize along that direction.
OK, so for B… it’s the inverse of the leg’s ‘world transform’, times the body’s ‘world transform’, times A. The body is presumably sitting up straight, A rotates about Y and pushes out to the body’s edge, and multiplying by the inverse of the leg’s world transform expresses that same anchor point in the leg’s local frame.
And then the hinge constraint is set up between the body and the thigh, using frames A and B. Limits are set from -0.75*PI/4 to PI/8, apparently -33.75 degrees to 22.5 degrees.
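Quick sanity check on that conversion:

import math

lower = -0.75 * math.pi / 4   # the demo's lower hinge limit
upper = math.pi / 8           # the demo's upper hinge limit
print(math.degrees(lower), math.degrees(upper))   # -33.75 22.5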