We checked out
https://aliaksandrsiarohin.github.io/first-order-model-website/
to make some deep fakes for fun. Good stuff. Not sure this will be useful for this project.
From a website called robots.com, which is a great URL, but which otherwise has broken links:
The SCARA or Selective Compliant Assembly (or Articulated) Robot Arm robot provides a circular work envelope.
Here’s from allaboutcircuits.com
Articulated vs. SCARA vs. Cartesian Robots
Hmm. interesting. I think we’re doing an Articulated build for the gripper. It will need Z axis movement. But SCARA is cool if you don’t need Z axis (up down).
After checking out the speed of image segmentation on the Raspberry Pi (like one frame every 10 seconds maybe?), and my i3 laptop not being much better, I realised I needed more computing power, at least to train the neural networks. I can probably still ultimately run the neural network on the Pi, but we’ll see.
Looking at computing options, I ultimately went with the $399 NVIDIA Jetson Xavier NX.
Developer Kit Technical Specifications | |
GPU | NVIDIA Volta™ architecture with 384 NVIDIA® CUDA® cores and 48 Tensor cores |
CPU | 6-core NVIDIA Carmel ARM®v8.2 64-bit CPU 6 MB L2 + 4 MB L3 |
DLA | 2x NVDLA Engines |
Vision Accelerator | 7-Way VLIW Vision Processor |
Memory | 8 GB 128-bit LPDDR4x 51.2GB/s |
Storage | microSD (Card not included) |
Video Encode | 2x 4Kp30 | 6x 1080p 60 | 14x 1080p30 (H.265/H.264) |
Video Decode | 2x 4Kp60 | 4x 4Kp30 | 12x 1080p60 | 32x 1080p30 (H.265) 2x 4Kp30 | 6x 1080p60 |16x 1080p30 (H.264) |
Camera | 2x MIPI CSI-2 D-PHY lanes |
Connectivity | Gigabit Ethernet, M.2 Key E (WiFi/BT included), M.2 Key M (NVMe) |
Display | HDMI and DP |
USB | 4x USB 3.1, USB 2.0 Micro-B |
Others | GPIOs, I2C, I2S, SPI, UART |
Mechanical | 103 mm x 90.5 mm x 34 mm |
Also, did you know they made a dystopic reboot retcon of The Jetsons, that 60s retro-futuristic Hanna-Barbera cartoon, in comic form? An ice meteor destroyed Earth. They were lucky to have had a place in space to go, working for Spacely Space Sprockets, Incorporated. (YT link)
It took an hour to set up, and was mostly straightforward, though I had to get a ‘clover plug’ cable, and an SD card.
I used Etcher to load the latest 6GB Jetson Developer Kit SD image, and had a keyboard, mouse, and hdmi monitor that worked. So I was able to enter the wifi SSID and password while setting it up.
I learned that one option for headless installation is to use a USB cable from your computer to the micro-USB input of the Jetson. But ultimately this wasn’t necessary. I ran ifconfig on the Jetson, got an IP address, and connected with ssh.
After needing to change the wifi details, I used the usb cable, then connected with:
ssh chicken@192.168.55.1
sudo nmcli device wifi connect 'SSID' password 'PASSWORD'
Coming back to this later, I attempted the same, but with the Jetson Nano, instead of the Jetson Xavier, and it didn’t work. I learned that the Nano doesn’t come with a Wifi adapter.
I think with the Nano (“B01”), you need a monitor to install. I tried multiple tutorials: ssh’ing to 192.168.55.1, using screen to connect to /dev/ttyACM0 at 115200 baud, nope. Looked at the forums, and it’s complicated. I didn’t try the USB UART because my USB-TTL converter’s cable colours are different.
Another method that worked, with the nano, is plugging the ethernet cable from the Nano directly into the wifi router. It then shows up on the router’s network.
Later, when trying to install a D-Link ‘Wireless N Nano USB Adaptor’ (for the love of God, just get an Edimax, they work out of the box), I connected over ssh with the ethernet cable from the Jetson to the router, downloaded the driver, unzipped and untarred it, and ran `make` and `make install` as per the instructions. Before that, I had to run
export ARCH=arm64
because the build was looking in aarch64. Then I rebooted. Then:
chicken@chicken:~$ sudo nmcli device wifi connect 'ssid' password 'password'
[sudo] password for chicken:
Device 'wlan0' successfully activated with '3a7997e6-c6b1-40f7-bf93-fba5b110282c'.
A lot of research will have to happen again now, though, because NVIDIA has its own software ecosystem. I’ll need a vision solution that is portable to the PI, with the hope that a model or neural net trained on the Jetson will still be able to run on the Raspberry Pi, since it’s 40X cheaper.
The lingo takes some time to get used to, but I believe JetPack is the name for this OS of preinstalled nvidia docs and libraries.
Since last year, everyone has been talking about an algorithm called… a Transformer… which has just recently created a hell of a chat bot, with GPT-3, and which underlies Google search as BERT (Bidirectional Encoder Representations from Transformers).
And there are hybrid convolutional nets and transformers, e.g. DETR, and there is the SOTA from last year, EfficientNet, and then for some instances, or most, YOLOv4 is meant to be the new hot algorithm. It’s bigger than YOLOv3. Wait, so it’s more frames per second, and the accuracy (AP) is kinda so-so, at 5% less. I realised YOLOv5, which I had seen, and which is a PyTorch implementation, is faster, though it’s technically just someone being a bit of a douchebag and calling his implementation of the original author’s peer-reviewed work the next version, YOLOv5. So what now?
YOLOv4 vs. YOLOv5
So, PyTorch vs. Tensorflow ( vs. TensorRT),
NVIDIA has this 3d simulator environment in Unreal Engine! Isaac. Something like an API for robots, by NVIDIA. They got this robot working with it, apparently.
It’s actually pretty good. I wonder if this https://developer.nvidia.com/deepstream-sdk is as cool as it sounds. Ah, closed source. Of course. But I can apply to join. Eh maybe.
So, I want to get these chickens into a convolutional neural network, or a transformer and output a pretty picture. I want the colour masks, not the bounding boxes.
I don’t want to get too caught up in proprietary NVIDIA-specific APIs, even if they have an Unreal Engine simulator. But it might be worth checking out. DeepStream is built on GStreamer, which is open source, so maybe it’s back on the menu.
But it’s a whole integrated thing. “The DeepStream SDK can be used to build end-to-end AI-powered applications to analyze video and sensor data. Some popular use cases are: retail analytics, parking management, managing logistics, robotics, optical inspection and managing operations.”
Nice. DeepStream supports several popular networks out of the box such as YOLO, FasterRCNN, SSD, RetinaNet and MaskRCNN.
I get the sense that Gems in the DeepStream world of Isaac, are like, ROS nodes, offering services on a port. ORB is a Gem. Ultimately, a prediction, or reconstruction in 3d, of the shape of objects in the world, would be ideal. I’m only doing the colour map stuff because the colours are nice, and it looks more impressive. But ultimately I will need to pick the best tool for the job.
NVIDIA also has DIGITS, Deep Learning GPU Training System (DIGITS) … puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train the highly accurate deep neural network (DNNs) for image classification, segmentation and object detection tasks.
So as you can see, there’s more to find out. But ultimately I will probably have to repeat the task of getting labelled data into folders, and having the labels in the right format. Then generating the TFRecords, or doing whatever you do in PyTorch, I’m still biased to the TensorFlow ODI 2 implementation, because Google’s got the best dataset of chickens.
But lots to check out from NVIDIA.
The revelation of GStreamer seems to be one of the big wins here. It looks like an Apache Camel for video pipelines.
One of the main decisions is how to train the Vision. We have an NVIDIA Jetson NX now, which can work on training in the background.
We will try Tensorflow 2 first, and if training is slow, we can try TensorFlow with TensorRT (TF-TRT).
But we’re starting from scratch. As the title suggests, we’re going to try get U-Net working. A neural network shaped like a U, for instance segmentation.
So, dev environment with virtual environments and pip? or Docker?
Let’s try Docker first. Some instructions here and here…
https://github.com/NVIDIA/nvidia-docker
https://www.tensorflow.org/install/docker
docker pull tensorflow/tensorflow:latest-gpu-jupyter
or
... # latest release w/ GPU support and Jupyter
#ok but we need NVIDIA container kit on the host:
sudo apt-get install curl
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
For the Jetson, we need to install NVIDIA container kit to get access to the host’s GPU.
Ok going for this one…
sudo docker pull tensorflow/tensorflow:2.4.1-gpu-jupyter
I prefer tagged versions to ‘latest’ because they’re probably more stable.
Working from Jupyter Notebook will be a good way to preserve the code, and if we can use Docker, let’s do that, because containers are easier to deal with, usually, than virtual python environments on a host. We’ll leave this for now because we need to prepare the data.
In the meantime, I need to redo the OID (Open Images) download with bounding boxes or segmentation mask info. Let’s go straight for segmentation, using the method we tried before.
Need dev setup basics. give me some curl and some pip3.
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
python3 get-pip.py
pip install openimages
WARNING: The script wheel is installed in ‘/home/chicken/.local/bin’ which is not on PATH.
ok…
export PATH="/home/chicken/.local/bin:$PATH"
and again… pip install openimages
So we download some files with mask file names
wget https://storage.googleapis.com/openimages/v5/test-annotations-object-segmentation.csv
wget https://storage.googleapis.com/openimages/v5/validation-annotations-object-segmentation.csv
wget https://storage.googleapis.com/openimages/v5/train-annotations-object-segmentation.csv
I tried v6 in that URL, but nope. Whatever.
mkdir OID
mkdir OID/v6
cd OID/v6
mkdir csv
mkdir csv/full
mkdir images
mkdir images/Chicken
mkdir images/Chicken/train
mkdir images/Chicken/test
mkdir images/Chicken/validation
mkdir masks
mkdir masks/Chicken
mkdir masks/Chicken/train
mkdir masks/Chicken/test
mkdir masks/Chicken/validation
mkdir recordsTf
mkdir recordsTf/Chicken
mkdir recordsTf/Chicken/test
mkdir recordsTf/Chicken/train
mkdir recordsTf/Chicken/validation
Ok new website page. https://storage.googleapis.com/openimages/web/download.html
Ok seems like Google’s links are still using v5, so let’s stick with v5.
Need some egrep to find the related images.
egrep '/m/09b5t' csv/full/test-annotations-object-segmentation.csv | egrep -o '^[0-9a-f]*' > csv/chicken-test-images-ids.txt
egrep '/m/09b5t' csv/full/validation-annotations-object-segmentation.csv | egrep -o '^[0-9a-f]*' > csv/chicken-validation-images-ids.txt
egrep '/m/09b5t' csv/full/train-annotations-object-segmentation.csv | egrep -o '^[0-9a-f]*' > csv/chicken-train-images-ids.txt
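The same filter can be sketched in Python with the standard csv module. It reads the ImageID column directly instead of relying on the ID being the first thing on each line. This is only a sketch: the function name image_ids_for_label and the three-row sample are made up for illustration, while the column names are the ones in the annotation CSV's header row and /m/09b5t is the Chicken label.

```python
import csv
import io

CHICKEN = "/m/09b5t"  # Open Images label ID for Chicken

def image_ids_for_label(csv_file, label=CHICKEN):
    """Return the sorted, de-duplicated ImageIDs that have a mask for `label`."""
    reader = csv.DictReader(csv_file)
    return sorted({row["ImageID"] for row in reader if row["LabelName"] == label})

# Tiny made-up sample in the same column layout as the real annotation files.
sample = io.StringIO(
    "MaskPath,ImageID,LabelName,BoxID,BoxXMin,BoxXMax,BoxYMin,BoxYMax,PredictedIoU,Clicks\n"
    "aaaa_m09b5t_1.png,aaaa,/m/09b5t,1,0,1,0,1,0.9,\n"
    "aaaa_m09b5t_2.png,aaaa,/m/09b5t,2,0,1,0,1,0.9,\n"
    "bbbb_m04yx4_3.png,bbbb,/m/04yx4,3,0,1,0,1,0.9,\n"
)
ids = image_ids_for_label(sample)
print(ids)
```

For the real files, the same function should work on a file opened with open(path, newline=""). One image with several masks only shows up once, which the egrep version does not guarantee.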
Now feed this into a downloader program. We could use the suggested downloader.py script, but I liked this bash function method. The downloader.py needs each ID prefixed with its directory name, which is a bit annoying. In Linux, you’d need to use sed to put the directory names in front of every line.
function getTestImages { echo wget $2 -O images/Chicken/test/$1.jpg >> csv/gettestimages.sh; }
export -f getTestImages
csvtool call getTestImages csv/test-images-urls.csv
bash csv/gettestimages.sh
function getValidationImages { echo wget $2 -O images/Chicken/validation/$1.jpg >> csv/getevaluationimages.sh; }
export -f getValidationImages
csvtool call getValidationImages csv/validation-images-urls.csv
bash csv/getevaluationimages.sh
function getTrainImages { echo wget $2 -O images/Chicken/train/$1.jpg >> csv/gettrainimages.sh; }
export -f getTrainImages
csvtool call getTrainImages csv/train-images-urls.csv
bash csv/gettrainimages.sh
This is a surprisingly epic task, all of this. Lots of Flickr accounts have closed, it seems, since 2018. Lots of 404s.
But ultimately quite a few pics of chickens:
2.3G ./images/Chicken/train
88M ./images/Chicken/validation
323M ./images/Chicken/test
2.7G ./images/Chicken
Now I need the PNG files that are the masks for these images.
It seems like these are the 16 zip files.
wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-0.zip through 16. Oh but it goes 0-9, then A-F.
So, ok how to automate this? bash or perl or python? ok..
for i in {0..9}; do wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-$i.zip; done
well good enough automation for now. if I used hex maybe I can loop 1..F in bash. Let’s compromise. I could have copy pasted in this time.
for i in {'a','b','c','d','e','f'}; do wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-$i.zip; done
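If Python is on the menu, the hex suffixes can be generated in one go instead of two loops. A small sketch, assuming the URL pattern used in the wget commands above; mask_zip_urls is a hypothetical helper name, and it only builds the URLs, it does not download anything.

```python
BASE = "https://storage.googleapis.com/openimages/v5"

def mask_zip_urls(split="train"):
    """Build the 16 mask archive URLs (suffixes 0-9, then a-f) for one split."""
    return [
        f"{BASE}/{split}-masks/{split}-masks-{digit}.zip"
        for digit in "0123456789abcdef"
    ]

urls = mask_zip_urls("train")
for url in urls:
    print(url)
```

The printed list can be saved to a file and handed to wget -i.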
They’re 262MB each file.
unzip '*.zip'
2686684 files… yikes
ok i need to find the PNG masks associated with the JPG images. I can work this out but I am flying blind. Chicken is /m/09b5t –
ls -l | grep 09b5t
ls -l | grep 09b5t | wc -l
shows 2237 masks for Chickens. But we only have 1324 images of Chickens.
Ok I need to see pics on the Jetson. Ultimately an RDP (remote desktop protocol) would be best? VNC server is an old code but it checks out. Followed these instructions, and connected to 192.168.101.109:5901
Nope. It’s comically small at 640×480.
Ok but yeah I guess I just wanted to see the pictures. But this isn’t really necessary yet, or practical over VNC. I want to verify that the PNG mask corresponds to the JPG image contents. I’ll probably use a Jupyter Notebook ultimately. (I do end up using Jupyter Lab.)
We’re configuring Tensorflow 2 or PyTorch to train some convolutional network with this segmentation data.
The mappings are in these files:
train-annotations-object-segmentation.csv
test-annotations-object-segmentation.csv
validation-annotations-object-segmentation.csv
It’s got the mappings, and some extra factoids about where the Google data entry annotator people clicked with their wand selection tool, and a “Predicted IoU”, which is a big topic. We should hopefully only need the image to segmentation file mapping.
MaskPath: name of the corresponding mask image.
ImageID: the image this mask lives in.
LabelName: the MID of the object class this mask belongs to.
BoxID: an identifier for the box within the image.
BoxXMin, BoxXMax, BoxYMin, BoxYMax: coordinates of the box linked to the mask, in normalized image coordinates. Note that this is not the bounding box of the mask, but the starting box from which the mask was annotated. These coordinates can be used to relate the mask data with the boxes data.
PredictedIoU: if present, indicates a predicted IoU value with respect to ground-truth. This quality estimate is machine-generated based on human annotator behaviour. See [3] for details.
Clicks: if present, indicates the human annotator clicks, which provided guidance during the annotation process we carried out (See [3] for details). This field is encoded using the following format: X1 Y1 T1;X2 Y2 T2;X3 Y3 T3;... Xi Yi are the coordinates of the click in normalized image coordinates. Ti is the click type: value 0 indicates the annotator marks the point as background, value 1 as part of the object instance (foreground). These clicks can be interesting for researchers in the field of interactive segmentation. They are not necessary for users interested in the final masks only.
Ok it’s the same name. Easy enough.
MaskPath,ImageID,LabelName,BoxID,BoxXMin,BoxXMax,BoxYMin,BoxYMax,PredictedIoU,Clicks
677c122b0eaa5d16_m04yx4_9a041d52.png,677c122b0eaa5d16,/m/04yx4,9a041d52,0.8875,0.960938,0.454167,0.720833,0.86864,0.95498 0.65197 1;0.89370 0.56579 1;0.94701 0.48968 0;0.91049 0.70010 1;0.93927 0.47160 1;0.90269 0.56068 0;0.92061 0.70749 0;0.92509 0.64628 0;0.92248 0.65188 1;0.93042 0.46071 1;0.93290 0.71142 1;0.94431 0.48783 0
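We don't need the clicks, but the encoding is simple enough to unpack if curiosity wins. A minimal sketch of a parser for the Clicks format described above; parse_clicks is a made-up name, and the input is the first three clicks of the sample row.

```python
def parse_clicks(clicks):
    """Parse 'X1 Y1 T1;X2 Y2 T2;...' into a list of (x, y, is_foreground) tuples."""
    points = []
    for chunk in clicks.split(";"):
        if not chunk.strip():
            continue  # tolerate an empty field or a trailing ';'
        x, y, t = chunk.split()
        points.append((float(x), float(y), t == "1"))
    return points

# First three clicks from the sample row above.
clicks = parse_clicks("0.95498 0.65197 1;0.89370 0.56579 1;0.94701 0.48968 0")
print(clicks)
```

Coordinates stay normalized (0 to 1), so they would need multiplying by the image width and height before drawing them on a picture.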
We have our images downloaded…
Ok the masks folder is too big though. Let’s just do Chicken, ok? So we’ll delete any PNGs that don’t have m09b5t in their filename. And delete these zip files.
find . -type f -print0 | xargs --null grep -Z -L 'm09b5t' | xargs --null rm
Lol that deleted everything. Oops. Don’t do that. Ok download again…
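What likely went wrong: grep -L lists files whose contents don't match the pattern, and it never looks at the file name, so every PNG qualified for deletion. A safer approach tests the name only, and prints the candidates before deleting anything. A minimal Python sketch of that idea; non_chicken_files is a hypothetical helper, and it runs against a throwaway temp directory here rather than the real masks folder.

```python
import tempfile
from pathlib import Path

def non_chicken_files(folder, marker="m09b5t"):
    """Return PNG files under `folder` whose *name* lacks the Chicken marker."""
    return sorted(p for p in Path(folder).rglob("*.png") if marker not in p.name)

with tempfile.TemporaryDirectory() as tmp:
    # Two made-up mask names: one Chicken, one not.
    for name in ("aaaa_m09b5t_1.png", "bbbb_m04yx4_2.png"):
        (Path(tmp) / name).touch()

    doomed = non_chicken_files(tmp)
    print([p.name for p in doomed])  # dry run: inspect this list before deleting

    for p in doomed:
        p.unlink()
    remaining = sorted(p.name for p in Path(tmp).iterdir())
    print(remaining)
```

Because it only matches *.png, the zip files are left alone too.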
We’ll process zip files one at a time.
unzip train-masks-0.zip -d ./masks
(1 minute passes)
cd masks
find \! -name '*m09b5*png' -delete
(30 seconds)
mv * ../Chicken
1…2….3…
OK unzipstuff.sh
I automated the process.
chicken@jetson:~/OID/v6$ cat unzipstuff.sh
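#!/bin/bash
# train-masks-0.zip was already processed by hand above
for i in 1 2 3 4 5 6 7 8 9 a b c d e f
do
  unzip train-masks-$i.zip -d masks/
  cd masks
  # delete every file that is not a Chicken (/m/09b5t) mask
  find . -type f ! -name '*m09b5*png' -delete
  mv /home/chicken/OID/v6/masks/* /home/chicken/OID/v6/Chicken
  cd ..
done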
I need to display the information somehow. Jupyter Lab (Notebooks) are probably the best way to display code, and run it interactively.
chicken@jetson:~$ jupyter notebook --generate-config
Writing default config to: /home/chicken/.jupyter/jupyter_notebook_config.py
chicken@jetson:~$ jupyter-lab
Ok so I wasn’t sure why I couldn’t connect to the server on the Jetson, but I’m able to run it at http://localhost:8888/ through an SSH tunnel.
ssh -L 8888:127.0.0.1:8888 chicken@192.168.101.109
I’m not sure what the difference between Lab and Notebook is, exactly, yet, either. But I think Notebook is a subset of Lab.
Ok so I’m trying to match JPGs and PNGs. Some interesting data, with multiple masks for some images, and no masks for some images.
I set up SAMBA to copy files over and investigate.
I see. The disturbing part is that no images in my test and validation folders matched any masks. But all of the train images had a match…
OH. train, validation and test ALL have their own 16 zip files of masks.
Good thing I automated that… ok so same thing, but changing ‘train’ to the ‘validation’ and ‘test’.
I did a programmatic test on the directories to see if any images were missing a mask:
import glob
import os

# print every test image that has no matching mask file
for fname in os.listdir(test_images_dir):
    if len(glob.glob(test_masks_dir + "*" + fname[:-4] + "*")) == 0:
        print(fname)
It’s looking better. Still some missing, but good enough now. Missing 6 validation masks, and 12 test masks. All training images have at least one mask
Number of Train images: 1122
Number of Train masks: 2237
Number of validation images: 44
Number of validation masks: 59
02a0f2858f27a7ba.jpg
01463f5494340d3d.jpg
00e71a70a2f669ff.jpg
05887f57bc232041.jpg
0d3da02e79f84dde.jpg
0ed7092c41c81d14.jpg
Number of test images: 154
Number of test masks: 186
0e9be8b09f71f909.jpg
0913fbf6fa5c190e.jpg
0f8a38312499d209.jpg
0650a130d7f707b5.jpg
0a8a5aa471796fd5.jpg
0cc4722ca906f86c.jpg
04423d3f6f5b8e74.jpg
03bc7fbc956b3a9a.jpg
07621394c8ad0b47.jpg
000411001ff7dd4f.jpg
0e5ecc56e464dcb8.jpg
05600e8a393e3c3a.jpg
I’ll move these ones out of the folder.
mkdir ~/backup
cd /home/chicken/OID/v6/images/Chicken/validation/
mv 02a0f2858f27a7ba.jpg ~/backup
mv 01463f5494340d3d.jpg ~/backup
mv 00e71a70a2f669ff.jpg ~/backup
mv 05887f57bc232041.jpg ~/backup
mv 0d3da02e79f84dde.jpg ~/backup
mv 0ed7092c41c81d14.jpg ~/backup
cd /home/chicken/OID/v6/images/Chicken/test/
mv 0e9be8b09f71f909.jpg ~/backup
mv 0913fbf6fa5c190e.jpg ~/backup
mv 0f8a38312499d209.jpg ~/backup
mv 0650a130d7f707b5.jpg ~/backup
mv 0a8a5aa471796fd5.jpg ~/backup
mv 0cc4722ca906f86c.jpg ~/backup
mv 04423d3f6f5b8e74.jpg ~/backup
mv 03bc7fbc956b3a9a.jpg ~/backup
mv 07621394c8ad0b47.jpg ~/backup
mv 000411001ff7dd4f.jpg ~/backup
mv 0e5ecc56e464dcb8.jpg ~/backup
mv 05600e8a393e3c3a.jpg ~/backup
Ok and now all the images have masks!
Number of Train images: 1122
Number of Train masks: 2237
Number of validation images: 38
Number of validation masks: 59
Number of test images: 142
Number of test masks: 186
Momentous. Looking at the nicolas windt article, there might be some dead links. So let’s delete those images too.
find -size 0 -delete
Number of Train images: 982
Number of Train masks: 2237
Number of validation images: 32
Number of validation masks: 59
Number of test images: 130
Number of test masks: 186
Oof, still good. Let’s load a picture in Jupyter. Ok tensorflow has a loadimage function.
No module named 'tensorflow'
Right. We tried installing it with Docker. How will that even work? Eish, gotta read up on this.
Ok I already downloaded an NVIDIA-friendly tensorflow 3 weeks ago. Well, things move slowly, but all incremental gains move things forward. With experience you learn ways not to do things.
chicken@jetson:~/OID/v6/images$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
tensorflow/tensorflow 2.4.1-gpu-jupyter 64d8717296f8 3 weeks ago 5.71GB
dustynv/jetson-inference r32.5.0 ccc2a5f19dad 3 weeks ago 2.89GB
nvidia/cuda 11.0-base 2ec708416bb8 5 months ago 122MB
Ok the TF2 instructions say…
Start a GPU container, using the Python interpreter.
$ docker run -it --rm -v $(realpath ~/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-jupyter
Run a Jupyter notebook server with your own notebook directory (assumed here to be ~/notebooks). To use it, navigate to localhost:8888 in your browser. So…
$ docker run -it --rm -v ~/notebooks:/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.4.1-gpu-jupyter
Error...
standard_init_linux.go:211: exec user process caused "exec format error"
And pip?
chicken@jetson:~$ pip3 install tensorflow
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement tensorflow
ERROR: No matching distribution found for tensorflow
Great. Sanity check…
docker run -it --rm tensorflow/tensorflow bash
standard_init_linux.go:211: exec user process caused "exec format error"
Ok. Right, Jetson is aarch64, not x86-64… so google is suggesting Archiconda. This is too much for now. What’s wrong with pip? Python 3.6.9 is supposed to work with TF2.4.1 https://pypi.org/project/tensorflow/ hmm i guess there’s just no aarch64 version of TF2 precompiled.
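The “exec format error” is what you get when the binaries inside an image were built for a different CPU architecture than the host. A rough sketch of that check in Python: image_will_run is a hypothetical helper, the name mapping covers only the two architectures in play here, and it ignores emulation setups that can run foreign binaries anyway.

```python
import platform

# Docker's platform names differ from what `uname -m` reports.
DOCKER_ARCH = {"x86_64": "amd64", "aarch64": "arm64"}

def image_will_run(image_arch, machine=None):
    """True if an image built for `image_arch` matches this machine's CPU."""
    machine = machine or platform.machine()
    return DOCKER_ARCH.get(machine, machine) == image_arch

print(platform.machine())                  # e.g. 'aarch64' on a Jetson
print(image_will_run("amd64", "aarch64"))  # an amd64-only image on the Jetson
```

An image's architecture shows up in the output of docker image inspect, so the mismatch can be spotted before trying to run anything.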
So… one option is switch to PyTorch. Other option is try archiconda. I’m going to try this: https://ngc.nvidia.com/catalog/containers/nvidia:l4t-ml
“The Machine learning container contains TensorFlow, PyTorch, JupyterLab, and other popular ML and data science frameworks such as scikit-learn, scipy, and Pandas pre-installed in a Python 3.6 environment. Get started on your AI journey quickly on Jetson with everything pre-installed in this container.”
docker pull nvcr.io/nvidia/l4t-ml:r32.5.0-py3
sudo docker run -it --rm --runtime nvidia --network host -v /home/chicken/OID:/opt/OID -v /home/chicken/notebooks:/opt/notebooks nvcr.io/nvidia/l4t-ml:r32.5.0-py3
ok now we’re cooking. (No chickens were cooked during the making of this.)
So now I’m back on track, at like step 0.
I’m working off the Keras U-Net code now, from https://keras.io/examples/vision/oxford_pets_image_segmentation/ because it’s one of the simplest CNNs out there, from 2015. I’ve also opened up another implementation because it has more useful examples for training.
Note though that due to U-Net’s simplicity, it is often used for medical computer vision applications, since there’s not so much deep learning magic going on. You can quite easily imagine the latent representation dwelling somehow, at the bottom of the U shaped neural network. It should give us something interesting.
Let’s find the latent representation of a chicken.
We need to correlate the images and masks. We can glob by file name, which is probably as good as anything. But we should probably put it in arrays of arrays or something: one image, many masks. So, a map from an image filename to a list of mask filenames. Python calls maps ‘dictionaries’.
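A minimal sketch of that dictionary, assuming mask files are named ImageID_label_BoxID.png as in the annotation CSV. masks_by_image is a made-up helper and the file names below are invented; in the notebook the two lists would come from os.listdir on the image and mask folders.

```python
def masks_by_image(image_names, mask_names):
    """Map each image file name to the list of mask file names sharing its ID."""
    # Index images by their ID, i.e. the file name without the extension.
    image_ids = {name.rsplit(".", 1)[0]: name for name in image_names}
    mapping = {name: [] for name in image_names}
    for mask in sorted(mask_names):
        image_id = mask.split("_", 1)[0]  # masks look like <ImageID>_<label>_<BoxID>.png
        if image_id in image_ids:
            mapping[image_ids[image_id]].append(mask)
    return mapping

images = ["aaaa.jpg", "cccc.jpg"]
masks = ["aaaa_m09b5t_2.png", "aaaa_m09b5t_1.png", "bbbb_m09b5t_3.png"]
mapping = masks_by_image(images, masks)
print(mapping)
```

Images with an empty list are the ones missing a mask, and masks whose image was never downloaded are simply skipped.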
Ok amazing, that works. I can see image and mask, and they correspond.
At some point I need to transform these. Make them all 256×256 pixels or something like that. Hmm.
OK, I got the training running. I got the Jetson like a month ago now, probably.
Had to reduce the batch size and epoch size, to get rid of an Out of Memory error. Then had a sort of browser freeze.
I should really run a script like this, instead:
nohup python3 train.py &
but instead i’m hoping i can run it in Jupyter and it just follows the execution, and doesn’t freeze up. Maybe if I remove some debugging text…
But the loss function wasn’t going anywhere, even after 50 epochs, overnight. The mask prediction is just all black.
And I need to restart the Docker to open the tensorboard port
For Docker users: In case you are running a Docker image of Jupyter Notebook server using TensorFlow’s nightly, it is necessary to expose not only the notebook’s port, but the TensorBoard’s port. Thus, run the container with the following command:
docker run -it -p 8888:8888 -p 6006:6006 \
tensorflow/tensorflow:nightly-py3-jupyter
or in my case,
sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/OID:/opt/OID -v /home/chicken/notebooks:/opt/notebooks nvcr.io/nvidia/l4t-ml:r32.5.0-py3
hmm the python 'magic' is not working
Ok so I ran tensorboard inside the docker terminal, instead of in the notebook. (You can do that by checking the container ID of 'docker ps' and calling 'docker exec -it <ID> bash')
python3 -m tensorboard.main --logdir=/opt/notebooks/logs
from tensorboard import notebook
import datetime
#%load_ext tensorboard
%reload_ext tensorboard
%tensorboard --logdir /opt/notebooks/logs
notebook.list()
notebook.display(port=6006, height=1000)
ok yeah so my ML model didn't learn shit.
Also apparently they don't have tensorflow 2 in this nvidia ML docker container.
root@jetson:/opt/notebooks/logs# pip3 show tensorflow
Name: tensorflow
Version: 1.15.4+nv20.11
So how to debug? The images are converted to an n-dimensional array.
Got array with shape: (4, 256, 256, 1)
Ok things are going weird now, almost as I notice the TF version. It must be getting late.
Next day: Ok Nvidia has a TF2 docker, and it shares about half the layers with the other docker, so that’s cool: nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3
But it doesn’t have jupyter installed. Maybe I can copy the relevant bits from the Dockerfile. I’ve tried installing Jupyter and committing the docker, but “Failed building wheel for cffi”, some aarch64 issue.
RUN apt-get update && apt-get install -y libffi6 libffi-dev
Hard to find the nvidia docker files, and they only have l4t-base available.
#
# JupyterLab Dockerfile bits
#
RUN pip3 install jupyter jupyterlab --verbose
#RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager@2
RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"
CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
    echo "allow 10 sec for JupyterLab to start @ http://localhost:8888 (password nvidia)" && \
    echo "JupterLab logging location: /var/log/jupyter.log (inside the container)" && \
    /bin/bash
- from https://github.com/dusty-nv/jetson-containers/blob/master/Dockerfile.ml
ok sweet jeebus, after a big detour, i am using this successfully.
chicken@jetson:~$ cat Dockerfile
FROM docker.io/datamachines/jetsonnano-cuda_tensorflow_opencv:10.2_2.3_4.5.1-20210218
RUN pip3 install jupyter jupyterlab --verbose
RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"
EXPOSE 6006
EXPOSE 8888
CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
    echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
    echo "JupterLab logging location: /var/log/jupyter.log (inside the container)" && \
    /bin/bash
chicken@jetson:~$ sudo docker build -t nx_setup .
chicken@jetson:~$ sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/:/dmc nx_setup
finally. So, back to tensorflow, and running U-Net!
So, maybe I see a problem with the semantic segmentation, possibly, which is related to chickens being a category among other things, rather than a binary chickeness and non-chickenness :
SparseCategoricalCrossentropy class: “Use this crossentropy metric when there are two or more label classes.”
I only have one class. Chicken. So that won’t work. I need an egg dataset. Luckily this implementation has an example of an eye, and the veins, and that is why we want the U-Net, for the egg anomaly detection.
The problem’s symptom is that nothing is being learned during training. So maybe I’m using the wrong loss function.
I need to review instance segmentation “options”.
The loss function is currently measuring “the crossentropy metric between the labels and predictions.”
The reason I want instance segmentation is to differentiate between chickens, where possible. Panoptic segmentation actually makes the most sense for this project.
Panoptic segmentation uses a semantic network and an instance network, and uses them both, to deliver something like (“cat”,0), (“cat”,1), (“cat”,3)
…
COCO Panoptic API looks great, but it seems to need json to describe all of the PNG images. Bounding boxes seems unnecessary but COCO needs bounding boxes data.
We’ll start a new post on Panoptic Segmentation using COCO, and get back to Tensorflow 2 for U-Net, for semantic segmentation, when training on lit up eggs for in ovo sexing.
…
Update after a hiatus: I see a recent nnU-Net advancement… It’s a meta modelling process evolution thing. “self-configuring” for biomedical imaging. Hmm. Very interesting.
We’re not there yet. We just want to get a basic U-Net working.
I see too, Perceptilabs from W&B is released and they have some beautiful screenshots too, though not available on pip3 yet for aarch64. So it’s not an option at the moment.
So, for reminder, in this post, we’re trying to get basic U-Net segmentation working. Here’s a good explanation of it.
…
I’ve found another implementation of U-Net that seems a bit more plug and play. There is also a useful note here regarding U-Net and the number of classes. https://github.com/karolzak/keras-unet/issues/3
(173, 512, 512, 3) (173, 512, 512) vs (30, 512, 512) (30, 512, 512)
One of their notebooks looks like a promising notebook, the kz-isbi-challenge.py, and I rigged it to run on my data, and I get OOM. Out of Memory. But this is jupyter lab. Let’s not train it in jupyter lab. Seems like a bad idea. Like a common problem that there’s probably a solution to, but where the solution is probably, ‘use python, dumbass’ So, converted to py, and edited. Had to take out all the plotting code. Pity. But same problem.
I found a jetson-stats https://github.com/rbonghi/jetson_stats jtop program and though it only showed 6.2GB/8GB of RAM the whole time, (I wasn’t even using up all the RAM?), it did remind me that i’m in a Docker, and maybe I’m not using swap space, and that 8GB is probably not enough RAM for a conv net. The U-Net had 31 million params.
Trainable params: 31,030,593
ResourceExhaustedError: OOM when allocating tensor with shape[32,128,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[node functional_1/concatenate_3/concat (defined at <ipython-input-26-51303ee95255>:7) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
[Op:__inference_test_function_3292]
Hmm. Well, about the docker swap space, docker will use the resources it can, on the host, which is gonna be just a bit less than whatever the host can handle. So when it crashed, It appears to me that it’s trying to load gpu memory, and only has 400MB or so.
2021-06-07 19:06:35.653219: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] total_region_allocated_bytes_: 404856832 memory_limit_: 404856832 available bytes: 0 curr_region_allocation_bytes_: 809713664 2021-06-07 19:06:35.653456: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Stats: Limit: 404856832 InUse: 395858688 MaxInUse: 404771584 NumAllocs: 540 MaxAllocSize: 69172736 Reserved: 0 PeakReserved: 0 LargestFreeBlock: 0
So that was the advice from the repo author, that you should check your threads to see if they’ve allocated memory already, leaving none for other processes. (top or ps -ef) to see processes running.
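A bit of arithmetic on the numbers in those logs helps here. This is a rough sketch that only counts bytes: it assumes 4 bytes per float32 element, and that the leading 32 in the tensor shape is the batch size (my guess, the log does not say so).

```python
def tensor_bytes(shape, bytes_per_element=4):
    """Size in bytes of a dense tensor, assuming float32 (4 bytes per element)."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element

oom_shape = (32, 128, 256, 256)   # shape from the ResourceExhaustedError
allocator_limit = 404856832       # memory_limit_ from the bfc_allocator log
trainable_params = 31030593       # from the model summary

needed = tensor_bytes(oom_shape)
weights = trainable_params * 4

print("one activation tensor:", needed, "bytes =", needed / 2**30, "GiB")
print("all the weights:      ", weights, "bytes")
print("allocator limit:      ", allocator_limit, "bytes")
```

So that single concatenate output wants 1 GiB on its own, against an allocator limit of about 386 MiB, while the 31 million weights only take about 124 MB. If the 32 really is the batch size, a smaller batch would shrink that tensor proportionally.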
After killing jupyter, I left it training overnight, on 300 training images and masks, from our chicken dataset, and it ran out of memory. But it looks like it finished training before it crapped out, and this time, the Out of Memory (OOM) error had some bigger numbers.
2021-06-08 08:15:21.038084: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] total_region_allocated_bytes_: 1400856576 memory_limit_: 1400856576 available bytes: 0 curr_region_allocation_bytes_: 2801713152
2021-06-08 08:15:21.038151: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Stats: Limit: 1400856576 InUse: 616462592 MaxInUse: 1400851712 NumAllocs: 37528 MaxAllocSize: 1280887296 Reserved: 0 PeakReserved: 0 LargestFreeBlock: 0
And you can see the loss was decreasing. That's cool.
So that third, ghostly column, is the one we're watching. I think it's just not very good yet. But maybe I don't understand what it's doing, exactly, either. I am expecting that when I'm done here, it should be able to make the mask, from just the image.
The loss functions I’ve used have been,
model.compile(
    optimizer=Adam(),
    loss='binary_crossentropy',
    metrics=[iou, iou_thresholded]
)
and
model.compile(
    optimizer=SGD(lr=0.01, momentum=0.99),
    loss=jaccard_distance,
    metrics=[iou, iou_thresholded]
)
So that was training with the second one, last night. I will continue with it for now. Jaccard distance is, union minus intersection, over union. Sounds good to me. Optimising, using Stochastic Gradient Descent, with some hyperparameters.
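To make "union minus intersection, over union" concrete, here is a minimal NumPy sketch on two tiny binary masks. It is the plain set version, not the library's loss function, which is presumably a smoothed, differentiable variant that works on probabilities.

```python
import numpy as np

def jaccard_distance_sets(mask_true, mask_pred):
    """Jaccard distance between two binary masks: (union - intersection) / union."""
    a = np.asarray(mask_true, dtype=bool)
    b = np.asarray(mask_pred, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        return 0.0  # convention chosen here: two empty masks count as identical
    intersection = np.logical_and(a, b).sum()
    return float(union - intersection) / float(union)

truth = np.array([[1, 1, 0, 0]])
guess = np.array([[0, 1, 1, 0]])
distance = jaccard_distance_sets(truth, guess)  # union is 3 pixels, intersection is 1
print(distance)
```

A distance of 0 means the masks overlap perfectly, and 1 means they do not overlap at all. It is one minus the IoU that shows up in the metrics.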
Let’s leave it training again. I’m also upping the ratio between training and validation data, from 50/50 to 80/20. why not.
Also, the code we had before, for the first U-Net attempt, in the ‘Chicken Vision.py’ notebook, seemed more memory efficient, because it was lazy loading the images. But maybe much of a muchness. We’ll see, perhaps.
So training isn’t working anymore, it seems.
W tensorflow/core/kernels/gpu_utils.cc:49] Failed to allocate memory for convolution redzone checking; skipping this check. This is benign and only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.
Followed by OOM. Benign.
Stats: Limit: 1403920384 InUse: 650411520 MaxInUse: 1403915520 NumAllocs: 37625 MaxAllocSize: 1266649600 Reserved: 0 PeakReserved: 0 LargestFreeBlock: 0
Ok we might need a cloud gpu. Jetson NX not cutting it.
From a while later, after cloud gpus, it is worth noting that there is a weed detection U-Net using two different loss functions, Dice loss, and ‘Focal Tversky loss’, and only has a 19,667 parameter NN. That’s orders of magnitude smaller, so I might want to come back and see how.
arxiv paper: https://arxiv.org/pdf/1801.00868.pdf
Panoptic segmentation picks an instance segmentation algorithm and a semantic segmentation algorithm.
Some notable papers are listed here, with the benchmarks of the best related githubs, https://paperswithcode.com/task/panoptic-segmentation
For example,
The architecture of the neural network has two pyramids, one for semantics (classes), and one to count the instances
After much circular investigation, i arrived at the notion that transfer learning from a pre-trained network, with the ‘fine tuning’ referring to adding a new class, is the way to go.
But we’re still suffering from not having found an example using PNG mask files. I can convert to COCO, and that might be what I do yet, because, like the dataset had their own Panoptic segmentation challenge and format. https://cocodataset.org/#panoptic-eval They seem to be winning this race. We’ll do COCO.
It will mostly involve writing or exporting info into json format, and following some terse, ambiguous instructions.
Another thing is that COCO wants bounding boxes too. So this will be an exercise in config generation to satisfy the COCO format requirements. I have the data from Open images, but COCO looks like the biggest game in town.
Then for algorithm, there’s numerous Pytorch libraries, especially a very relevant one, YOLACT Edge, using a ‘Darknet’ architecture, which is an old “Open Source Neural Networks in C”
Hmm. It’s more instance segmentation than panoptic, but looks like a good compromise, to aim for.
https://github.com/haotian-liu/yolact_edge – It uses bounding boxes, so what will I do with all these chicken masks?
YOLACTEdge arxiv paper
Otherwise, the tensorflow object detection tutorials are here:
https://github.com/tensorflow/models/tree/master/research/object_detection/colab_tutorials
The eager_few_shot_od_training_tflite.ipynb notebook also looks like a winner for showing how to add a new Duck class to a MobileNet architecture. YOLACT Edge has a MobileNet model available too.
I am sitting with a thousand or so JPGs of chickens with corresponding PNG masks, sorted into train/val/test datasets. I was hoping for the Keras UNet segmentation demo to work because I initially thought UNet will be ideal for the egg light camera, but now I’m back to the FAIR detectron2 woods, to find a panoptic segmentation solution.
Let’s try the YOLACT Edge one, because it’s based on YOLO, (You only look once), a single shot object detector algorithm, but which is also more commonly known for ‘You only live once’, an affirmation of often reckless behaviour. YOLACT stands for You Only Look At CoefficienTs. In this case it looks like the state of the art, and it’s been used on the Jetson before, which is promising. At 30 frames per second on the Jetson AGX, we’ll probably be getting 20 or so on the Jetson NX. Since that’s using Torch to TensorRT to speed it up, it seems like we should try it. I was initially averse to using NVIDIA specific software, but we should make the most of this hardware (if we can)
It’s not really panoptic segmentation. But it’s looking Good Enough™ like what we need, rather than what we thought we wanted.
Let’s try these instructions:
https://github.com/haotian-liu/yolact_edge/blob/master/INSTALL.md
We’ll try it on the NX. “Inside” the Docker. What’s our CUDA version?
nvcc --version
10.2
TensorRT should already be installed.
(On Nano, if nvcc not found, check out this link )
git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
sudo python3 setup.py install --plugins
Here’s from the COCO panoptic readme.
https://cocodataset.org/#format-data RELEVANT EXCERPT FOR….
For the panoptic task, each annotation struct is a per-image annotation rather than a per-object annotation. Each per-image annotation has two parts: (1) a PNG that stores the class-agnostic image segmentation and (2) a JSON struct that stores the semantic information for each image segment. In more detail:
annotation{
    "image_id": int,
    "file_name": str,
    "segments_info": [segment_info],
}
segment_info{
    "id": int,
    "category_id": int,
    "area": int,
    "bbox": [x,y,width,height],
    "iscrowd": 0 or 1,
}
categories[{
    "id": int,
    "name": str,
    "supercategory": str,
    "isthing": 0 or 1,
    "color": [R,G,B],
}]
Ok, we can do this.
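As a rough idea of what generating that JSON from a mask could look like, here is a sketch that builds the segments_info list from a 2D array of segment ids. The function names, the file name and the tiny 4x6 "image" are all made up for illustration. The id-to-colour rule (id = R + 256*G + 256^2*B) is the one the COCO panoptic format uses for its PNGs, and I am treating id 0 as unlabeled.

```python
import numpy as np

def id_to_rgb(segment_id):
    """Colour that encodes a segment id in a panoptic PNG: id = R + 256*G + 256**2*B."""
    return [segment_id % 256, (segment_id // 256) % 256, (segment_id // 256**2) % 256]

def segments_info_from_id_map(id_map, category_of):
    """Build segments_info for one image from a 2D array of segment ids (0 = unlabeled)."""
    infos = []
    for segment_id in np.unique(id_map):
        if segment_id == 0:
            continue
        ys, xs = np.nonzero(id_map == segment_id)
        x0, y0 = int(xs.min()), int(ys.min())
        infos.append({
            "id": int(segment_id),
            "category_id": int(category_of[int(segment_id)]),
            "area": int(len(xs)),  # pixel count
            "bbox": [x0, y0, int(xs.max()) - x0 + 1, int(ys.max()) - y0 + 1],
            "iscrowd": 0,
        })
    return infos

# Hypothetical example: one 2x3 pixel "chicken" with segment id 1.
id_map = np.zeros((4, 6), dtype=np.int64)
id_map[1:3, 2:5] = 1
annotation = {
    "image_id": 1,
    "file_name": "chicken_0001.png",  # made-up name
    "segments_info": segments_info_from_id_map(id_map, {1: 16}),  # 16 = COCO 'bird'
}
print(annotation)
```

The bounding boxes fall out of the masks for free, which covers the bbox requirement without any extra labelling.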
Right, so, if anything, we want to transfer learn from a trained neural network. There’s some interesting discussion about implementing your own transfer learning of a coco dataset, in keras-retinanet here, but we’re looking at using Yolact Edge, based on pytorch, so let’s not get distracted. We need to create the COCO dataset. I’ve put this off for so long.
We need the COCO categories that are already trained, and I see there is the 2018 api https://github.com/cocodataset/panopticapi which has the Panoptic challenge coco categories (panoptic_coco_categories.json) and ah ha this is what I have been searching for.
panopticapi/sample_data/panoptic_examples.json
After pretty printing with
python3 -m json.tool panoptic_examples.json
here’s the example, for this bit.
"images": [
{
"license": 2,
"file_name": "000000142238.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000142238.jpg",
"height": 427,
"width": 640,
"date_captured": "2013-11-20 16:47:35",
"flickr_url": "http://farm5.staticflickr.com/4028/5079131149_dde584ed79_z.jpg",
"id": 142238
},
{
"license": 1,
"file_name": "000000439180.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000439180.jpg",
"height": 360,
"width": 640,
"date_captured": "2013-11-19 01:25:39",
"flickr_url": "http://farm3.staticflickr.com/2831/9275116980_1d9b986e3b_z.jpg",
"id": 439180
}
]
and we’ve got some images
./input_images/000000439180.jpg
./input_images/000000142238.jpg
and their masks.
./panoptic_examples/000000439180.png
./panoptic_examples/000000142238.png
Ah here’s ‘bird’ category.
{ "supercategory": "animal", "color": [ 165, 42, 42 ], "isthing": 1, "id": 16, "name": "bird" },
Ok hold on though. Let’s try get some visualisation working, before anything else. This looks like the ticket. But it is a python file, and running matplotlib, so ideally we’d transform this to a Jupyter Notebook. Ok, just New Notebook, copy paste. Run.
ModuleNotFoundError: No module named 'skimage' [Big Detour and to the rescue, Datamachines]
Ok we can install it with !pip3 install scikit-image ? No, that fails… what did I do, right, I need to ssh into the Jetson,
chrx@chrx:~$ ssh -L 8888:127.0.0.1:8888 -L 6006:127.0.0.1:6006 chicken@192.168.101.109
Then find the docker ID, and docker exec -it 519ed46162ae bash into it, and goddamnit what, UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0xc3 in position 4029: ordinal not in range(128)
Ok so someone’s already had this happen, and it’s because the locale preferred encoding needs to be UTF-8. But it’s some obscure ANSI.
root@jetson:/# python -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968
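ANSI_X3.4-1968 is just the formal name for plain ASCII, which explains the error. A small sketch of what goes wrong: any non-ASCII character encoded as UTF-8 starts with a byte like 0xc3, which ASCII cannot decode. The sample string is mine, not from the file that actually failed.

```python
import locale

raw = "Zürich".encode("utf-8")  # the ü becomes the two bytes 0xc3 0xbc

try:
    raw.decode("ascii")
    ascii_error = None
except UnicodeDecodeError as exc:
    ascii_error = str(exc)

decoded = raw.decode("utf-8")

print("ascii decode failed with:", ascii_error)
print("utf-8 decode gives:", decoded)
print("this machine prefers:", locale.getpreferredencoding())
```

The commonly suggested fix is to set LANG or LC_ALL to a UTF-8 locale such as C.UTF-8 inside the container, though I am assuming here that such a locale is available in the image.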
Someone posted a bunch of steps for the L4T docker folks. That would be us. Do we really need this library ?
It’s to get this function.
from skimage.segmentation import find_boundaries
Yes, ok, it is quite hellish to install skimage. This was how to do it in debian, for skimage up to v. 0.13.1-2
apt install python3-skimage
But then it gets “ImportError: cannot import name ‘_validate_lengths'” which is resolved in skimage 0.14.2
I’ve asked on the forum, and am hoping NVIDIA can solve this one. The skimage docs say:
As per the latest comment, (only 3 weeks ago, others were on the trail of similar tasks!), mmartial is pointing to datamachines, which has some Dockerfiles for building OpenCV and Tensorflow, and YOLOv4.
Ok, let’s try what the instructions suggest:
“make tensorflow_opencv to build all the tensorflow_opencv container images”
I’ll try the CuDNN version next if this doesn’t work…
Ok…we’re on step 16 of 42… Ooh Python 3.8, that’s an upgrade. Build those wheels, pip3! Doh, Step 24 of 42.
bazel: Exec format error
The command returned a non-zero code: 2
*Whomp whomp* sound
ok let’s try
make cudnn_tensorflow_opencv
, no…
I asked on the Issues, and they noticed those are the amd64 builds, not the aarch64 build. I could use their DockerHub pre-build for now.
so after a detour, i am using this Dockerfile successfully to run Jupyter on the NX. We got stuck because skimage was difficult to install, and now we’re back on track, annotating the COCO, and so on.
chicken@jetson:~$ cat Dockerfile
FROM docker.io/datamachines/jetsonnano-cuda_tensorflow_opencv:10.2_2.3_4.5.1-20210218
RUN pip3 install jupyter jupyterlab --verbose
RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"
EXPOSE 6006
EXPOSE 8888
CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
    echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
    echo "JupterLab logging location: /var/log/jupyter.log (inside the container)" && \
    /bin/bash
chicken@jetson:~$ sudo docker build -t nx_setup .
chicken@jetson:~$ sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/:/dmc nx_setup
So, where were we?
Right. Panoptic API, we wanted to run visualize.py, first, so we could check progress. But it needed skimage installed. Haha. Ok, one week later… let’s try see the example.
Phew, ok. Getting back on track. So now we want to train it on the chickens.
So, COCO.
As someone teaching myself about this, I know what I ideally want is to transfer learn from a trained network. But it isn’t obvious how. I apparently need to chop off the last layer of a trained network, freeze most of the network, and then retrain the last bit.
Well, back to this soon…
So,
Here we have a suggestion from dbolya author of YOLACT and YOLACT++, the original.
try:
    self.load_state_dict(state_dict)
except RuntimeError as e:
    print('Ignoring "' + str(e) + '"')
and then resume training from yolact_im700_54_800000.pth:
python train.py --config=<your_config> --resume=weights/yolact_im700_54_800000.pth --start_iter=0
When there are size mismatches between tensors, Pytorch will spit out an error message but also keep on loading the rest of the tensors anyway. So here we just attempt to load a checkpoint with the wrong number of classes, eat the errors the Pytorch complains about, and then start training from iteration 0 with just those couple of tensors being untrained. You should see only the C (class) and S (semantic segmentation) losses reset.
You probably also want to modify the learning rate, decay schedule, and number of iterations in your config to account for fine-tuning.
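A more explicit variant of the same idea, as I understand it, is to filter the checkpoint yourself and keep only the tensors whose shapes match the new model. The sketch below uses NumPy arrays as stand-ins for tensors so it runs without PyTorch, and the layer names are invented. With real PyTorch state dicts, the filtered dict would presumably be passed to load_state_dict with strict=False.

```python
import numpy as np

def keep_matching(checkpoint, model_state):
    """Split checkpoint entries into those that fit the model and those that do not."""
    kept, skipped = {}, []
    for name, tensor in checkpoint.items():
        if name in model_state and tensor.shape == model_state[name].shape:
            kept[name] = tensor
        else:
            skipped.append(name)
    return kept, skipped

# Hypothetical layers: the backbone matches, the class head does not
# (81 outputs in the pretrained checkpoint, 2 in a chicken-only model).
checkpoint = {
    "backbone.conv.weight": np.zeros((8, 3, 3, 3)),
    "head.class.weight": np.zeros((81, 8)),
}
model_state = {
    "backbone.conv.weight": np.zeros((8, 3, 3, 3)),
    "head.class.weight": np.zeros((2, 8)),
}

kept, skipped = keep_matching(checkpoint, model_state)
print("loaded:", sorted(kept))
print("left untrained:", skipped)
```

Either way the effect is the same: everything except the class-dependent layers starts from the pretrained weights.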
And an allusion to an example of its use, perhaps. And more clues about how to fine tune the ‘network head’.
You can do this by following the fine tuning procedure (#36) and then here:
Line 628 in f54b0a5
p = pred_layer(pred_x)
replace that with
p = pred_layer(pred_x.detach())
Ok… so here’s the YOLACT diagram:
A command to run the training.
Came across this metric for keeping track of:
True Positives
False Positives
True Negatives
False Negatives.
Might be useful for judging supervised learning of vision apps.
Confusion Matrix : https://en.wikipedia.org/wiki/Confusion_matrix
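For a binary mask (chicken / not chicken), the four counts are easy to compute per pixel. A minimal sketch with made-up labels:

```python
import numpy as np

def confusion_counts(truth, prediction):
    """Count true/false positives and negatives for binary labels."""
    t = np.asarray(truth, dtype=bool)
    p = np.asarray(prediction, dtype=bool)
    return {
        "tp": int(np.sum(t & p)),    # predicted chicken, is chicken
        "fp": int(np.sum(~t & p)),   # predicted chicken, is not
        "tn": int(np.sum(~t & ~p)),  # predicted background, is background
        "fn": int(np.sum(t & ~p)),   # predicted background, is chicken
    }

truth = [1, 1, 0, 0, 1, 0]
prediction = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(truth, prediction)
precision = counts["tp"] / (counts["tp"] + counts["fp"])
recall = counts["tp"] / (counts["tp"] + counts["fn"])
print(counts, precision, recall)
```

Precision and recall come straight out of the counts, and IoU is tp / (tp + fp + fn), so this ties back to the segmentation metrics too.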
Ok here we go. I already installed gstreamer. Was still getting some h264 encoding plugin missing error, so I needed to add this:
apt-get install gstreamer1.0-libav
Then on my Ubuntu laptop (192.168.0.103), I run:
gst-launch-1.0 -v udpsrc port=1234 caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! videoconvert ! autovideosink
And then to get webcam streaming from the Jetson,
video-viewer /dev/video0 rtp://192.168.0.103:1234
and similarly,
detectnet /dev/video0 rtp://192.168.0.103:1234
and
segnet /dev/video0 rtp://192.168.0.103:1234
It is impractical to run VNC because of the tiny resolution, and ssh -X tunnelling because that requires the host to have whatever drivers are used on the jetson. GStreamer is working well though.
Cool.
I’m trying to run a python program on the feed. Ended up finding same issue elsewhere. RTP output not working. Bumped the thread.
Someone worked it out. Needed a do/while loop instead of a for loop.
It’s been a couple of months since I was here, and it’s now time to load up this gstreamer code again, and see if I can stream a webcam feed from the robot, and we evaluate inference on the trained CNN, and we colour in the different classes in pretty colours, and
So, TODO:
So, practically though, I need to set up the Jetson again, and get it compiling the trained h5 file, and using webcam stills as input. Then gstream the result. Ok.
And fix power cable for Jetson step 1. Then try set up webcam gstreaming.
Ok, it’s plugged in, nmap found it. Let’s log in… Ok, run gstreamer, and then i’ve got a folder, jetson-inference, and I’m volume mapping it.
./docker/run.sh --volume /home/chicken/jetson-inference:/jetson-inference
import jetson.inference
import jetson.utils
net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)
camera = jetson.utils.videoSource("/dev/video0")
display = jetson.utils.videoOutput("rtp://192.168.101.127:1234","--headless") # 'my_video.mp4' for file
while True:
    img = camera.Capture()
    detections = net.Detect(img)
    display.Render(img)
    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
    if not camera.IsStreaming() or not display.IsStreaming():
        break
I am in this dusty jetson-inference docker.
pip3 list
appdirs (1.4.4)
boto3 (1.16.58)
botocore (1.19.58)
Cython (0.29.21)
dataclasses (0.8)
decorator (4.4.2)
future (0.18.2)
jmespath (0.10.0)
Mako (1.1.3)
MarkupSafe (1.1.1)
numpy (1.19.4)
pandas (1.1.5)
Pillow (8.0.1)
pip (9.0.1)
pycuda (2020.1)
python-dateutil (2.8.1)
pytools (2020.4.3)
pytz (2020.5)
s3transfer (0.3.4)
setuptools (51.0.0)
six (1.15.0)
torch (1.6.0)
torchaudio (0.6.0a0+f17ae39)
torchvision (0.7.0a0+78ed10c)
urllib3 (1.26.2)
wheel (0.36.1)
and my nice docker with TF2 and everything installed already says:
root@jetson:/dmc/jetson-inference/build/aarch64/bin# ./my-detection.py
Traceback (most recent call last):
File "./my-detection.py", line 24, in
import jetson.inference
ModuleNotFoundError: No module named 'jetson'
Ok let’s try install jetson from source. First, tried the ‘Quick Reference’ instructions… Errored at
/dmc/jetson-inference/c/depthNet.h(190): error: identifier "COLORMAP_VIRIDIS_INVERTED" is undefined
/dmc/jetson-inference/c/depthNet.h(180): error: identifier "COLORMAP_VIRIDIS_INVERTED" is undefined
Next, ran a command mentioned lower down,
git submodule update --init
Now make -j$(nproc)
gets to
/usr/bin/ld: cannot find -lnvcaffe_parser
And this reply suggests using sed to fix this…
sed -i 's/nvcaffe_parser/nvparsers/g' CMakeLists.txt
Ok…Built! And…
[gstreamer] gstCamera -- attempting to create device v4l2:///dev/video0
[gstreamer] gstCamera -- didn't discover any v4l2 devices
[gstreamer] gstCamera -- device discovery and auto-negotiation failed
[gstreamer] gstCamera -- failed to create device v4l2:///dev/video0
Traceback (most recent call last):
File "my-detection.py", line 31, in
camera = jetson.utils.videoSource("/dev/video0")
Exception: jetson.utils -- failed to create videoSource device
Ok so docker/run.sh also has some other stuff going on, looking for V4L2 devices and such. Ok added device, and it’s working in my nice docker!
sudo docker run -it -p 8888:8888 -p 6006:6006 -p 8265:8265 --rm --runtime nvidia --network host --device /dev/video0 -v /home/chicken/:/dmc nx_setup
So what now… We open ‘/dev/video0’ for V4L2, then replace my inference code below.
img = camera.Capture()
detections = net.Detect(img)
display.Render(img)
Ok ... zoom ahead and it's all worked out...
---
After going through the wringer another time: zooming ahead, fixing that jetson issue involves
cd jetson-inference/build
make install
ldconfig
---
Saved the sender program at the github.
To get video to the screen:
gst-launch-1.0 -v udpsrc port=1234 caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! videoconvert ! autovideosink
or to an mp4:
gst-launch-1.0 -v udpsrc port=1234 caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! x264enc ! qtmux ! filesink location=test.mp4 -e
We should check out rotary encoders and motors, while I’m in RSA. I’ve started looking at them now. There’s a wealth of info at a company that makes them, Dynapar.
Here’s the good stuff. Coupling: “Rotary encoders come in 3 major mounting styles: hollow-shaft (hollow-bore or through shaft), hub-shaft (hub-bore) or shafted. Hollow-shaft and hub-shaft rotary encoders mount directly to a motor shaft typically using a tether. Shafted rotary encoders mount using a flexible coupling.”
I asked a South African company about their Italian ‘Eltra‘ rotary encoders yesterday. Then Feetech from China asked this morning, if I needed servos. Well actually yes. I asked about their encoder motors, as they might be called, as a combination. They had 4.8V, 6V, and 7.4V such motors, whose encoders were of the magnetic measuring type, and give 12 bits of precision (2^12 = 4096, so 0 to 4095) as serial data.
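For a feel of what 12 bits buys, here is a small sketch converting a raw reading to degrees. It assumes the count maps linearly onto one full turn, which is my reading of "12 bits of precision" rather than anything from a datasheet.

```python
COUNTS_PER_REV = 2 ** 12  # 4096 distinct readings, 0 to 4095

def counts_to_degrees(raw):
    """Convert a raw 12-bit encoder reading to an angle in degrees."""
    if not 0 <= raw < COUNTS_PER_REV:
        raise ValueError("raw reading must be between 0 and 4095")
    return raw * 360.0 / COUNTS_PER_REV

resolution = 360.0 / COUNTS_PER_REV  # degrees per count
print(resolution)
print(counts_to_degrees(1024), counts_to_degrees(4095))
```

That works out to just under 0.09 degrees per count, which is far finer than anything the hobby servos could hold.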
The MG996Rs used for the basic table robot prototype were possibly adequate, but definitely on the hobby side. They felt more appropriate for a robot about half the size of the prototype. The prototype was successfully sat upon, by the chicken.
I started thinking about the practicalities of powering a Jetson NX (the devkit supports 9-20V), or a Jetson Nano (5V), or an RPi (5V), plus numerous motors. So it probably needs a battery in the 9-20V range. Same for the motors. Lithium ion or similar batteries probably come as 14.4V. You usually need to buy lithium ion batteries in the place where you’re going, rather than bring them on a plane. But you can always try. (Not a financial advisor (or otherwise)). They are probably still mostly imported for RC models (remote control, remember?).
But also wow, on the topic of rotary encoders and servos, this guy made a 6 d.o.f. (degrees of freedom) robot arm with sub-millimeter accuracy.
He used his own interesting and as yet mysterious encoder, which he put inside a motor. Says his robot arm cost EUR 300. It’s super impressive.
The other option with power appears to be cool ‘just what you needed’ type products like DFRobot’s FIT0186, which has a built-in encoder! I ended up buying 4 x FIT0186s and 4 x FIT0185s (12V, 251RPM and 83RPM respectively).
I used an Arduino Nano, a DFRobot Dual motor driver, based on the TB6612 chip, plugging in the Nano for 5V, and 12V into the driver, to power the motors.
But after much wiring up, the motors work fine, but the encoders do not. I fixed up their example code to use volatile variables, and run the minimal amount of code in the interrupt service routines (ISRs), and even used two interrupts for a single motor, one for each hall sensor output, and put a 0.1mF capacitor between signal and ground. Tried CHANGE/RISING/FALLING triggers. Tried different motors.
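For reference, this is the decoding logic I would expect to work, sketched in Python rather than Arduino C so it can be checked on a laptop. It assumes the two hall outputs (A and B) form a standard quadrature pair, and the choice of which direction counts as positive is arbitrary. On the Nano, the table lookup would sit inside the ISR.

```python
# Lookup index = (previous_state << 2) | current_state, where state = (A << 1) | B.
# +1 and -1 are single steps in either direction, 0 is no move or an invalid jump.
TRANSITION = [0, +1, -1, 0,
              -1, 0, 0, +1,
              +1, 0, 0, -1,
              0, -1, +1, 0]

def count_steps(samples):
    """Net position after a sequence of (A, B) readings."""
    position = 0
    previous = None
    for a, b in samples:
        current = (a << 1) | b
        if previous is not None:
            position += TRANSITION[(previous << 2) | current]
        previous = current
    return position

forward = [(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)]  # one full quadrature cycle
print(count_steps(forward), count_steps(list(reversed(forward))))
```

If the real signals never produce a clean cycle like the one in forward, that would point at wiring, pull-ups or noise rather than the counting code.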
I’ve looked at every forum post, and tried everything, and now I’ve posted on DFRobot’s forum, in case someone has got the encoder working. Not looking promising.
So back to the MG996Rs, I find that I ordered two batches from Mantech, supposedly the same genuine product from Tower Pro, and yet the one batch does not work properly with the PCA9685, uses a lower minimum pulse, and moves in the opposite direction to the other ones when using the same code. The duds keep creeping around, after you turn off the throttle.
Ok so I’m going to try with the one batch of MG996Rs that do seem to work, but we’ll need to make a new plan. Possibly, rigging an actual rotary encoder up, on the motor shaft. But for now, back to MG996Rs and the drawing board. We left the old robotable in Switzerland last year. So I made a new one with the remaining MG996Rs.
So I was able to get them fairly well calibrated by setting the min and max pulse, as that seems to be how servos work. They have a center PWM pulse, around 1500 (microseconds, I believe), where they are still, and then as you shorten the pulse, it goes one way, and as you lengthen the pulse, it goes the other way.
Unfortunately after much fuss, these continuous rotation servos turned out to be duds. At least for the way I was trying to use them.
Set the throttle to 1, sleep for a second. Set the throttle to zero, sleep for a second. Set the throttle to -1, sleep for a second. …And you have a new position. Unfortunately the positional control is unusable. I guess these are for wheels or an application where position is not so important.
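The relation between throttle and pulse can be sketched like this. I am taking the centre value of around 1500 to be a pulse width in microseconds, and the 1000 and 2000 limits are nominal placeholder values, not the calibrated ones for these MG996Rs.

```python
def throttle_to_pulse_us(throttle, min_us=1000, centre_us=1500, max_us=2000):
    """Map a throttle in [-1, 1] to a pulse width in microseconds."""
    throttle = max(-1.0, min(1.0, throttle))  # clamp out-of-range requests
    if throttle >= 0:
        return centre_us + throttle * (max_us - centre_us)
    return centre_us + throttle * (centre_us - min_us)

for value in (-1, 0, 0.5, 1):
    print(value, throttle_to_pulse_us(value))
```

Handling the two halves separately means an off-centre stop point still maps to throttle zero. It also shows why position control is hopeless: the pulse sets a speed, so the final position depends on how long you sleep and how loaded the motor is.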
Months have passed. It is 29 December 2020 now, and Miranda and I are at Bitwäscherei, in Zurich, during the remoteC3 (Chaos Computer Club)
We have a few things to get back to:
We plan to work on a chicken cam, but are waiting for the raspberry pi camera to arrive. So first, locomotion.
Miranda would like to add more leg segments, but process-wise, we need the software and hardware working together with basics, before adding more parts.
I’ve tested the GPIO, and it’s working, after adjusting the minimum pulse value for the MG996R servo.
So, where did we leave off, before KonS MFRU / ICAF ?
There was a walking table in simulation, and the progress was saving so that we could reload the state, in Ray and Tune.
I remember the best walker had a ‘reward’ of 902, so I searched for 902
grep -R 'episode_reward_mean\"\: 902'
And found these files:
409982 Aug 1 07:52 events.out.tfevents.1596233543.chrx
334 Jul 31 22:12 params.json
304 Jul 31 22:12 params.pkl
132621 Aug 1 07:52 progress.csv
1542332 Aug 1 07:52 result.json
and there are checkpoint directories, with binary files.
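Instead of grepping for a remembered number, the best iteration can be pulled out of result.json directly. This sketch assumes the file holds one JSON object per line, which is how it looked to me. The demo file and its numbers are made up so the snippet runs anywhere.

```python
import json
import os
import tempfile

def best_result(path, key="episode_reward_mean"):
    """Return the record with the highest value for key, or None if there is none."""
    best = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if key in record and (best is None or record[key] > best[key]):
                best = record
    return best

# Made-up stand-in for a real result.json.
demo_path = os.path.join(tempfile.mkdtemp(), "result.json")
with open(demo_path, "w") as handle:
    for iteration, reward in enumerate([120.5, 902.1, 640.0], start=1):
        record = {"training_iteration": iteration, "episode_reward_mean": reward}
        handle.write(json.dumps(record) + "\n")

best = best_result(demo_path)
print(best)
```

Pointing best_result at the real file would give the iteration to look for among the checkpoint directories.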
So what are these files? How do I extract actions?
Well it looks like this info keeps track of Ray/Tune progress. If we want logs, we seem to need to make them ourselves. The original minitaur code used google protobuf to log state. So I set the parameter to log to a directory.
log_path="/media/chrx/0FEC49A4317DA4DA/logs"
So now when I run it again, it makes a file in the format below:
message RobotableEpisode {
// The state-action pair at each step of the log.
repeated RobotableStateAction state_action = 1;
}
message RobotableMotorState {
// The current angle of the motor.
double angle = 1;
// The current velocity of the motor.
double velocity = 2;
// The current torque exerted at this motor.
double torque = 3;
// The action directed to this motor. The action is the desired motor angle.
double action = 4;
}
message RobotableStateAction {
// Whether the state/action information is valid. It is always true if the
// proto is from simulation. It might be false when communication error
// happens on robotable hardware.
bool info_valid = 6;
// The time stamp of this step. It is computed since the reset of the
// environment.
google.protobuf.Timestamp time = 1;
// The position of the base of the minitaur.
robotics.messages.Vector3d base_position = 2;
// The orientation of the base of the minitaur. It is represented as (roll,
// pitch, yaw).
robotics.messages.Vector3d base_orientation = 3;
// The angular velocity of the base of the minitaur. It is the time derivative
// of (roll, pitch, yaw).
robotics.messages.Vector3d base_angular_vel = 4;
// The motor states (angle, velocity, torque, action) of eight motors.
repeated RobotableMotorState motor_states = 5;
}
I’m pretty much only interested in that last line,
repeated RobotableMotorState motor_states = 5;
So that’s the task, to decode the protobuf objects.
import os
import inspect
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)
import argparse
from gym_robotable.envs import logging
if __name__ == "__main__":
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/logs/robotable_log_2020-12-29-191602')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode = logging.restore_episode(args.log_file)
    print(dir (episode))
    print("episode=",episode)
    fields = episode.ListFields()
    for field in fields:
        print(field)
This prints out some json-like info. On the right path.
time {
seconds: 5
}
base_position {
x: 0.000978083148860855
y: 1.7430418253385236
z: -0.0007063670972221042
}
base_orientation {
x: -0.026604138524100478
y: 0.00973575985636693
z: -0.08143286338936992
}
base_angular_vel {
x: 0.172553297157456
y: -0.011541306494121786
z: -0.010542314686643973
}
motor_states {
angle: -0.12088901254000686
velocity: -0.868766524998517
torque: 3.3721667267908284
action: 3.6539504528045654
}
motor_states {
angle: 0.04232669165311699
velocity: 1.5488756496627718
torque: 3.4934419908437704
action: 1.4116498231887817
}
motor_states {
angle: 0.8409251448232009
velocity: -1.617737108768752
torque: -3.3539541961507124
action: -3.7024881839752197
}
motor_states {
angle: 0.13926660037454777
velocity: -0.9575437158301312
torque: 3.563701581854714
action: 1.104300618171692
}
info_valid: true
])
We will have to experiment to work out how to translate this data to something usable. The servos are controlled with throttle, from -1 to 1.
Technically I should probably rewrite the simulation to output this “throttle” value. But let’s work with what we have, for now.
My first attempt will be to extract and normalize the torque values to get a sequence of actions… nope.
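For the record, the normalising part is simple enough. This sketch scales a list of values by its largest magnitude so everything lands in -1 to 1. The numbers are the four torque values from the log sample above, used purely as sample input. Whether the result means anything as a throttle is a separate question.

```python
def to_throttle(values):
    """Scale values into [-1, 1] by dividing by the largest magnitude."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return [0.0 for _ in values]
    return [v / peak for v in values]

torques = [3.3721667267908284, 3.4934419908437704,
           -3.3539541961507124, 3.563701581854714]
throttles = to_throttle(torques)
print(throttles)
```

Dividing by the peak keeps the sign and the zero point, which a min-max rescale would not.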
Ok so plan B. Some visualisation of the values should help. I have it outputting the JSON now.
episode_proto = logging.restore_episode(args.log_file)
jsonObj = MessageToJson(episode_proto)
print(jsonObj)
I decided to use StreamLit, which is integrated with various plotting libraries. After looking at the different plotting options, Plotly seems the most advanced.
Ok so in JSON,
{
"time": "1970-01-01T00:00:05Z",
"basePosition": {
"x": 0.000978083148860855,
"y": 1.7430418253385236,
"z": -0.0007063670972221042
},
"baseOrientation": {
"x": -0.026604138524100478,
"y": 0.00973575985636693,
"z": -0.08143286338936992
},
"baseAngularVel": {
"x": 0.172553297157456,
"y": -0.011541306494121786,
"z": -0.010542314686643973
},
"motorStates": [
{
"angle": -0.12088901254000686,
"velocity": -0.868766524998517,
"torque": 3.3721667267908284,
"action": 3.6539504528045654
},
{
"angle": 0.04232669165311699,
"velocity": 1.5488756496627718,
"torque": 3.4934419908437704,
"action": 1.4116498231887817
},
{
"angle": 0.8409251448232009,
"velocity": -1.617737108768752,
"torque": -3.3539541961507124,
"action": -3.7024881839752197
},
{
"angle": 0.13926660037454777,
"velocity": -0.9575437158301312,
"torque": 3.563701581854714,
"action": 1.104300618171692
}
],
"infoValid": true
}
Plotly uses pandas DataFrames, which hold tabular data. 2 dimensions. So I need to transform this to something usable.
Something like time on the x-axis
and angle / velocity / torque / action on the y axis.
Ok so how to do this…?
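One way to get there is to flatten each step into one row per motor, in "long" format. A sketch using plain dicts, with a cut-down copy of the JSON above (two motors only). The function name and the step and motor columns are my own additions.

```python
import json

def motor_rows(step, step_index):
    """Flatten one logged step into one row per motor."""
    rows = []
    for motor, state in enumerate(step["motorStates"]):
        row = {"step": step_index, "time": step["time"], "motor": motor}
        row.update(state)  # adds angle, velocity, torque, action
        rows.append(row)
    return rows

step = json.loads("""
{
  "time": "1970-01-01T00:00:05Z",
  "motorStates": [
    {"angle": -0.12088901254000686, "velocity": -0.868766524998517,
     "torque": 3.3721667267908284, "action": 3.6539504528045654},
    {"angle": 0.04232669165311699, "velocity": 1.5488756496627718,
     "torque": 3.4934419908437704, "action": 1.4116498231887817}
  ]
}
""")

rows = motor_rows(step, 0)
for row in rows:
    print(row)
```

A list of dicts like this can go straight into pd.DataFrame(rows), and from there a column per measurement with time on the x-axis is a short step.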
Well I’ve almost got it, but I mostly had to give up on StreamLit’s native line_chart for now. Plotly has line chart code that can handle multiple variables. So I’m getting sidetracked by this bug:
When I import plotly’s library,
import plotly.graph_objects as go
“No module named ‘plotly.graph_objects’; ‘plotly’ is not a package”
https://stackoverflow.com/questions/57105747/modulenotfounderror-no-module-named-plotly-graph-objects
import plotly.graph_objects as go ? no…
from plotly import graph_objs as go ? no…
from plotly import graph_objects as go ? no…
hmm.
pip3 install plotly==4.14.1 ? Requirement already satisfied: plotly==4.14.1 in /usr/local/lib/python3.6/dist-packages (4.14.1)
no… why are the docs wrong then?
Ah ha.
“This is the well known name shadowing trap.” – stackoverflow
I named my file plotly.py – that is the issue.
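The reason this bites is that the script's own directory comes first on the import path, so a local plotly.py wins over the installed package. A small sketch of a check for it. The helper name is made up, and the demo directory is a temporary one so nothing real gets touched.

```python
import os
import tempfile

def shadowing_files(module_name, directory):
    """List entries in directory that would shadow an installed module of that name."""
    candidates = [module_name + ".py", module_name]
    return [name for name in candidates
            if os.path.exists(os.path.join(directory, name))]

demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "plotly.py"), "w").close()  # recreate the mistake

found = shadowing_files("plotly", demo_dir)
print(found)
```

Another quick test is to print plotly.__file__ after importing it: if it points into the project folder rather than dist-packages, that is the culprit.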
So, ok run it again… (streamlit run plot.py) and open localhost:8501…
Now,
Running as root without --no-sandbox is not supported. See https://crbug.com/638180
Ah ha. I went back to StreamLit notation and it worked.
#fig.show()
st.plotly_chart(fig)
Ok excellent, so here is my first round of code:
import pandas as pd
import numpy as np
import streamlit as st
import time
from plotly import graph_objects as go
import os
import inspect
from google.protobuf.json_format import MessageToJson
import argparse
from gym_robotable.envs import logging
currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)
if __name__ == "__main__":
    st.title('Analyticz')
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/logs/robotable_log_2020-12-29-191602')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode_proto = logging.restore_episode(args.log_file)
    times = []
    angles = [[]]*4 # < bugs!
    velocities = [[]]*4
    torques = [[]]*4
    actions = [[]]*4
    for step in range(len(episode_proto.state_action)):
        step_log = episode_proto.state_action[step]
        times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))
        for i in range(4):
            angles[i].append(step_log.motor_states[i].angle)
            velocities[i].append(step_log.motor_states[i].velocity)
            torques[i].append(step_log.motor_states[i].torque)
            actions[i].append(step_log.motor_states[i].action)
    print(angles)
    print(times)
    print(len(angles))
    print(len(velocities))
    print(len(torques))
    print(len(actions))
    print(len(times))
    # Create traces
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=times, y=angles[0],
                             mode='lines',
                             name='Angles'))
    fig.add_trace(go.Scatter(x=times, y=velocities[0],
                             mode='lines+markers',
                             name='Velocities'))
    fig.add_trace(go.Scatter(x=times, y=torques[0],
                             mode='markers',
                             name='Torques'))
    fig.add_trace(go.Scatter(x=times, y=actions[0],
                             mode='markers',
                             name='Actions'))
    st.plotly_chart(fig)
And it’s plotting data for one leg.
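A side note on the time axis in the code above: gluing seconds and nanos together as strings drops the leading zeros of the nanos part, so 5000000 nanoseconds (0.005 s) ends up looking like half a second. A minimal sketch of a numeric alternative follows. The helper name `to_seconds` is made up, and it assumes the two fields are plain integers as in the log code above.

```python
def to_seconds(seconds, nanos):
    """Combine (seconds, nanos) fields into float seconds."""
    return seconds + nanos * 1e-9

# String concatenation: the leading zeros of the nanos part are lost.
as_string = str(12) + '.' + str(5000000)
print(as_string)                 # '12.5000000', which reads as 12.5

# Numeric conversion keeps the real value (12 s plus 5 ms).
print(to_seconds(12, 5000000))
```

Numeric times also let the plot treat the x-axis as a continuous value rather than a list of labels.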
If this is just 5 seconds of simulation, then velocities looks like it might be the closest match. You can imagine it going up a bit, back a bit, then a big step forward.
So, one idea is to do symbolic regression, to approximate the trigonometry equations for quadrupedal walking, (or just google them), and generalise to a walking algorithm, to use for locomotion. I could use genetic programming, like at university (https://gplearn.readthedocs.io/en/stable/examples.html#symbolic-regressor). But that’s overkill and probably won’t work. Gotta smooth the graph incrementally. Normalize it.
Let’s see what happens next, visually, after 5 seconds of data, and then view the same, for other legs.
Ok there is 30 seconds of walking.
The tools I wrote for the walker are run with ‘python3 play_tune.py --replay 1’. It looks for the best checkpoint and replays it from there.
But now I seem to be getting the same graph for different legs. What? We’re going to have to investigate.
Ok, turns out [[]]*4
is the wrong way to initialise a list of lists in Python. It makes all four sublists the same list object, so every append lands in all of them. Here’s the correct way:
velocities = [[] for i in range(4)]
Now I have 4 different legs.
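To see the difference side by side, here is a tiny standalone demonstration (the variable names are just for illustration):

```python
shared = [[]] * 4                    # four references to ONE list
separate = [[] for _ in range(4)]    # four independent lists

shared[0].append(1.0)
separate[0].append(1.0)

print(shared)                        # [[1.0], [1.0], [1.0], [1.0]]
print(separate)                      # [[1.0], [], [], []]
print(shared[0] is shared[1])        # True
print(separate[0] is separate[1])    # False
```

That is why every leg showed the same graph: all four "legs" were one list collecting every motor's values.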
The graph is very spiky, so I’ve added a rolling window average, and normalised it between -1 and 1 since that’s what the servo throttle allows.
Here is the range between min and max for each of the 4 legs:
3.1648572819886085
1.7581604444983845
5.4736002843351805
1.986915632875287
Some legs aren’t moving as much as others, so maybe it doesn’t make sense to normalize each of them to [-1, 1] separately. Like maybe the back right leg, which moves the most, should be normalized to [-1, 1] and then all the other legs scaled down proportionally. Anyway, let’s see. Good enough for now.
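The proportional idea could look something like this. It is only a sketch of one possible approach, not what the code below does: each signal is centred on its own midpoint and then all of them are divided by half of the largest range, so only the widest one reaches -1 and 1. The function name and the toy values are made up.

```python
import numpy as np

def normalize_shared_scale(signals):
    """Centre each signal on its own midpoint, then divide all of them by
    half of the largest min-to-max range, so only the widest reaches +/-1."""
    arrays = [np.asarray(s, dtype=float) for s in signals]
    half_max_range = max(np.ptp(a) for a in arrays) / 2.0
    return [(a - (a.max() + a.min()) / 2.0) / half_max_range for a in arrays]

# Toy data, not real leg velocities.
legs = [
    [0.0, 2.0, 1.0],   # range 2
    [1.0, 5.0, 3.0],   # range 4 (the widest)
]
scaled = normalize_shared_scale(legs)
print(scaled[0])   # spans -0.5 to 0.5
print(scaled[1])   # spans -1.0 to 1.0
```

One caveat: centring shifts zero velocity away from zero throttle, just as plain min-max normalization does. Dividing by the largest absolute value instead would keep zero where it is.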
In the code, the motors order is:
front right, front left, back right, back left.
Ok so to save the outputs…
import pandas as pd
import numpy as np
import streamlit as st
import time
from plotly import graph_objects as go
import os
import inspect
from google.protobuf.json_format import MessageToJson
import argparse
from gym_robotable.envs import logging
import plotly.express as px

currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)

def normalize_negative_one(img):
    normalized_input = (img - np.amin(img)) / (np.amax(img) - np.amin(img))
    return 2*normalized_input - 1

if __name__ == "__main__":
    st.title('Analyticz')
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/walkinglogs/robotable_log_2021-01-17-231240')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode_proto = logging.restore_episode(args.log_file)
    times = []
    velocities = [[] for i in range(4)]
    for step in range(len(episode_proto.state_action)):
        step_log = episode_proto.state_action[step]
        times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))
        for i in range(4):
            velocities[i].append(step_log.motor_states[i].velocity)

    #truncate because a bunch of trailing zeros
    velocities[0] = velocities[0][0:3000]
    velocities[1] = velocities[1][0:3000]
    velocities[2] = velocities[2][0:3000]
    velocities[3] = velocities[3][0:3000]
    times = times[0:3000]

    #get moving averages
    window_size_0=40
    numbers_series_0 = pd.Series(velocities[0])
    windows_0 = numbers_series_0.rolling(window_size_0)
    moving_averages_0 = windows_0.mean()
    moving_averages_list_0 = moving_averages_0.tolist()
    without_nans_0 = moving_averages_list_0[window_size_0 - 1:]

    window_size_1=40
    numbers_series_1 = pd.Series(velocities[1])
    windows_1 = numbers_series_1.rolling(window_size_1)
    moving_averages_1 = windows_1.mean()
    moving_averages_list_1 = moving_averages_1.tolist()
    without_nans_1 = moving_averages_list_1[window_size_1 - 1:]

    window_size_2=40
    numbers_series_2 = pd.Series(velocities[2])
    windows_2 = numbers_series_2.rolling(window_size_2)
    moving_averages_2 = windows_2.mean()
    moving_averages_list_2 = moving_averages_2.tolist()
    without_nans_2 = moving_averages_list_2[window_size_2 - 1:]

    window_size_3=40
    numbers_series_3 = pd.Series(velocities[3])
    windows_3 = numbers_series_3.rolling(window_size_3)
    moving_averages_3 = windows_3.mean()
    moving_averages_list_3 = moving_averages_3.tolist()
    without_nans_3 = moving_averages_list_3[window_size_3 - 1:]

    #normalize between -1 and 1
    avg_0 = np.asarray(without_nans_0)
    avg_1 = np.asarray(without_nans_1)
    avg_2 = np.asarray(without_nans_2)
    avg_3 = np.asarray(without_nans_3)
    avg_0 = normalize_negative_one(avg_0)
    avg_1 = normalize_negative_one(avg_1)
    avg_2 = normalize_negative_one(avg_2)
    avg_3 = normalize_negative_one(avg_3)

    np.save('velocity_front_right', avg_0)
    np.save('velocity_front_left', avg_1)
    np.save('velocity_back_right', avg_2)
    np.save('velocity_back_left', avg_3)
    np.save('times', times)

    # Create traces
    fig0 = go.Figure()
    fig0.add_trace(go.Scatter(x=times, y=velocities[0], mode='lines', name='Velocities 0'))
    fig0.add_trace(go.Scatter(x=times, y=avg_0.tolist(), mode='lines', name='Norm Moving Average 0'))
    st.plotly_chart(fig0)

    fig1 = go.Figure()
    fig1.add_trace(go.Scatter(x=times, y=velocities[1], mode='lines', name='Velocities 1'))
    fig1.add_trace(go.Scatter(x=times, y=avg_1.tolist(), mode='lines', name='Norm Moving Average 1'))
    st.plotly_chart(fig1)

    fig2 = go.Figure()
    fig2.add_trace(go.Scatter(x=times, y=velocities[2], mode='lines', name='Velocities 2'))
    fig2.add_trace(go.Scatter(x=times, y=avg_2.tolist(), mode='lines', name='Norm Moving Average 2'))
    st.plotly_chart(fig2)

    fig3 = go.Figure()
    fig3.add_trace(go.Scatter(x=times, y=velocities[3], mode='lines', name='Velocities 3'))
    fig3.add_trace(go.Scatter(x=times, y=avg_3.tolist(), mode='lines', name='Norm Moving Average 3'))
    st.plotly_chart(fig3)
(Excuse the formatting.) Then I’m loading those npy files and iterating them to the motors.
import time
import numpy as np
from board import SCL, SDA
import busio
from adafruit_pca9685 import PCA9685
from adafruit_motor import servo

i2c = busio.I2C(SCL, SDA)
pca = PCA9685(i2c, reference_clock_speed=25630710)
pca.frequency = 50

servo0 = servo.ContinuousServo(pca.channels[0], min_pulse=685, max_pulse=2280)
servo1 = servo.ContinuousServo(pca.channels[1], min_pulse=810, max_pulse=2095)
servo2 = servo.ContinuousServo(pca.channels[2], min_pulse=700, max_pulse=2140)
servo3 = servo.ContinuousServo(pca.channels[3], min_pulse=705, max_pulse=2105)

velocity_front_right = np.load('velocity_front_right.npy')
velocity_front_left = np.load('velocity_front_left.npy')
velocity_back_right = np.load('velocity_back_right.npy')
velocity_back_left = np.load('velocity_back_left.npy')
times = np.load('times.npy')

# reverse left motors
velocity_front_left = -velocity_front_left
velocity_back_left = -velocity_back_left

print (velocity_front_right.size)
print (velocity_front_left.size)
print (velocity_back_right.size)
print (velocity_back_left.size)
print (times.size)

for t in times:
    print(t)

for a,b,c,d in np.nditer([velocity_front_right, velocity_front_left, velocity_back_right, velocity_back_left]):
    servo0.throttle = a/4
    servo1.throttle = b/4
    servo2.throttle = c/4
    servo3.throttle = d/4
    print (a, b, c, d)

servo0.throttle = 0
servo1.throttle = 0
servo2.throttle = 0
servo3.throttle = 0
pca.deinit()
Honestly it doesn’t look terrible, but these MG996R continuous rotation servos are officially garbage.
While we wait for new servos to arrive, I’m testing on SG90s. I’ve renormalized to 0 to 180 degrees, centred about 90 degrees:
def normalize_0_180(img):
    normalized_0_180 = (180*(img - np.min(img))/np.ptp(img)).astype(int)
    return normalized_0_180
That was still a bit too much variation. We’ve got 180 degree servos now, but it still looks a bit off, so I halved the amplitude about 90 degrees in the test.
velocity_front_right = ((velocity_front_right - 90)/2)+90
velocity_front_left = ((velocity_front_left - 90)/2)+90
velocity_back_right = ((velocity_back_right - 90)/2)+90
velocity_back_left = ((velocity_back_left - 90)/2)+90

# reverse left motors
velocity_front_left = 180-velocity_front_left
velocity_back_left = 180-velocity_back_left
lol. ok. Sim2Real.
So, it’s not terrible, but we’re not quite there either. Also I think it’s walking backwards.
I am not sure the math is correct.
I changed the smoothing code to use this code, which smooths each point based on the preceding smoothed value.
def anchor(signal, weight):
    buffer = []
    last = signal[0]
    for i in signal:
        smoothed_val = last * weight + (1 - weight) * i
        buffer.append(smoothed_val)
        last = smoothed_val
    return buffer
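This kind of filter is usually called exponential smoothing: each output is a weighted blend of the previous output and the new sample. To get a feel for it, here is a small standalone sketch that repeats the same function and runs it over a made-up step input, once and then a second time on its own output.

```python
def anchor(signal, weight):
    """Exponential smoothing: blend the previous output with each new sample."""
    buffer = []
    last = signal[0]
    for value in signal:
        last = last * weight + (1 - weight) * value
        buffer.append(last)
    return buffer

# Made-up input: a sudden jump from 0 to 1.
step = [0.0] * 3 + [1.0] * 5

once = anchor(step, 0.8)
twice = anchor(once, 0.8)

for raw, a, b in zip(step, once, twice):
    print(f"{raw:.1f}  {a:.4f}  {b:.4f}")
```

Each extra pass rounds the corners more, but it also makes the output lag further behind the input, which matters when the smoothed line is driving servos.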
OK, I realised I was wrong all along. Two things.
First, I just didn’t see that the angle values were on that original graph. They were so small. Of course we’re supposed to use the angles, rather than the velocities, for 180 degree servos.
Second problem was, I was normalizing from min to max of the graph. Of course it should be -PI/2 to PI/2, since the simulator works with radians, obviously. Well anyway, hindsight is 20/20. Now we have a fairly accurate sim2real. I run the anchor code above several times (three passes in the final code), to get a really smooth line.
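The radians-to-servo mapping itself is just a unit conversion plus an offset: -PI/2 becomes 0 degrees, 0 becomes 90, and PI/2 becomes 180. A minimal sketch, with one addition that is my assumption and not in the code below: values outside that range are clipped so the servo is never asked for an impossible angle. The function name is made up.

```python
import numpy as np

def radians_to_servo_degrees(angles_rad):
    """Map [-pi/2, pi/2] radians onto [0, 180] degrees, clipping anything outside."""
    degrees = np.degrees(np.asarray(angles_rad, dtype=float)) + 90.0
    return np.clip(degrees, 0.0, 180.0)   # clipping is an added safety step

servo_angles = radians_to_servo_degrees([-np.pi / 2, 0.0, np.pi / 2, 2.0])
print(servo_angles)
```

The last input (2.0 radians) is beyond PI/2, so it gets pinned at 180 degrees instead of overshooting.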
Here’s the final code.
import pandas as pd
import numpy as np
import streamlit as st
import time
from plotly import graph_objects as go
import os
import inspect
from google.protobuf.json_format import MessageToJson
import argparse
from gym_robotable.envs import logging
import plotly.express as px

currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)

def anchor(signal, weight):
    buffer = []
    last = signal[0]
    for i in signal:
        smoothed_val = last * weight + (1 - weight) * i
        buffer.append(smoothed_val)
        last = smoothed_val
    return buffer

# assume radians
def normalize_0_180(img):
    normalized_0_180 = np.array(img)*57.2958 + 90
    return normalized_0_180

if __name__ == "__main__":
    st.title('Analyticz')
    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/walkinglogs/robotable_log_2021-01-17-231240')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode_proto = logging.restore_episode(args.log_file)
    times = []
    angles = [[] for i in range(4)]
    for step in range(len(episode_proto.state_action)):
        step_log = episode_proto.state_action[step]
        times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))
        for i in range(4):
            print (step)
            print (step_log.motor_states[i].angle)
            angles[i].append(step_log.motor_states[i].angle)

    #truncate because a bunch of trailing zeros
    angles[0] = angles[0][0:3000]
    angles[1] = angles[1][0:3000]
    angles[2] = angles[2][0:3000]
    angles[3] = angles[3][0:3000]

    avg_0 = normalize_0_180(angles[0])
    avg_1 = normalize_0_180(angles[1])
    avg_2 = normalize_0_180(angles[2])
    avg_3 = normalize_0_180(angles[3])

    avg_0 = anchor(avg_0, 0.8)
    avg_1 = anchor(avg_1, 0.8)
    avg_2 = anchor(avg_2, 0.8)
    avg_3 = anchor(avg_3, 0.8)

    avg_0 = anchor(avg_0, 0.8)
    avg_1 = anchor(avg_1, 0.8)
    avg_2 = anchor(avg_2, 0.8)
    avg_3 = anchor(avg_3, 0.8)

    avg_0 = anchor(avg_0, 0.8)
    avg_1 = anchor(avg_1, 0.8)
    avg_2 = anchor(avg_2, 0.8)
    avg_3 = anchor(avg_3, 0.8)

    np.save('angle_front_right_180', avg_0)
    np.save('angle_front_left_180', avg_1)
    np.save('angle_back_right_180', avg_2)
    np.save('angle_back_left_180', avg_3)

    # Create traces
    fig0 = go.Figure()
    fig0.add_trace(go.Scatter(x=times, y=angles[0], mode='lines', name='Angles 0'))
    fig0.add_trace(go.Scatter(x=times, y=avg_0, mode='lines', name='Norm Moving Average 0'))
    st.plotly_chart(fig0)

    fig1 = go.Figure()
    fig1.add_trace(go.Scatter(x=times, y=angles[1], mode='lines', name='Angles 1'))
    fig1.add_trace(go.Scatter(x=times, y=avg_1, mode='lines', name='Norm Moving Average 1'))
    st.plotly_chart(fig1)

    fig2 = go.Figure()
    fig2.add_trace(go.Scatter(x=times, y=angles[2], mode='lines', name='Angles 2'))
    fig2.add_trace(go.Scatter(x=times, y=avg_2, mode='lines', name='Norm Moving Average 2'))
    st.plotly_chart(fig2)

    fig3 = go.Figure()
    fig3.add_trace(go.Scatter(x=times, y=angles[3], mode='lines', name='Angles 3'))
    fig3.add_trace(go.Scatter(x=times, y=avg_3, mode='lines', name='Norm Moving Average 3'))
    st.plotly_chart(fig3)
OK.
So there’s a milestone that took way too long. We’ve got Sim 2 Real working, ostensibly.
After some fortuitous googling, I found the Spot Micro, or, Spot Mini Mini project. The Spot Micro guys still have a big focus on inverse kinematics, which I’m trying to avoid for as long as I can.
They’ve done a very similar locomotion project using pyBullet, and I was able to find a useful paper, in the inspiration section, alerting me to kMPs.
Kinematic Motion Primitives. It’s a similar idea to what I did above.
Instead, what these guys did was to take a single wave of their leg data, and repeat that, and compare that to a standardized phase. (More or less). Makes sense. Looks a bit complicated to work out the phase of the wave in my case.
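As a very rough sketch of the "take one wave and repeat it" part only (this is not the paper's method, and it ignores phase alignment entirely): estimate the dominant period from the FFT, slice out one cycle, and tile it. The trace below is a made-up clean sine standing in for leg data. Real gait logs are noisy and would need more care. The function name is hypothetical.

```python
import numpy as np

def dominant_period(signal):
    """Estimate the dominant period, in samples, from the largest FFT peak."""
    centred = np.asarray(signal, dtype=float) - np.mean(signal)
    spectrum = np.abs(np.fft.rfft(centred))
    peak = int(np.argmax(spectrum[1:])) + 1   # skip the zero-frequency bin
    return int(round(len(centred) / peak))

# Made-up stand-in for one leg's angle trace: a sine with 50 samples per cycle.
n = np.arange(500)
trace = np.sin(2 * np.pi * n / 50)

period = dominant_period(trace)
one_cycle = trace[:period]           # a single wave
repeated = np.tile(one_cycle, 4)     # the same wave played four times

print(period, repeated.shape)
```

Where the slice starts decides the phase of the repeated wave, which is exactly the part that looks tricky to work out from my data.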
I’ll make a new topic, and try to extract kMPs from the data, for the next round of locomotion sim2real. I will probably also train the robot for longer, to try to evolve a gait that isn’t so silly.