Categories
GANs

First Order Motion Model

We checked out

https://aliaksandrsiarohin.github.io/first-order-model-website/

to make some deep fakes for fun. Good stuff. Not sure this will be useful for this project.

Categories
form Hardware

SCARA

From a website called robots.com, which is a great URL, but which otherwise has broken links,


The SCARA or Selective Compliant Assembly (or Articulated) Robot Arm robot provides a circular work envelope.

Here’s from allaboutcircuits.com

Industrial Robotics Roundup:

Articulated vs. SCARA vs. Cartesian Robots

Hmm. interesting. I think we’re doing an Articulated build for the gripper. It will need Z axis movement. But SCARA is cool if you don’t need Z axis (up down).

Categories
AI/ML control dev Hardware hardware_ robots Vision

Jetson NX: A New Hope

After checking out the speed of image segmentation on the Raspberry Pi (like one frame every 10 seconds maybe?), and my i3 laptop not being much better, I realised I needed more computing power, at least to train the neural networks. I can probably still ultimately run the neural network on the Pi, but we’ll see.

Looking at computing options, I ultimately went with the $399 NVIDIA Jetson Xavier NX.

Developer Kit Technical Specifications
GPUNVIDIA Volta architecture
with 384 NVIDIA® CUDA® cores and 48 Tensor cores
CPU6-core NVIDIA Carmel ARM®v8.2 64-bit CPU
6 MB L2 + 4 MB L3
DLA2x NVDLA Engines
Vision Accelerator7-Way VLIW Vision Processor
Memory8 GB 128-bit LPDDR4x 51.2GB/s
StoragemicroSD (Card not included)
Video Encode2x 4Kp30 | 6x 1080p 60 | 14x 1080p30 (H.265/H.264)
Video Decode2x 4Kp60 | 4x 4Kp30 | 12x 1080p60 | 32x 1080p30 (H.265)
2x 4Kp30 | 6x 1080p60 |16x 1080p30 (H.264)
Camera2x MIPI CSI-2 D-PHY lanes
ConnectivityGigabit Ethernet, M.2 Key E (WiFi/BT included), M.2 Key M (NVMe)
DisplayHDMI and DP
USB4x USB 3.1, USB 2.0 Micro-B
OthersGPIOs, I2C, I2S, SPI, UART
Mechanical103 mm x 90.5 mm x 34 mm

Also, did you know they made a dystopic reboot retcon of The Jetsons, that 70s retro-futuristic Hanna Barbera cartoon, in comic form? An ice meteor destoyed Earth. They were lucky to have had a place in space to go, working for the Spacely Space Sprockets, incorporated. (YT link)

https://en.wikipedia.org/wiki/The_Jetsonshttps://en.wikipedia.org/wiki/The_Jetsons

It took an hour to set up, and was mostly straightforward, though I had to get a ‘clover plug’ cable, and an SD card.

I used Etcher to load the latest 6GB Jetson Developer Kit SD image, and had a keyboard, mouse, and hdmi monitor that worked. So I was able to enter the wifi SSID and password while setting it up.

I learned that one option for headless installation is to use a USB cable from your computer to the micro-USB input of the Jetson. But ultimately this wasn’t necessary. I ran ipconfig on the Jetson, got an ip address, and connected with ssh.

After needing to change the wifi details, I used the usb cable, then connected with:

ssh chicken@192.168.55.1
sudo nmcli device wifi connect 'SSID' password 'PASSWORD'

Coming back to this later, I attempted the same, but with the Jetson Nano, instead of the Jetson Xavier, and it didn’t work. I learned that the Nano doesn’t come with a Wifi adapter.

I think with the Nano, (“B01”) you need a monitor to install. I tried multiple tutorials, ssh’ing 192.168.55.1, I tried using screen to connect to /dev/ttyACC0 at 115200 baud, nope. Looked at the forums, and it’s complicated. I didn’t try the USB UART because my USB-TTL converter’s cable colours are different.

Another method that worked, with the nano, is plugging the ethernet cable from the Nano directly into the wifi router. It then shows up on the router’s network.

Later, when trying to install a D-Link wifi ‘Wireless N Nano USB Adaptor’, ( for the love of God, just get an Edimax – they work out of the box), I connected over ssh with the ethernet cable from the jetson to the router, then downloaded the driver and unzipped and untarred, and then ran the `make` file and `make install` as per the instructions, but had to run export ARCH=arm64 before that, because it was looking in aarch64. Then rebooted. Then

chicken@chicken:~$ sudo nmcli device wifi connect 'ssid' password 'password'
[sudo] password for chicken:
Device 'wlan0' successfully activated with '3a7997e6-c6b1-40f7-bf93-fba5b110282c'.

A lot of research will have to happen again now, though, because NVIDIA has its own software ecosystem. I’ll need a vision solution that is portable to the PI, with the hope that a model or neural net trained on the Jetson will still be able to run on the Raspberry Pi, since it’s 40X cheaper.

The lingo takes some time to get used to, but I believe JetPack is the name for this OS of preinstalled nvidia docs and libraries.

Since last year, an algorithm called… a Transformer… which has just recently created a hell of a chat bot, with GPT-3, and which underlies Google search as BERT (Bidirectional Encoder Representations from Transformers).

And there are hybrid convolutional nets and transformers, eg. DETR, and there are the SOTA from last year, EfficientNet, and then for some instances, or most, YOLOv4 is meant to be the new hot algorithm. It’s bigger than YOLOv3. It’s wait, so it’s more frames per second, and the accuracy (AP) is kinda so/so, at 5% less. I realised YOLOv5, which I had seen, which is a Pytorch implementation, is faster, though it’s technically just some one being a bit of a douchebag and calling his implementation of the author’s peer reviewed, the next version, YOLOv5. So what now?

Image for post

YOLOv4 vs. YOLOv5

https://github.com/ultralytics/yolov5#user-content-pretrained-checkpoints
it keeps getting better.

So, PyTorch vs. Tensorflow ( vs. TensorRT),

NVIDIA has this 3d simulator environment in Unreal Engine! Isaac. Something like an API for robots, by NVIDIA. They got this robot working with it, apparently.

NVIDIA Carter — ISAAC 2020.2 documentation

It’s actually pretty good. I wonder if this https://developer.nvidia.com/deepstream-sdk is as cool as it sounds. Ah, closed source. Of course. But I can apply to join. Eh maybe.

So, I want to get these chickens into a convolutional neural network, or a transformer and output a pretty picture. I want the colour masks, not the bounding boxes.

I don’t want to get too caught up in proprietary NVIDIA specific API, even if they have an Unreal Engine simulator. But it might be worth checking out. GStreamer is an open source port of it, so maybe back on the menu.

But it’s a whole integrated thing. “The DeepStream SDK can be used to build end-to-end AI-powered applications to analyze video and sensor data. Some popular use cases are: retail analytics, parking management, managing logistics, robotics, optical inspection and managing operations.”

Nice. DeepStream supports several popular networks out of the box such as YOLO, FasterRCNN, SSD, RetinaNet and MaskRCNN.

I get the sense that Gems in the DeepStream world of Isaac, are like, ROS nodes, offering services on a port. ORB is a Gem. Ultimately, a prediction, or reconstruction in 3d, of the shape of objects in the world, would be ideal. I’m only doing the colour map stuff because the colours are nice, and it looks more impressive. But ultimately I will need to pick the best tool for the job.

NVIDIA also has DIGITS, Deep Learning GPU Training System (DIGITS) … puts the power of deep learning into the hands of engineers and data scientists. DIGITS can be used to rapidly train the highly accurate deep neural network (DNNs) for image classification, segmentation and object detection tasks.

So as you can see, there’s more to find out. But ultimately I will probably have to repeat the task of getting labelled data into folders, and having the labels in the right format. Then generating the TFRecords, or doing whatever you do in PyTorch, I’m still biased to the TensorFlow ODI 2 implementation, because Google’s got the best dataset of chickens.

But lots to check out from NVIDIA.

The revelation of GStreamer seems to be one of the big wins here. It looks like an Apache Camel for video pipelines.

Categories
AI/ML CNNs deep dev Vision

TF2: U-Net

One of the main decisions is how to train the Vision. We have an NVIDIA Jetson NX now, which can work on training in the background.

We will try Tensorflow 2 first, and if training is slow, we can try TensorFlow with TensorRT (TF-TRT).

But we’re starting from scratch. As the title suggests, we’re going to try get U-Net working. A neural network shaped like a U, for instance segmentation.

So, dev environment with virtual environments and pip? or Docker?

Let’s try Docker first. Some instructions here and here…

https://github.com/NVIDIA/nvidia-docker

https://www.tensorflow.org/install/docker

docker pull tensorflow/tensorflow:latest-gpu-jupyter
or
... # latest release w/ GPU support and Jupyter


#ok but we need NVIDIA container kit on the host:

sudo apt-get install curl

distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
   && curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
   && curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list

sudo apt-get install -y nvidia-docker2

For the Jetson, we need to install NVIDIA container kit to get access to the host’s GPU.

Ok going for this one…

sudo docker pull tensorflow/tensorflow:2.4.1-gpu-jupyter

I prefer tagged versions to ‘latest’ because they’re probably more stable.

Working from Jupyter Notebook will be a good way to preserve the code, and if we can use Docker, let’s do that, because containers are easier to deal with, usually, than virtual python environments on a host. We’ll leave this for now because we need to prepare the data.

OIDv6

In the meantime, I need to redo the OID (Open Images) download with bounding boxes or segmentation mask info. Let’s go straight for segmentation, using the method we tried before.

Need dev setup basics. give me some curl and some pip3.

curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py

python3 get-pip.py

pip install openimages

WARNING: The script wheel is installed in ‘/home/chicken/.local/bin’ which is not on PATH.

ok…

export PATH=”/home/chicken/.local/bin:$PATH

and again… pip install openimages

So we download some files with mask file names

wget https://storage.googleapis.com/openimages/v5/test-annotations-object-segmentation.csv
wget https://storage.googleapis.com/openimages/v5/validation-annotations-object-segmentation.csv
wget https://storage.googleapis.com/openimages/v5/train-annotations-object-segmentation.csv

I tried v6 in that URL, but nope. Whatever.

mkdir OID
mkdir OID/v6
cd OID/v6
mkdir csv
mkdir csv/full
mkdir images
mkdir images/Chicken
mkdir images/Chicken/train
mkdir images/Chicken/test
mkdir images/Chicken/validation
mkdir masks
mkdir masks/Chicken
mkdir masks/Chicken/train
mkdir masks/Chicken/test
mkdir masks/Chicken/validation
mkdir recordsTf
mkdir recordsTf/Chicken
mkdir recordsTf/Chicken/test
mkdir recordsTf/Chicken/train
mkdir recordsTf/Chicken/validation

Ok new website page. https://storage.googleapis.com/openimages/web/download.html

Ok seems like Google’s links are still using v5, so let’s stick with v5.

Need some egrep to find the related images.

egrep '/m/09b5t' csv/full/test-annotations-object-segmentation.csv | egrep -o ^[0-9a-f]* > csv/chicken-test-images-ids.txt

egrep '/m/09b5t' csv/full/validation-annotations-object-segmentation.csv | egrep -o ^[0-9a-f]* > csv/chicken-validation-images-ids.txt

egrep '/m/09b5t' csv/full/train-annotations-object-segmentation.csv | egrep -o ^[0-9a-f]* > csv/chicken-train-images-ids.txt

and now feed this into a downloader program. We can use the suggested downloader.py script. but I liked this bash function method. The downloader.py needs the files prefixed with the directory, which is a bit annoying. In Linux, you’d need to use sed to put the directory names in front of every line.

function getTestImages { echo wget $2 -O images/Chicken/test/$1.jpg >> csv/gettestimages.sh; }
export -f getTestImages

csvtool call getTestImages csv/test-images-urls.csv
bash csv/gettestimages.sh

function getValidationImages { echo wget $2 -O images/Chicken/validation/$1.jpg >> csv/getevaluationimages.sh; }
export -f getValidationImages

csvtool call getValidationImages csv/validation-images-urls.csv
bash csv/getevaluationimages.sh

function getTrainImages { echo wget $2 -O images/Chicken/train/$1.jpg >> csv/gettrainimages.sh; }
export -f getTrainImages

csvtool call getTrainImages csv/train-images-urls.csv
bash csv/gettrainimages.sh

This is a surprisingly epic task, all of this. Lots of Flickr accounts have closed, it seems, since 2018. Lots of 404s.

But ultimately quite a few pics of chickens:

2.3G ./images/Chicken/train
88M ./images/Chicken/validation
323M ./images/Chicken/test
2.7G ./images/Chicken

Now I need the PNG files that are the masks for these images.

It seems like these are the 16 zip files.

wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-0.zip through 16. Oh but it goes 0-9, then A-F.

So, ok how to automate this? bash or perl or python? ok..

for i in {0..9}; do wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-$i.zip; done

well good enough automation for now. if I used hex maybe I can loop 1..F in bash. Let’s compromise. I could have copy pasted in this time.

for i in {'a','b','c','d','e','f'}; do wget https://storage.googleapis.com/openimages/v5/train-masks/train-masks-$i.zip; done

They’re 262MB each file.

unzip *

2686684 files… yikes

ok i need to find the PNG masks associated with the JPG images. I can work this out but I am flying blind. Chicken is /m/09b5t –

ls -l | grep 09b5t

ls -l | grep 09b5t | wc -l

shows 2237 masks for Chickens. But we only have 1324 images of Chickens.

Ok I need to see pics on the jetson. Ultimately an RDP (remote desktop protocol would be best?). VNC server is an old code but it checks out. Followed these instructions. and connected to 192.168.101.109:5901

Nope. It’s comically small at 640×480.

VNC listening on port 5901

Ok but yeah I guess I just wanted to see the pictures. But this isn’t really necessary yet, or practical over VNC. I want to verify that the PNG mask corresponds to the JPG image contents. I’ll probably use a Jupyter Notebook ultimately. (I do end up using Jupyter Lab.)

We’re configuring Tensorflow 2 or PyTorch to train some convolutional network with this segmentation data.

There’s the mappings are in these files:

train-annotations-object-segmentation.csv
test-annotations-object-segmentation.csv
validation-annotations-object-segmentation.csv

It’s got the mappings, and some extra factoids about where the Google data entry annotator people clicked with their wand selection tool, and a “Predicted IoU”, which is a big topic. We should hopefully only need the image to segmentation file mapping.

  • MaskPath: name of the corresponding mask image.
  • ImageID: the image this mask lives in.
  • LabelName: the MID of the object class this mask belongs to.
  • BoxID: an identifier for the box within the image.
  • BoxXMinBoxXMaxBoxYMinBoxYMax: coordinates of the box linked to the mask, in normalized image coordinates. Note that this is not the bounding box of the mask, but the starting box from which the mask was annotated. These coordinates can be used to relate the mask data with the boxes data.
  • PredictedIoU: if present, indicates a predicted IoU value with respect to ground-truth. This quality estimate is machine-generated based on human annotator behaviour. See [3] for details.
  • Clicks: if present, indicates the human annotator clicks, which provided guidance during the annotation process we carried out (See [3] for details). This field is encoded using the following format: X1 Y1 T1;X2 Y2 T2;X3 Y3 T3;...Xi Yi are the coordinates of the click in normalized image coordinates. Ti is the click type, value 0 indicates the annotator marks the point as background, value 1 as part of the object instance (foreground). These clicks can be interesting for researchers in the field of interactive segmentation. They are not necessary for users interested in the final masks only.

Ok it’s the same name. Easy enough.

MaskPath,ImageID,LabelName,BoxID,BoxXMin,BoxXMax,BoxYMin,BoxYMax,PredictedIoU,Clicks
677c122b0eaa5d16_m04yx4_9a041d52.png,677c122b0eaa5d16,/m/04yx4,9a041d52,0.8875,0.960938,0.454167,0.720833,0.86864,0.95498 0.65197 1;0.89370 0.56579 1;0.94701 0.48968 0;0.91049 0.70010 1;0.93927 0.47160 1;0.90269 0.56068 0;0.92061 0.70749 0;0.92509 0.64628 0;0.92248 0.65188 1;0.93042 0.46071 1;0.93290 0.71142 1;0.94431 0.48783 0

We have our images downloaded…

Ok the masks folder is too big though. Let’s just do Chicken, ok? So we’ll delete any PNGs that don’t have m09b5t in their filename. And delete these zip files.

find . -type f -print0 | xargs --null grep -Z -L 'm09b5t' | xargs --null rm

Lol that deleted everything. Oops. Don’t do that. Ok download again…

We’ll process zip files one at a time.

 unzip train-masks-0.zip -d ./masks   (1 minute passes)
 cd masks
 find \! -name '*m09b5*png' -delete (30 seconds)
 mv * ../Chicken 

1…2….3…

OK unzipstuff.sh

I automated the process.

chicken@jetson:~/OID/v6$ cat unzipstuff.sh

#!/bin/bash
for i in 1 2 3 4 5 6 7 8 9 a b c d e f
do
  eval "unzip train-masks-$i.zip -d masks/"
  cd masks
  find ! -name 'm09b5png' -delete
  mv /home/chicken/OID/v6/masks/* /home/chicken/OID/v6/Chicken
  cd ..
done
I need to display the information somehow.  Jupyter Lab (Notebooks) are probably the best way to display code, and run it interactively.  


chicken@jetson:~$ jupyter notebook --generate-config
Writing default config to: /home/chicken/.jupyter/jupyter_notebook_config.py
chicken@jetson:~$ jupyter-lab

Ok so I wasn’t sure why I couldn’t connect to the server on the Jetson, but I’m able to run it at http://localhost:8888/ through an SSH tunnel.

ssh -L 8888:127.0.0.1:8888 chicken@192.168.101.109

I’m not sure what the difference between Lab and Notebook is, exactly, yet, either. But I think Notebook is a subset of Lab.

Ok so I’m trying to match JPGs and PNGs. Some interesting data, with multiple masks for some images, and no masks for some images.

I set up SAMBA to copy files over and investigate.

I see. The disturbing part is that no images in my test and validation folders matched any masks. But all of the train images had a match…

OH. train, validation and test ALL have their own 16 zip files of masks.

Good thing I automated that… ok so same thing, but changing ‘train’ to the ‘validation’ and ‘test’.

I did a programmatic test on the directories to see if any images were missing a mask:

for fname in os.listdir(test_images_dir):
   if len(glob.glob(test_masks_dir + "*" + fname[:-4] + "*")) == 0:     
      print(fname)

It’s looking better. Still some missing, but good enough now. Missing 6 validation masks, and 12 test masks. All training images have at least one mask

Number of Train images: 1122
Number of Train masks: 2237
Number of validation images: 44
Number of validation masks: 59
02a0f2858f27a7ba.jpg
01463f5494340d3d.jpg
00e71a70a2f669ff.jpg
05887f57bc232041.jpg
0d3da02e79f84dde.jpg
0ed7092c41c81d14.jpg
Number of test images: 154
Number of test masks: 186
0e9be8b09f71f909.jpg
0913fbf6fa5c190e.jpg
0f8a38312499d209.jpg
0650a130d7f707b5.jpg
0a8a5aa471796fd5.jpg
0cc4722ca906f86c.jpg
04423d3f6f5b8e74.jpg
03bc7fbc956b3a9a.jpg
07621394c8ad0b47.jpg
000411001ff7dd4f.jpg
0e5ecc56e464dcb8.jpg
05600e8a393e3c3a.jpg

I’ll move these ones out of the folder.

mkdir ~/backup
cd /home/chicken/OID/v6/images/Chicken/validation/
mv 02a0f2858f27a7ba.jpg ~/backup
mv 01463f5494340d3d.jpg ~/backup
mv 00e71a70a2f669ff.jpg ~/backup
mv 05887f57bc232041.jpg ~/backup
mv 0d3da02e79f84dde.jpg ~/backup
mv 0ed7092c41c81d14.jpg ~/backup
cd /home/chicken/OID/v6/images/Chicken/test/
mv 0e9be8b09f71f909.jpg ~/backup
mv 0913fbf6fa5c190e.jpg ~/backup
mv 0f8a38312499d209.jpg ~/backup
mv 0650a130d7f707b5.jpg ~/backup
mv 0a8a5aa471796fd5.jpg ~/backup
mv 0cc4722ca906f86c.jpg ~/backup
mv 04423d3f6f5b8e74.jpg ~/backup
mv 03bc7fbc956b3a9a.jpg ~/backup
mv 07621394c8ad0b47.jpg ~/backup
mv 000411001ff7dd4f.jpg ~/backup
mv 0e5ecc56e464dcb8.jpg ~/backup
mv 05600e8a393e3c3a.jpg ~/backup

Ok and now all the images have masks!

Number of Train images: 1122 
Number of Train masks: 2237 
Number of validation images: 38 
Number of validation masks: 59 
Number of test images: 142 
Number of test masks: 186

Momentous. Looking at the nicolas windt article, there might be some dead links. So let’s delete those images too.

find -size 0 -delete

Number of Train images: 982 
Number of Train masks: 2237 
Number of validation images: 32 
Number of validation masks: 59 
Number of test images: 130 
Number of test masks: 186

Oof, still good. Let’s load a picture in Jupyter. Ok tensorflow has a loadimage function.

No module named 'tensorflow'

Right. We tried installing it with Docker. How will that even work? Eish, gotta read up on this.

Back to Tensorflow.

Ok I already downloaded an NVIDIA-friendly tensorflow 3 weeks ago. Well, things move slowly, but all incremental gains move things forward. With experience you learn ways not to do things.

chicken@jetson:~/OID/v6/images$ sudo docker images
REPOSITORY TAG IMAGE ID CREATED SIZE
tensorflow/tensorflow 2.4.1-gpu-jupyter 64d8717296f8 3 weeks ago 5.71GB
dustynv/jetson-inference r32.5.0 ccc2a5f19dad 3 weeks ago 2.89GB
nvidia/cuda 11.0-base 2ec708416bb8 5 months ago 122MB

Ok the TF2 instructions say…

Start a GPU container, using the Python interpreter.

$ docker run -it --rm -v $(realpath ~/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-jupyter

Run a Jupyter notebook server with your own notebook directory (assumed here to be ~/notebooks). To use it, navigate to localhost:8888 in your browser. So…

$ docker run -it --rm -v ~/notebooks:/tf/notebooks -p 8888:8888 tensorflow/tensorflow:2.4.1-gpu-jupyter

Error...

standard_init_linux.go:211: exec user process caused "exec format error"

And pip?

chicken@jetson:~$ pip3 install tensorflow
Defaulting to user installation because normal site-packages is not writeable
ERROR: Could not find a version that satisfies the requirement tensorflow
ERROR: No matching distribution found for tensorflow

Great. Sanity check…

docker run -it --rm tensorflow/tensorflow bash

standard_init_linux.go:211: exec user process caused "exec format error"

Ok. Right, Jetson is aarch64, not x86-64… so google is suggesting Archiconda. This is too much for now. What’s wrong with pip? Python 3.6.9 is supposed to work with TF2.4.1 https://pypi.org/project/tensorflow/ hmm i guess there’s just no aarch64 version of TF2 precompiled.

So… one option is switch to PyTorch. Other option is try archiconda. I’m going to try this: https://ngc.nvidia.com/catalog/containers/nvidia:l4t-ml

“The Machine learning container contains TensorFlow, PyTorch, JupyterLab, and other popular ML and data science frameworks such as scikit-learn, scipy, and Pandas pre-installed in a Python 3.6 environment. Get started on your AI journey quickly on Jetson with everything pre-installed in this container.”

docker pull nvcr.io/nvidia/l4t-ml:r32.5.0-py3

sudo docker run -it –rm –runtime nvidia –network host -v /home/chicken/OID:/opt/OID -v /home/chicken/notebooks:/opt/notebooks nvcr.io/nvidia/l4t-ml:r32.5.0-py3

ok now we’re cooking. (No chickens were cooked during the making of this.)

So now I’m back on track, at like step 0.

I’m working off the Keras U-Net code now, from https://keras.io/examples/vision/oxford_pets_image_segmentation/ because it’s one of the simplest CNNs out there, from 2015. I’ve also opened up another implementation because it has more useful examples for training.

Note though that due to U-Net’s simplicity, it is often used for medical computer vision applications, since there’s not so much deep learning magic going on. You can quite easily imagine the latent representation dwelling somehow, at the bottom of the U shaped neural network. It should give us something interesting.

Let’s find the latent representation of a chicken.

We need to correlate the images and masks. We can glob by file name. Probably good as anything. But should probably put it in arrays of arrays or something. One image, many masks. So like a map from an image filename, to a list of mask filenames. As python calls maps, ‘dictionaries’.

Ok amazing, that works. I can see image and mask, and they correspond.

At some point I need to transform these. Make them all 256×256 pixels or something like that. Hmm.

OK, I got the training running. I got the Jetson like a month ago now, probably.

Had to reduce the batch size and epoch size, to get rid of an Out of Memory error. Then had a sort of browser freeze.

I should really run a script like this, instead:

nohup train.py &

but instead i’m hoping i can run it in Jupyter and it just follows the execution, and doesn’t freeze up. Maybe if I remove some debugging text…

But the loss function wasn’t going anywhere, even after 50 epochs, overnight. The mask prediction is just all black.

And I need to restart the Docker to open the tensorboard port

For Docker users: In case you are running a Docker image of Jupyter Notebook server using TensorFlow’s nightly, it is necessary to expose not only the notebook’s port, but the TensorBoard’s port. Thus, run the container with the following command:

docker run -it -p 8888:8888 -p 6006:6006 \
tensorflow/tensorflow:nightly-py3-jupyter 

or in my case,

sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/OID:/opt/OID -v /home/chicken/notebooks:/opt/notebooks nvcr.io/nvidia/l4t-ml:r32.5.0-py3


hmm the python 'magic' is not working

Ok so I ran tensorboard inside the docker terminal, instead of in the notebook. (You can do that by checking the container ID of 'docker ps' and calling 'docker exec -it <ID> bash')


python3 -m tensorboard.main --logdir=/opt/notebooks/logs



from tensorboard import notebook
import datetime

#%load_ext tensorboard
%reload_ext tensorboard
%tensorboard --logdir /opt/notebooks/logs

notebook.list()
notebook.display(port=6006, height=1000)



ok yeah so my ML model didn't learn shit.  
Also apparently they don't have tensorflow 2 in this nvidia ML docker container.

root@jetson:/opt/notebooks/logs# pip3 show tensorflow
Name: tensorflow
Version: 1.15.4+nv20.11

So how to debug? The images are converted to an n-dimensional array.


Got array with shape: (4, 256, 256, 1)

Ok things are going weird now, almost as I notice the TF version. It must be getting late.

Next day: Ok Nvidia has a TF2 docker, and it shares about half the layers with the other docker, so that’s cool: nvcr.io/nvidia/l4t-tensorflow:r32.5.0-tf2.3-py3

But it doesn’t have jupyter installed. Maybe I can copy the relevant bits from the Dockerfile. I’ve tried installing Jupyter and committing the docker, but “Failed building wheel for cffi”, some aarch64 issue.

RUN apt-get update && apt-get install -y libffi6 libffi-dev

Hard to find the nvidia docker files, and they only have l4t-base available.

#

# JupyterLab Dockerfile bits

#

RUN pip3 install jupyter jupyterlab --verbose

#RUN jupyter labextension install @jupyter-widgets/jupyterlab-manager@2

RUN jupyter lab --generate-config

RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"


CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & echo "allow 10 sec for JupyterLab to start @ http://localhost:8888 (password nvidia)" && echo "JupterLab logging location:  /var/log/jupyter.log  (inside the container)" && /bin/bash
- from https://github.com/dusty-nv/jetson-containers/blob/master/Dockerfile.ml

ok sweet jeebus, after a big detour, i am using this successfully.

chicken@jetson:~$ cat Dockerfile

FROM docker.io/datamachines/jetsonnano-cuda_tensorflow_opencv:10.2_2.3_4.5.1-20210218
RUN pip3 install jupyter jupyterlab --verbose
RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"
EXPOSE 6006
EXPOSE 8888
CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
echo "JupterLab logging location: /var/log/jupyter.log (inside the container)" && \
/bin/bash

chicken@jetson:~$ sudo docker build -t nx_setup .

chicken@jetson:~$ sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/:/dmc nx_setup

finally. So, back to tensorflow, and running U-Net!

So, maybe I see a problem with the semantic segmentation, possibly, which is related to chickens being a category among other things, rather than a binary chickeness and non-chickenness :

SparseCategoricalCrossentropy class

Use this crossentropy metric when there are two or more label classes. 

I only have one class. Chicken. So that won’t work. I need an egg dataset. Luckily this implementation has an example of an eye, and the veins, and that is why we want the U-Net, for the egg anomaly detection.

The problem’s symptom is that nothing is being learned during training. So maybe I’m using the wrong loss function.

I need to review instance segmentation “options”.

The loss function is currently measuring “the crossentropy metric between the labels and predictions.”

The reason I want instance segmentation is to differentiate between chickens, where possible. Panoptic segmentation actually makes the most sense for this project.

Panoptic segmentation uses a semantic network and an instance network, and uses them both, to deliver something like (“cat”,0), (“cat”,1), (“cat”,3)

COCO Panoptic API looks great, but it seems to need json to describe all of the PNG images. Bounding boxes seems unnecessary but COCO needs bounding boxes data.

We’ll start a new post on Panoptic Segmentation using COCO, and get back to Tensorflow 2 for U-Net, for semantic segmentation, when training on lit up eggs for in ovo sexing.

Update after a hiatus: I see a recent nnU-Net advancement… It’s a meta modelling process evolution thing. “self-configuring” for biomedical imaging. Hmm. Very interesting.

We’re not there yet. We just want to get a basic U-Net working.

I see too, Perceptilabs from W&B is released and they have some beautiful screenshots too, though not available on pip3 yet for aarch64. So it’s not an option at the moment.

So, for reminder, in this post, we’re trying to get basic U-Net segmentation working. Here’s a good explanation of it.

“Back to U-net”

I’ve found another implementation of U-Net that seems a bit more plug and play. There is also a useful note here regarding U-Net and the number of classes. https://github.com/karolzak/keras-unet/issues/3

(173, 512, 512, 3) (173, 512, 512)
vs
(30, 512, 512) (30, 512, 512)

One of their notebooks looks like a promising notebook, the kz-isbi-challenge.py, and I rigged it to run on my data, and I get OOM. Out of Memory. But this is jupyter lab. Let’s not train it in jupyter lab. Seems like a bad idea. Like a common problem that there’s probably a solution to, but where the solution is probably, ‘use python, dumbass’ So, converted to py, and edited. Had to take out all the plotting code. Pity. But same problem.

I found a jetson-stats https://github.com/rbonghi/jetson_stats jtop program and though it only showed 6.2GB/8GB of RAM the whole time, (I wasn’t even using up all the RAM?), it did remind me that i’m in a Docker, and maybe I’m not using swap space, and that 8GB is probably not enough RAM for a conv net. The U-Net had 31 million params.

Trainable params: 31,030,593




ResourceExhaustedError:  OOM when allocating tensor with shape[32,128,256,256] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
  [[node functional_1/concatenate_3/concat (defined at <ipython-input-26-51303ee95255>:7) ]]
Hint: If you want to see a list of allocated tensors when OOM happens, add report_tensor_allocations_upon_oom to RunOptions for current allocation info.
 [Op:__inference_test_function_3292]

Hmm. Well, about the docker swap space, docker will use the resources it can, on the host, which is gonna be just a bit less than whatever the host can handle. So when it crashed, It appears to me that it’s trying to load gpu memory, and only has 400MB or so.

2021-06-07 19:06:35.653219: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] total_region_allocated_bytes_: 404856832 memory_limit_: 404856832 available bytes: 0 curr_region_allocation_bytes_: 809713664
 2021-06-07 19:06:35.653456: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Stats: 
 Limit:                       404856832
 InUse:                       395858688
 MaxInUse:                    404771584
 NumAllocs:                         540
 MaxAllocSize:                 69172736
 Reserved:                            0
 PeakReserved:                        0
 LargestFreeBlock:                    0

So that was the advice from the repo author, that you should check your threads to see if they’ve allocated memory already, leaving none for other processes. (top or ps -ef) to see processes running.

After killing jupyter, I left it training overnight, on 300 training images and masks, from our chicken dataset, and it ran out of memory. But it looks like it finished training before it crapped out, and this time, the Out of Memory (OOM) error had some bigger numbers.

2021-06-08 08:15:21.038084: I tensorflow/core/common_runtime/bfc_allocator.cc:1040] total_region_allocated_bytes_: 1400856576 memory_limit_: 1400856576 available bytes: 0 curr_region_allocation_bytes_: 2801713152
2021-06-08 08:15:21.038151: I tensorflow/core/common_runtime/bfc_allocator.cc:1046] Stats:
Limit: 1400856576
InUse: 616462592
MaxInUse: 1400851712
NumAllocs: 37528
MaxAllocSize: 1280887296
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0


And you can see the loss was decreasing.  That's cool.
So that third, ghostly column, is the one we're watching.  I think it's just not very good yet.  But maybe I don't understand what it's doing, exactly, either. I am expecting that when I'm done here, it should be able to make the mask, from just the image. 

The loss functions I’ve used have been,

model.compile(
     optimizer=Adam(), 
     loss='binary_crossentropy',
     metrics=[iou, iou_thresholded]
 )

and

model.compile(
     optimizer=SGD(lr=0.01, momentum=0.99),
     loss=jaccard_distance,
     metrics=[iou, iou_thresholded]
 )

So that was training with the second one, last night. I will continue with it for now. Jaccard distance is, union minus intersection, over union. Sounds good to me. Optimising, using Stochastic Gradient Descent, with some hyperparameters.

 d_J(A,B) = 1 - J(A,B) = { { |A \cup B| - |A \cap B| } \over |A \cup B| }.

Let’s leave it training again. I’m also upping the ratio between training and validation data, from 50/50 to 80/20. why not.

Also, the code we had before, for the first U-Net attempt, in the ‘Chicken Vision.py’ notebook, seemed more memory efficient, because it was lazy loading the images. But maybe much of a muchness. We’ll see, perhaps.

So training isn’t working anymore, it seems.

W tensorflow/core/kernels/gpu_utils.cc:49] 
Failed to allocate memory for convolution redzone checking; skipping this check. This is benign and only means that we won't check cudnn for out-of-bounds reads and writes. This message will only be printed once.

Followed by OOM. Benign.

Stats: 
 Limit:                      1403920384
 InUse:                       650411520
 MaxInUse:                   1403915520
 NumAllocs:                       37625
 MaxAllocSize:               1266649600
 Reserved:                            0
 PeakReserved:                        0
 LargestFreeBlock:                    0

Ok we might need a cloud gpu. Jetson NX not cutting it.

From a while later, after cloud gpus, it is worth noting that there is a weed detection U-Net using two different loss functions, Dice loss, and ‘Focal Tversky loss’, and only has a 19,667 parameter NN. That’s orders of magnitude smaller, so I might want to come back and see how.

Categories
AI/ML CNNs dev Vision

Panoptic segmentation

arxiv paper: https://arxiv.org/pdf/1801.00868.pdf

Panoptic segmentation picks an instance segmentation algorithm and a semantic segmentation algorithm.

Some notable papers are listed here, with the benchmarks of the best related githubs, https://paperswithcode.com/task/panoptic-segmentation

For example,

  • MASK_RCNN algorithm for instance segmentation
  • DeepLabV2 Algorithm for semantic segmentation

The architecture of the neural network has two pyramids, one for semantics (classes), and one to count the instances

After much circular investigation, i arrived at the notion that transfer learning from a pre-trained network, with the ‘fine tuning’ referring to adding a new class, is the way to go.

But we’re still suffering from not having found an example using PNG mask files. I can convert to COCO, and that might be what I do yet, because, like the dataset had their own Panoptic segmentation challenge and format. https://cocodataset.org/#panoptic-eval They seem to be winning this race. We’ll do COCO.

It will mostly involve writing or exporting info into json format, and following some terse, ambiguous instructions.

Another thing is that COCO wants bounding boxes too. So this will be an exercise in config generation to satisfy the COCO format requirements. I have the data from Open images, but COCO looks like the biggest game in town.

Then for algorithm, there’s numerous Pytorch libraries, especially a very relevant one, YOLACT Edge, using a ‘Darknet’ architecture, which is an old “Open Source Neural Networks in C”

Hmm. It’s more instance segmentation than panoptic, but looks like a good compromise, to aim for.

https://github.com/haotian-liu/yolact_edge – It uses bounding boxes, so what will I do with all these chicken masks?

YOLACTEdge arxiv paper

Otherwise, the tensorflow object detection tutorials are here:

https://github.com/tensorflow/models/tree/master/research/object_detection/colab_tutorials

The eager_few_shot_od_training_tflite.ipynb notebook also looks like a winner for showing how to add a new Duck class to a MobileNet architecture. YOLACT Edge has a MobileNet model available too.

I am sitting with a thousand or so JPGs of chickens with corresponding PNG masks, sorted into train/val/test datasets. I was hoping for the Keras UNet segmentation demo to work because I initially thought UNet will be ideal for the egg light camera, but now I’m back to the FAIR detectron2 woods, to find a panoptic segmentation solution.

Let’s try the YOLACT Edge one, because it’s based on YOLO, (You only look once), a single shot object detector algorithm, but which is also more commonly known for ‘You only live once’, an affirmation of often reckless behaviour. YOLACT stands for You Only Look At CoefficienTs. In this case it looks like the state of the art, and it’s been used on the Jetson before, which is promising. At 30 frames per second on the Jetson AGX, we’ll probably be getting 20 or so on the Jetson NX. Since that’s using Torch to TensorRT to speed it up, it seems like we should try it. I was initially averse to using NVIDIA specific software, but we should make the most of this hardware (if we can)

It’s not really panoptic segmentation. But it’s looking Good Enough™ like what we need, rather than what we thought we wanted.

Let’s try these instructions:

https://github.com/haotian-liu/yolact_edge/blob/master/INSTALL.md

We’ll try it on the NX. “Inside” the Docker. What’s our CUDA version?

nvcc --version

10.2

TensorRT should already be installed.

(On Nano, if nvcc not found, check out this link )

git clone https://github.com/NVIDIA-AI-IOT/torch2trt
cd torch2trt
sudo python3 setup.py install --plugins

Here’s from the COCO panoptic readme.

https://cocodataset.org/#format-data RELEVANT EXCERPT FOR….

Panoptic Segmentation

For the panoptic task, each annotation struct is a per-image annotation rather than a per-object annotation. Each per-image annotation has two parts: (1) a PNG that stores the class-agnostic image segmentation and (2) a JSON struct that stores the semantic information for each image segment. In more detail:

  1. To match an annotation with an image, use the image_id field (that is annotation.image_id==image.id).
  2. For each annotation, per-pixel segment ids are stored as a single PNG at annotation.file_name. The PNGs are in a folder with the same name as the JSON, i.e., annotations/name/ for annotations/name.json. Each segment (whether it’s a stuff or thing segment) is assigned a unique id. Unlabeled pixels (void) are assigned a value of 0. Note that when you load the PNG as an RGB image, you will need to compute the ids via ids=R+G*256+B*256^2.
  3. For each annotation, per-segment info is stored in annotation.segments_info. segment_info.id stores the unique id of the segment and is used to retrieve the corresponding mask from the PNG (ids==segment_info.id). category_id gives the semantic category and iscrowd indicates the segment encompasses a group of objects (relevant for thing categories only). The bbox and area fields provide additional info about the segment.
  4. The COCO panoptic task has the same thing categories as the detection task, whereas the stuff categories differ from those in the stuff task (for details see the panoptic evaluation page). Finally, each category struct has two additional fields: isthing that distinguishes stuff and thing categories and color that is useful for consistent visualization.
annotation{

"image_id": int, 
"file_name": str, 
"segments_info": [segment_info],}


segment_info{
             "id": int, 
             "category_id": int, 
             "area": int, 
             "bbox": [x,y,width,height], 
             "iscrowd": 0 or 1,}

categories[{"id": int, 
            "name": str,  
            "supercategory": str, 
            "isthing": 0 or 1, 
            "color": [R,G,B],}]

Ok, we can do this.

Right, so, if anything, we want to transfer learn from a trained neural network. There’s some interesting discussion about implementing your own transfer learning of a coco dataset, in keras-retinanet here, but we’re looking at using Yolact Edge, based on pytorch, so let’s not get distracted. We need to create the COCO dataset. I’ve put this off for so long.

We need the COCO categories that are already trained, and I see there is the 2018 api https://github.com/cocodataset/panopticapi which has the Panoptic challenge coco categories (panoptic_coco_categories.json) and ah ha this is what I have been searching for.

panopticapi/sample_data/panoptic_examples.json

After pretty printing with

python3 -m json.tool panoptic_examples.json

here’s the example, for this bit.

"images": [
{
"license": 2,
"file_name": "000000142238.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000142238.jpg",
"height": 427,
"width": 640,
"date_captured": "2013-11-20 16:47:35",
"flickr_url": "http://farm5.staticflickr.com/4028/5079131149_dde584ed79_z.jpg",
"id": 142238
},
{
"license": 1,
"file_name": "000000439180.jpg",
"coco_url": "http://images.cocodataset.org/val2017/000000439180.jpg",
"height": 360,
"width": 640,
"date_captured": "2013-11-19 01:25:39",
"flickr_url": "http://farm3.staticflickr.com/2831/9275116980_1d9b986e3b_z.jpg",
"id": 439180
}
]

and we’ve got some images

./input_images/000000439180.jpg
./input_images/000000142238.jpg

and their masks.

./panoptic_examples/000000439180.png
./panoptic_examples/000000142238.png

Ah here’s ‘bird’ category.

{
"supercategory": "animal",
"color": [
 165,
 42,
 42
],
"isthing": 1,
"id": 16,
"name": "bird"
},

“Let’s try get some visualisation working”

Ok hold on though. Let’s try get some visualisation working, before anything else. This looks like the ticket. But it is a python file, and running matplotlib, so ideally we’d transform this to a Jupyter Notebook. Ok, just New Notebook, copy paste. Run.

ModuleNotFoundError: No module named 'skimage'



[Big Detour and to the rescue, Datamachines]

Ok we can install it with !pip3 install scikit-image ? No, that fails… what did I do, right, I need to ssh into the Jetson,

chrx@chrx:~$ ssh -L 8888:127.0.0.1:8888 -L 6006:127.0.0.1:6006 chicken@192.168.101.109

Then find the docker ID, and docker exec -it 519ed46162ae bash into it, and goddamnit what, UnicodeDecodeError: ‘ascii’ codec can’t decode byte 0xc3 in position 4029: ordinal not in range(128)

Ok so someone’s already had this happen, and it’s because the locale preferred encoding, needs to be UTF-8. But it’s some obscrure ANSI.

root@jetson:/# python -c 'import locale; print(locale.getpreferredencoding())'
ANSI_X3.4-1968

Someone posted a bunch of steps for the L4T docker folks. That would be us. Do we really need this library ?

It’s to get this function.

from skimage.segmentation import find_boundaries

Yes, ok, it is quite hellish to install skimage. This was how to do it in debian, for skimage up to v. 0.13.1-2

apt install python3-skimage

But on it gets “ImportError: cannot import name ‘_validate_lengths'” which is resolved in 1.14.2

I’ve asked on the forum, and am hoping NVIDIA can solve this one. The skimage docs say:

  1. Linux on 64-bit ARM processors (Nvidia Jetson):

As per the latest comment, (only 3 weeks ago, others were on the trail of similar tasks!), mmartial is pointing to datamachines, which has some Dockerfiles for building OpenCV and Tensorflow, and YOLOv4.

Ok, let’s try what the instructions suggest:

make tensorflow_opencv to build all the tensorflow_opencv container images”

I’ll try the CuDNN version next if this doesn’t work…

Ok…we’re on step 16 of 42… Ooh Python 3.8, that’s an upgrade. Build those wheels, pip3! Doh, Step 24 of 42.

bazel: Exec format error

The command returned a non-zero code: 2

*Whomp whomp* sound

ok let’s try

make cudnn_tensorflow_opencv, no…

I asked on the Issues, and they noticed those are the amd64 builds, not the aarch64 build. I could use their DockerHub pre-build for now.

so after a detour, i am using this Dockerfile successfully to run Jupyter on the NX. We got stuck because skimage was difficult to install, and now we’re back on track, annotating the COCO, and so on.

chicken@jetson:~$ cat Dockerfile

FROM docker.io/datamachines/jetsonnano-cuda_tensorflow_opencv:10.2_2.3_4.5.1-20210218
RUN pip3 install jupyter jupyterlab --verbose
RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"
EXPOSE 6006
EXPOSE 8888
CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
echo "JupterLab logging location: /var/log/jupyter.log (inside the container)" && \
/bin/bash

chicken@jetson:~$ sudo docker build -t nx_setup .

chicken@jetson:~$ sudo docker run -it -p 8888:8888 -p 6006:6006 --rm --runtime nvidia --network host -v /home/chicken/:/dmc nx_setup

So, where were we?

Right. Panoptic API, we wanted to run visualize.py, first, so we could check progress. But it needed skimage installed. Haha. Ok, one week later… let’s try see the example.

Phew, ok. Getting back on track. So now we want to train it on the chickens.

So, COCO.

“Back to COCO”

As someone teaching myself about this, I know what I ideally want is to transfer learn from a trained network. But it isn’t obvious how. I apparently need to chop off the last layer of a trained network, freeze most of the network, and then retrain the last bit.

Well, back to this soon…

So,

Here we have a suggestion from dbolya author of YOLACT and YOLACT++, the original.


try: self.load_state_dict(state_dict) except RuntimeError as e: print('Ignoring "' + str(e) + '"')
and then resume training from yolact_im700_54_80000.pth:
python train.py --config=<your_config> --resume=weights/yolact_im700_54_800000.pth --start_iter=0
When there are size mismatches between tensors, Pytorch will spit out an error message but also keep on loading the rest of the tensors anyway. So here we just attempt to load a checkpoint with the wrong number of classes, eat the errors the Pytorch complains about, and then start training from iteration 0 with just those couple of tensors being untrained. You should see only the C (class) and S (semantic segmentation) losses reset.
You probably also want to modify the learning rate, decay schedule, and number of iterations in your config to account for fine-tuning.

And an allusion to an example of its use, perhaps. And more clues about how to fine tune the ‘network head’.

You can do this by following the fine tuning procedure (#36) and then here:

yolact/yolact.py

Line 628 in f54b0a5

p = pred_layer(pred_x)

replace that with

p = pred_layer(pred_x)
p = pred_layer(pred_x.detach())

Ok… so here’s the YOLACT diagram:

A command to run the training.

Categories
AI/ML Vision

Confusion Matrix

Came across this metric for keeping track of:

True Positives

False Positives

True Negatives

False Negatives.

Might be useful for judging supervised learning of vision apps.

Confusion Matrix : https://en.wikipedia.org/wiki/Confusion_matrix

Categories
AI/ML envs Vision

NVIDIA Tests

Ok here we go. I already installed gstreamer. Was still getting some h264 encoding plugin missing error, so I needed to add this:

apt-get install gstreamer1.0-libav

Then on my Ubuntu laptop (192.168.0.103), I run:

gst-launch-1.0 -v udpsrc port=1234 caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" ! rtph264depay ! decodebin ! videoconvert ! autovideosink

And then to get webcam streaming from the Jetson,

video-viewer /dev/video0 rtp://192.168.0.103:1234

and similarly,

detectnet /dev/video0 rtp://192.168.0.103:1234

and

segnet /dev/video0 rtp://192.168.0.103:1234

It is impractical to run VNC because of the tiny resolution, and ssh -X tunnelling because that requires the host to have whatever drivers are used on the jetson. GStreamer is working well though.

Cool.

I’m trying to run a python program on the feed. Ended up finding same issue elsewhere. RTP output not working. Bumped the thread.

Someone worked it out. Needed a do/while loop instead of a for loop.

It’s been a couple of months since I was here, and it’s now time to load up this gstreamer code again, and see if I can stream a webcam feed from the robot, and we evaluate inference on the trained CNN, and we colour in the different classes in pretty colours, and

So, TODO:

  1. Get panoptic segmentation colour mapping combining the output layers of the CNN. So something like this. We get a live webcam feed, and we run the frames through the semantic segmentation CNN, and combine the binary masks into a multiclass mask.
  2. Then gstream these rainbow multiclass masked images to a listening gstreamer video-viewer program

So, practically though, I need to set up the Jetson again, and get it compiling the trained h5 file, and using webcam stills as input. Then gstream the result. Ok.

And fix power cable for Jetson step 1. Then try set up webcam gstreaming.

Ok, it’s plugged in, nmap found it. Let’s log in… Ok, run gstreamer, and then i’ve got a folder, jetson-inference, and I’m volume mapping it.

./docker/run.sh --volume /home/chicken/jetson-inference:/jetson-inference
import jetson.inference
import jetson.utils

net = jetson.inference.detectNet("ssd-mobilenet-v2", threshold=0.5)

camera = jetson.utils.videoSource("/dev/video0")
display = jetson.utils.videoOutput("rtp://192.168.101.127:1234","--headless") # 'my_video.mp4' for file

while True:
    img = camera.Capture()
    detections = net.Detect(img)
    display.Render(img)
    display.SetStatus("Object Detection | Network {:.0f} FPS".format(net.GetNetworkFPS()))
    if not camera.IsStreaming() or not display.IsStreaming():
        break

I am in this dusty jetson-inference docker.

pip3 list

appdirs (1.4.4)
boto3 (1.16.58)
botocore (1.19.58)
Cython (0.29.21)
dataclasses (0.8)
decorator (4.4.2)
future (0.18.2)
jmespath (0.10.0)
Mako (1.1.3)
MarkupSafe (1.1.1)
numpy (1.19.4)
pandas (1.1.5)
Pillow (8.0.1)
pip (9.0.1)
pycuda (2020.1)
python-dateutil (2.8.1)
pytools (2020.4.3)
pytz (2020.5)
s3transfer (0.3.4)
setuptools (51.0.0)
six (1.15.0)
torch (1.6.0)
torchaudio (0.6.0a0+f17ae39)
torchvision (0.7.0a0+78ed10c)
urllib3 (1.26.2)
wheel (0.36.1)

and my nice docker with TF2 and everything installed already says:

root@jetson:/dmc/jetson-inference/build/aarch64/bin# ./my-detection.py
Traceback (most recent call last):
File "./my-detection.py", line 24, in
import jetson.inference
ModuleNotFoundError: No module named 'jetson'

Ok let’s try install jetson from source. First, tried the ‘Quick Reference’ instructions… Errored at

/dmc/jetson-inference/c/depthNet.h(190): error: identifier "COLORMAP_VIRIDIS_INVERTED" is undefined

/dmc/jetson-inference/c/depthNet.h(180): error: identifier "COLORMAP_VIRIDIS_INVERTED" is undefined

Next, ran a command mentioned lower down,

git submodule update --init

Now make -j$(nproc) gets to

/usr/bin/ld: cannot find -lnvcaffe_parser

And this reply suggests using sed to fix this…

sed -i ‘s/nvcaffe_parser/nvparsers/g’ CMakeLists.txt 

Ok…Built! And…

[gstreamer] gstCamera -- attempting to create device v4l2:///dev/video0
[gstreamer] gstCamera -- didn't discover any v4l2 devices
[gstreamer] gstCamera -- device discovery and auto-negotiation failed
[gstreamer] gstCamera -- failed to create device v4l2:///dev/video0
Traceback (most recent call last):
File "my-detection.py", line 31, in
camera = jetson.utils.videoSource("/dev/video0")
Exception: jetson.utils -- failed to create videoSource device

Ok so docker/run.sh also has some other stuff going on, looking for V4L2 devices and such. Ok added device, and it’s working in my nice docker!

sudo docker run -it -p 8888:8888 -p 6006:6006 -p 8265:8265 --rm --runtime nvidia --network host --device /dev/video0 -v /home/chicken/:/dmc nx_setup

So what now… We open ‘/dev/video0’ for V4L2, then replace my inference code below.

img = camera.Capture()     
detections = net.Detect(img)     
display.Render(img)


Ok ... zoom ahead and it's all worked out...
---
After going on the ringer another time: zooming ahead, fixing that jetson issue, involves 
cd jetson-inference/build
make install
ldconfig


---

Saved the sender program at the github 

To get video to the screen

gst-launch-1.0 -v udpsrc port=1234  caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" !  rtph264depay ! decodebin ! videoconvert ! autovideosink

or to a mp4

gst-launch-1.0 -v udpsrc port=1234  caps = "application/x-rtp, media=(string)video, clock-rate=(int)90000, encoding-name=(string)H264, payload=(int)96" !  rtph264depay  ! decodebin ! x264enc ! qtmux ! filesink location=test.mp4 -e
Categories
Hardware hardware_ Locomotion robots

Rotary Encoders and motors

We should check out rotary encoders and motors, while I’m in RSA. I’ve started looking at them now. There’s a wealth of info at a company that makes them, Dynapar.

Here’s the good stuff. Coupling: “Rotary encoders come in 3 major mounting styles: hollow-shaft (hollow-bore or through shaft), hub-shaft (hub-bore) or shafted. Hollow-shaft and hub-shaft rotary encoders mount directly to a motor shaft typically using a tether. Shafted rotary encoders mount using a flexible coupling.”

I asked a South African company about their Italian ‘Eltra‘ rotary encoders yesterday. Then Feetech from China asked this morning, if I needed servos. Well actually yes. I asked about their encoder motors, as they might be called, as a combination. They had 4.8V, 6V, and 7.4V such motors, whose encoders were of the magnetic measuring type, and give 12 bits of precision (2^12 = 4096, so 0 to 4095) as serial data.

The MG996Rs used for the basic table robot prototype were possibly adequate, but definitely on the hobby side. They felt more appropriate for a robot about half the size of the prototype. The prototype was successfully sat upon, by the chicken.

I started thinking about practicalities of powering a jetson nx (The devkit supports between 9-20V ) or jetson nano (5V) ot rpi (5V). and numerous motors. So probably uses a battery in the 9-20V range. Motor too. Probably 14.4V is what Lithium ion or similar batteries come as. Usually need to buy lithium ion batteries in the place where you’re going, rather than bring them on a plane. But you can always try. (Not a financial advisor (or otherwise)). They are probably still mostly imported for RC models (remote control, remember?).

But also wow, on the topic of rotary encoders and servos, this guy made a sub-millimeter accuracy by a 6 d.o.f. (degrees of freedom) robot arm

Used his own interesting and as yet mysterious encoder, which he put inside a motor. Says his robot arm cost EU 300. It’s super impressive.

The other option with power appears to be cool ‘just what you needed’ type products like DFRobot’s FIT0186, which has a built-in encoder! I ended up buying 4 x FIT0186s and 4 x FIT0185s (12V, 251RPM and 83RPM respectively).

I used an Arduino Nano, a DFRobot Dual motor driver, based on the TB6612 chip, plugging in the Nano for 5V, and 12V into the driver, to power the motors.

But after much wiring up, the motors work fine, but the encoders do not. I fixed up their example code to use volatile variables, and run the minimal amount of code in the interrupt service routines (ISRs), and even used two interrupts for a single motor, one for each hall sensor output, and put a 0.1mF capacitor between signal and ground. Tried CHANGE/RISING/FALLING triggers. Tried different motors.

I’ve looked at every forum post, and tried everything, and now I’ve posted on DFRobot’s forum, in case someone has got the encoder working. Not looking promising.

So back to the MG996Rs, I find that I ordered two batches from Mantech, supposedly the same genuine product from Tower Pro, and yet the one batch does not work properly with the PCA9685, uses a lower minimum pulse, and moves in the opposite direction to the other ones when using the same code. The duds keep creeping around, after you turn off the throttle.

Ok so I’m going to try with the one batch of MG996Rs that do seem to work, but we’ll need to make a new plan. Possibly, rigging an actual rotary encoder up, on the motor shaft. But for now, back to MG996Rs and the drawing board. We left the old robotable in Switzerland last year. So I made a new one with the remaining MG996Rs.

So I was able to get them fairly well calibrated by setting the min and max pulse, as that seems to be how servos work. They have a center PWM pulse, around 1500, where they are still, and then as you decrease the pulses per second, it goes one way, and as you increase the pulses per second, it goes the other way.

Unfortunately after much fuss, these continuous rotation servos turned out to be duds. At least for the way I was trying to use them.

Set the throttle to 1, sleep for a second. Set the throttle to zero, sleep for a second. Set the throttle to -1, sleep for a second. …And you have a new position. Unfortunately the positional control is unusable. I guess these are for wheels or an application where position is not so important.

Categories
Hardware hardware_ Locomotion sim2real

Finding where we left off

Months have passed. It is 29 December 2020 now, and Miranda and I are at Bitwäscherei, in Zurich, during the remoteC3 (Chaos Computer Club)

We have a few things to get back to:

  • Locomotion
    • Sim2Real
  • Vision
    • Object detection / segmentation

We plan to work on a chicken cam, but are waiting for the raspberry pi camera to arrive. So first, locomotion.

Miranda would like to add more leg segments, but process-wise, we need the software and hardware working together with basics, before adding more parts.

Sentient Table prototype

I’ve tested the GPIO, and it’s working, after adjusting the minimum pulse value for the MG996R servo.

So, where did we leave off, before KonS MFRU / ICAF ?

Locomotion:

There was a walking table in simulation, and the progress was saving so that we could reload the state, in Ray and Tune.

I remember the best walker had a ‘reward’ of 902, so I searched for 902

grep -R ‘episode_reward_mean\”\: 902’

And found these files:

 409982 Aug 1 07:52 events.out.tfevents.1596233543.chrx
334 Jul 31 22:12 params.json
304 Jul 31 22:12 params.pkl
132621 Aug 1 07:52 progress.csv
1542332 Aug 1 07:52 result.json

and there are checkpoint directories, with binary files.

So what are these files? How do I extract actions?

Well it looks like this info keeps track of Ray/Tune progress. If we want logs, we seem to need to make them ourselves. The original minitaur code used google protobuf to log state. So I set the parameter to log to a directory.

log_path="/media/chrx/0FEC49A4317DA4DA/logs"

So now when I run it again, it makes a file in the format below:


message RobotableEpisode {
// The state-action pair at each step of the log.
repeated RobotableStateAction state_action = 1;
}

message RobotableMotorState {
// The current angle of the motor.
double angle = 1;
// The current velocity of the motor.
double velocity = 2;
// The current torque exerted at this motor.
double torque = 3;
// The action directed to this motor. The action is the desired motor angle.
double action = 4;
}

message RobotableStateAction {
// Whether the state/action information is valid. It is always true if the
// proto is from simulation. It might be false when communication error
// happens on robotable hardware.
bool info_valid = 6;
// The time stamp of this step. It is computed since the reset of the
// environment.
google.protobuf.Timestamp time = 1;
// The position of the base of the minitaur.
robotics.messages.Vector3d base_position = 2;
// The orientation of the base of the minitaur. It is represented as (roll,
// pitch, yaw).
robotics.messages.Vector3d base_orientation = 3;
// The angular velocity of the base of the minitaur. It is the time derivative
// of (roll, pitch, yaw).
robotics.messages.Vector3d base_angular_vel = 4;
// The motor states (angle, velocity, torque, action) of eight motors.
repeated RobotableMotorState motor_states = 5;
}

I’m pretty much only interested in that last line,

repeated RobotableMotorState motor_states = 5;

So that’s the task, to decode the protobuf objects.

import os
import inspect

currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)

import argparse
from gym_robotable.envs import logging

if __name__ == "__main__":

    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/logs/robotable_log_2020-12-29-191602')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode = logging.restore_episode(args.log_file)
    print(dir (episode))
    print("episode=",episode)
    fields = episode.ListFields()

    for field in fields:
        print(field)

This prints out some json-like info. On the right path.

time {
seconds: 5
}
base_position {
x: 0.000978083148860855
y: 1.7430418253385236
z: -0.0007063670972221042
}
base_orientation {
x: -0.026604138524100478
y: 0.00973575985636693
z: -0.08143286338936992
}
base_angular_vel {
x: 0.172553297157456
y: -0.011541306494121786
z: -0.010542314686643973
}
motor_states {
angle: -0.12088901254000686
velocity: -0.868766524998517
torque: 3.3721667267908284
action: 3.6539504528045654
}
motor_states {
angle: 0.04232669165311699
velocity: 1.5488756496627718
torque: 3.4934419908437704
action: 1.4116498231887817
}
motor_states {
angle: 0.8409251448232009
velocity: -1.617737108768752
torque: -3.3539541961507124
action: -3.7024881839752197
}
motor_states {
angle: 0.13926660037454777
velocity: -0.9575437158301312
torque: 3.563701581854714
action: 1.104300618171692
}
info_valid: true
])

We will have to experiment to work out how to translate this data to something usable. The servos are controlled with throttle, from -1 to 1.

Technically I should probably rewrite the simulation to output this “throttle” value. But let’s work with what we have, for now.

My first attempt will be to extract and normalize the torque values to get a sequence of actions… nope.

Ok so plan B. Some visualisation of the values should help. I have it outputting the JSON now.

episode_proto = logging.restore_episode(args.log_file)

jsonObj = MessageToJson(episode_proto)

print(jsonObj)

I decided to use StreamLit, which is integrated with various plotting libraries. After looking at the different plotting options, Plotly seems the most advanced.

Ok so in JSON,

{
  "time": "1970-01-01T00:00:05Z",
  "basePosition": {
    "x": 0.000978083148860855,
    "y": 1.7430418253385236,
    "z": -0.0007063670972221042
  },
  "baseOrientation": {
    "x": -0.026604138524100478,
    "y": 0.00973575985636693,
    "z": -0.08143286338936992
  },
  "baseAngularVel": {
    "x": 0.172553297157456,
    "y": -0.011541306494121786,
    "z": -0.010542314686643973
  },
  "motorStates": [
    {
      "angle": -0.12088901254000686,
      "velocity": -0.868766524998517,
      "torque": 3.3721667267908284,
      "action": 3.6539504528045654
    },
    {
      "angle": 0.04232669165311699,
      "velocity": 1.5488756496627718,
      "torque": 3.4934419908437704,
      "action": 1.4116498231887817
    },
    {
      "angle": 0.8409251448232009,
      "velocity": -1.617737108768752,
      "torque": -3.3539541961507124,
      "action": -3.7024881839752197
    },
    {
      "angle": 0.13926660037454777,
      "velocity": -0.9575437158301312,
      "torque": 3.563701581854714,
      "action": 1.104300618171692
    }
  ],
  "infoValid": true
}

Plotly uses Panda dataframes, which is tabular data. 2 dimensions. So I need to transform this to something usable.

Something like time on the x-axis

and angle / velocity / torque / action on the y axis.

Ok so how to do this…?

Well I’ve almost got it, but I mostly had to give up on StreamLit’s native line_chart for now. Plotly’s has line chart code that can handle multiple variables. So I’m getting sidetracked by this bug:

When I import plotly’s library,

import plotly.graph_objects as go

“No module named ‘plotly.graph_objects’; ‘plotly’ is not a package”

https://stackoverflow.com/questions/57105747/modulenotfounderror-no-module-named-plotly-graph-objects

import plotly.graph_objects as go ? no…

from plotly import graph_objs as go ? no…

from plotly import graph_objects as go ? no…

hmm.

pip3 install plotly==4.14.1 ?

Requirement already satisfied: plotly==4.14.1 in /usr/local/lib/python3.6/dist-packages (4.14.1)

no… why are the docs wrong then?

Ah ha.

“This is the well known name shadowing trap.” – stackoverflow

I named my file plotly.py – that is the issue.

So, ok run it again… (streamlit run plot.py) and open localhost:8501…

Now,

Running as root without --no-sandbox is not supported. See https://crbug.com/638180

Ah ha. I went back to StreamLit notation and it worked.

#fig.show()
st.plotly_chart(fig)

Ok excellent, so here is my first round of code:

import pandas as pd
import numpy as np
import streamlit as st
import time
from plotly import graph_objects as go
import os
import inspect
from google.protobuf.json_format import MessageToJson
import argparse
from gym_robotable.envs import logging


currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
parentdir = os.path.dirname(os.path.dirname(currentdir))
os.sys.path.insert(0, parentdir)


if __name__ == "__main__":


    st.title('Analyticz')


    parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/logs/robotable_log_2020-12-29-191602')
    args = parser.parse_args()
    logging = logging.RobotableLogging()
    episode_proto = logging.restore_episode(args.log_file)


    times = []
    angles = [[]]*4           # < bugs!
    velocities = [[]]*4
    torques = [[]]*4
    actions = [[]]*4

    for step in range(len(episode_proto.state_action)):

       step_log = episode_proto.state_action[step]
       times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))

       for i in range(4):
           angles[i].append(step_log.motor_states[i].angle)
           velocities[i].append(step_log.motor_states[i].velocity)
           torques[i].append(step_log.motor_states[i].torque)
           actions[i].append(step_log.motor_states[i].action)
 
    print(angles)
    print(times)
    print(len(angles))
    print(len(velocities))
    print(len(torques))
    print(len(actions))
    print(len(times))

    # Create traces
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=times, y=angles[0],
	            mode='lines',
	            name='Angles'))
    fig.add_trace(go.Scatter(x=times, y=velocities[0],
	            mode='lines+markers',
	            name='Velocities'))
    fig.add_trace(go.Scatter(x=times, y=torques[0],
	            mode='markers', 
                    name='Torques'))
    fig.add_trace(go.Scatter(x=times, y=actions[0],
	            mode='markers', 
                    name='Actions'))

    st.plotly_chart(fig)

And it’s plotting data for one leg.

If this is just 5 seconds of simulation, then velocities looks like it might be the closest match. You can imagine it going up a bit, back a bit, then a big step forward.

So, one idea is to do symbolic regression, to approximate the trigonometry equations for quadrupedal walking, (or just google them), and generalise to a walking algorithm, to use for locomotion. I could use genetic programming, like at university (https://gplearn.readthedocs.io/en/stable/examples.html#symbolic-regressor). But that’s overkill and probably won’t work. Gotta smooth the graph incrementally. Normalize it.

Let’s see what happens next, visually, after 5 seconds of data, and then view the same, for other legs.

Ok there is 30 seconds of walking.

The tools I wrote for the walker, are run with ‘python3 play_tune.py –replay 1’. It looks for the best checkpoint and replays it from there.

But now I seem to be getting the same graph for different legs. What? We’re going to have to investigate.

Ok turns out [[]]*4 is the wrong way to initialise arrays in python. It makes all sublists the same. Here’s the correct way:

velocities = [[] for i in range(4)]

Now I have 4 different legs.

The graph is very spiky, so I’ve added a rolling window average, and normalised it between -1 and 1 since that’s what the servo throttle allows.

I am thinking that maybe because the range between min and max for the 4 legs are:

3.1648572819886085
1.7581604444983845
5.4736002843351805
1.986915632875287

The rear legs aren’t moving as much, so maybe it doesn’t make sense to normalize them all to [-1, 1] all on the same scale. Like maybe the back right leg that moves so much should be normalized to [-1, 1] and then all the other legs are scaled down proportionally. Anyway, let’s see. Good enough for now.

In the code, the motors order is:

front right, front left, back right, back left.

Ok so to save the outputs…

import pandas as pd
 import numpy as np
 import streamlit as st
 import time
 from plotly import graph_objects as go
 import os
 import inspect
 from google.protobuf.json_format import MessageToJson
 import argparse
 from gym_robotable.envs import logging
 import plotly.express as px
 currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
 parentdir = os.path.dirname(os.path.dirname(currentdir))
 os.sys.path.insert(0, parentdir)
 def normalize_negative_one(img):
     normalized_input = (img - np.amin(img)) / (np.amax(img) - np.amin(img))
     return 2*normalized_input - 1
 if name == "main":
     st.title('Analyticz')
     parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
     parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/walkinglogs/robotable_log_2021-01-17-231240')
     args = parser.parse_args()
     logging = logging.RobotableLogging()
     episode_proto = logging.restore_episode(args.log_file)
     times = []
     velocities = [[] for i in range(4)]
     for step in range(len(episode_proto.state_action)):
        step_log = episode_proto.state_action[step]
        times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))
        for i in range(4):
            velocities[i].append(step_log.motor_states[i].velocity)
     #truncate because a bunch of trailing zeros
     velocities[0] = velocities[0][0:3000]
     velocities[1] = velocities[1][0:3000]
     velocities[2] = velocities[2][0:3000]
     velocities[3] = velocities[3][0:3000]
     times = times[0:3000]
     #get moving averages
     window_size_0=40
     numbers_series_0 = pd.Series(velocities[0])
     windows_0 = numbers_series_0.rolling(window_size_0)
     moving_averages_0 = windows_0.mean()
     moving_averages_list_0 = moving_averages_0.tolist()
     without_nans_0 = moving_averages_list_0[window_size_0 - 1:]
     window_size_1=40
     numbers_series_1 = pd.Series(velocities[1])
     windows_1 = numbers_series_1.rolling(window_size_1)
     moving_averages_1 = windows_1.mean()
     moving_averages_list_1 = moving_averages_1.tolist()
     without_nans_1 = moving_averages_list_1[window_size_1 - 1:]
     window_size_2=40
     numbers_series_2 = pd.Series(velocities[2])
     windows_2 = numbers_series_2.rolling(window_size_2)
     moving_averages_2 = windows_2.mean()
     moving_averages_list_2 = moving_averages_2.tolist()
     without_nans_2 = moving_averages_list_2[window_size_2 - 1:]
     window_size_3=40
     numbers_series_3 = pd.Series(velocities[3])
     windows_3 = numbers_series_3.rolling(window_size_3)
     moving_averages_3 = windows_3.mean()
     moving_averages_list_3 = moving_averages_3.tolist()
     without_nans_3 = moving_averages_list_3[window_size_3 - 1:]
     #normalize between -1 and 1
     avg_0 = np.asarray(without_nans_0)
     avg_1 = np.asarray(without_nans_1)
     avg_2 = np.asarray(without_nans_2)
     avg_3 = np.asarray(without_nans_3)
     avg_0 = normalize_negative_one(avg_0)
     avg_1 = normalize_negative_one(avg_1)
     avg_2 = normalize_negative_one(avg_2)
     avg_3 = normalize_negative_one(avg_3)
     np.save('velocity_front_right', avg_0)
     np.save('velocity_front_left', avg_1)
     np.save('velocity_back_right', avg_2)
     np.save('velocity_back_left', avg_3)
     np.save('times', times)
     # Create traces
     fig0 = go.Figure()
     fig0.add_trace(go.Scatter(x=times, y=velocities[0],
                 mode='lines',
                 name='Velocities 0'))
     fig0.add_trace(go.Scatter(x=times, y=avg_0.tolist(),
                 mode='lines',
                 name='Norm Moving Average 0'))
     st.plotly_chart(fig0)
     fig1 = go.Figure()
     fig1.add_trace(go.Scatter(x=times, y=velocities[1],
                 mode='lines',
                 name='Velocities 1'))
     fig1.add_trace(go.Scatter(x=times, y=avg_1.tolist(),
                 mode='lines',
                 name='Norm Moving Average 1'))
     st.plotly_chart(fig1)
     fig2 = go.Figure()
     fig2.add_trace(go.Scatter(x=times, y=velocities[2],
                 mode='lines',
                 name='Velocities 2'))
     fig2.add_trace(go.Scatter(x=times, y=avg_2.tolist(),
                 mode='lines',
                 name='Norm Moving Average 2'))
     st.plotly_chart(fig2)
     fig3 = go.Figure()
     fig3.add_trace(go.Scatter(x=times, y=velocities[3],
                 mode='lines',
                 name='Velocities 3'))
     fig3.add_trace(go.Scatter(x=times, y=avg_3.tolist(),
                 mode='lines',
                 name='Norm Moving Average 3'))
     st.plotly_chart(fig3)

(Excuse the formatting.) Then I’m loading those npy files and iterating them to the motors.

import time
 import numpy as np
 from board import SCL, SDA
 import busio
 from adafruit_pca9685 import PCA9685
 from adafruit_motor import servo
 i2c = busio.I2C(SCL, SDA)
 pca = PCA9685(i2c, reference_clock_speed=25630710)
 pca.frequency = 50
 servo0 = servo.ContinuousServo(pca.channels[0], min_pulse=685, max_pulse=2280)
 servo1 = servo.ContinuousServo(pca.channels[1], min_pulse=810, max_pulse=2095)
 servo2 = servo.ContinuousServo(pca.channels[2], min_pulse=700, max_pulse=2140)
 servo3 = servo.ContinuousServo(pca.channels[3], min_pulse=705, max_pulse=2105)
 velocity_front_right = np.load('velocity_front_right.npy')
 velocity_front_left = np.load('velocity_front_left.npy')
 velocity_back_right = np.load('velocity_back_right.npy')
 velocity_back_left = np.load('velocity_back_left.npy')
 times = np.load('times.npy')
 reverse left motors
 velocity_front_left = -velocity_front_left
 velocity_back_left = -velocity_back_left
 print (velocity_front_right.size)
 print (velocity_front_left.size)
 print (velocity_back_right.size)
 print (velocity_back_left.size)
 print (times.size)
 for time in times:
    print(time)
 for a,b,c,d in np.nditer([velocity_front_right, velocity_front_left, velocity_back_right, velocity_back_left]):
     servo0.throttle = a/4
     servo1.throttle = b/4
     servo2.throttle = c/4
     servo3.throttle = d/4
    print (a, b, c, d)
 servo0.throttle = 0
 servo1.throttle = 0
 servo2.throttle = 0
 servo3.throttle = 0
 pca.deinit()

Honestly it doesn’t look terrible, but these MG996R continuous rotation servos are officially garbage.

While we wait for new servos to arrive, I’m testing on SG90s. I’ve renormalized about 90 degrees

def normalize_0_180(img):
     normalized_0_180 = (180*(img - np.min(img))/np.ptp(img)).astype(int)   
     return normalized_0_180

That was also a bit much variation, and since we’ve got 180 degree servos now, but it still looks a bit off, I halved the variance in the test.

 velocity_front_right = ((velocity_front_right - 90)/2)+90
 velocity_front_left = ((velocity_front_left - 90)/2)+90
 velocity_back_right = ((velocity_back_right - 90)/2)+90
 velocity_back_left = ((velocity_back_left - 90)/2)+90

# reverse left motors
 velocity_front_left = 180-velocity_front_left
 velocity_back_left = 180-velocity_back_left

lol. ok. Sim2Real.

So, it’s not terrible, but we’re not quite there either. Also i think it’s walking backwards.

I am not sure the math is correct.

I changed the smoothing code to use this code which smoothes based on the preceding plot.

def anchor(signal, weight):
     buffer = []
     last = signal[0]
     for i in signal:
         smoothed_val = last * weight + (1 - weight) * i
         buffer.append(smoothed_val)
         last = smoothed_val
     return buffer
Derp.

OK i realised I was wrong all along. Two things.

First, I just didn’t see that the angles values were on that original graph. They were so small. Of course we’re supposed to use the angles, rather than the velocities, for 180 degree servos.

Second problem was, I was normalizing from min to max of the graph. Of course it should be -PI/2 to PI/2, since the simulator works with radians, obviously. Well anyway, hindsight is 20/20. Now we have a fairly accurate sim2real. I use the anchor code above twice, to get a really smooth line.

Here’s the final code.

import pandas as pd
 import numpy as np
 import streamlit as st
 import time
 from plotly import graph_objects as go
 import os
 import inspect
 from google.protobuf.json_format import MessageToJson
 import argparse
 from gym_robotable.envs import logging
 import plotly.express as px
 currentdir = os.path.dirname(os.path.abspath(inspect.getfile(inspect.currentframe())))
 parentdir = os.path.dirname(os.path.dirname(currentdir))
 os.sys.path.insert(0, parentdir)
 def anchor(signal, weight):
     buffer = []
     last = signal[0]
     for i in signal:
         smoothed_val = last * weight + (1 - weight) * i
         buffer.append(smoothed_val)
         last = smoothed_val
     return buffer
 assume radians
 def normalize_0_180(img):
     normalized_0_180 = np.array(img)*57.2958 + 90
     return normalized_0_180
 if name == "main":
     st.title('Analyticz')
     parser = argparse.ArgumentParser(formatter_class=argparse.ArgumentDefaultsHelpFormatter)
     parser.add_argument('--log_file', help='path to protobuf file', default='/media/chrx/0FEC49A4317DA4DA/walkinglogs/robotable_log_2021-01-17-231240')
     args = parser.parse_args()
     logging = logging.RobotableLogging()
     episode_proto = logging.restore_episode(args.log_file)
     times = []
     angles = [[] for i in range(4)]
     for step in range(len(episode_proto.state_action)):
        step_log = episode_proto.state_action[step]
        times.append(str(step_log.time.seconds) + '.' + str(step_log.time.nanos))
        for i in range(4):
            print (step)
            print (step_log.motor_states[i].angle)
            angles[i].append(step_log.motor_states[i].angle)
     #truncate because a bunch of trailing zeros
     angles[0] = angles[0][0:3000]
     angles[1] = angles[1][0:3000]
     angles[2] = angles[2][0:3000]
     angles[3] = angles[3][0:3000]
     avg_0 = normalize_0_180(angles[0])
     avg_1 = normalize_0_180(angles[1])
     avg_2 = normalize_0_180(angles[2])
     avg_3 = normalize_0_180(angles[3])
     avg_0 = anchor(avg_0, 0.8)
     avg_1 = anchor(avg_1, 0.8)
     avg_2 = anchor(avg_2, 0.8)
     avg_3 = anchor(avg_3, 0.8)
     avg_0 = anchor(avg_0, 0.8)
     avg_1 = anchor(avg_1, 0.8)
     avg_2 = anchor(avg_2, 0.8)
     avg_3 = anchor(avg_3, 0.8)
     avg_0 = anchor(avg_0, 0.8)
     avg_1 = anchor(avg_1, 0.8)
     avg_2 = anchor(avg_2, 0.8)
     avg_3 = anchor(avg_3, 0.8)
     np.save('angle_front_right_180', avg_0)
     np.save('angle_front_left_180', avg_1)
     np.save('angle_back_right_180', avg_2)
     np.save('angle_back_left_180', avg_3)
     # Create traces
     fig0 = go.Figure()
     fig0.add_trace(go.Scatter(x=times, y=angles[0],
                 mode='lines',
                 name='Angles 0'))
     fig0.add_trace(go.Scatter(x=times, y=avg_0,
                  mode='lines',
                  name='Norm Moving Average 0'))
     st.plotly_chart(fig0)
     fig1 = go.Figure()
     fig1.add_trace(go.Scatter(x=times, y=angles[1],
                 mode='lines',
                 name='Angles 1'))
     fig1.add_trace(go.Scatter(x=times, y=avg_1,
                  mode='lines',
                  name='Norm Moving Average 1'))
     st.plotly_chart(fig1)
     fig2 = go.Figure()
     fig2.add_trace(go.Scatter(x=times, y=angles[2],
                 mode='lines',
                 name='Angles 2'))
     fig2.add_trace(go.Scatter(x=times, y=avg_2,
                  mode='lines',
                  name='Norm Moving Average 2'))
     st.plotly_chart(fig2)
     fig3 = go.Figure()
     fig3.add_trace(go.Scatter(x=times, y=angles[3],
                 mode='lines',
                 name='Angles 3'))
     fig3.add_trace(go.Scatter(x=times, y=avg_3,
                  mode='lines',
                  name='Norm Moving Average 3'))
     st.plotly_chart(fig3)

OK.

So there’s a milestone that took way too long. We’ve got Sim 2 Real working, ostensibly.

After some fortuitous googling, I found the Spot Micro, or, Spot Mini Mini project. The Spot Micro guys still have a big focus on inverse kinematics, which I’m trying to avoid for as long as I can.

They’ve done a very similar locomotion project using pyBullet, and I was able to find a useful paper, in the inspiration section, alerting me to kMPs.

Kinematic Motion Primitives. It’s a similar idea to what I did above.

Instead, what these guys did was to take a single wave of their leg data, and repeat that, and compare that to a standardized phase. (More or less). Makes sense. Looks a bit complicated to work out the phase of the wave in my case.

I’ll make a new topic, and try to extract kMPs from the data, for the next round of locomotion sim2real. I will probably also train the robot for longer, to try evolve a gait that isn’t so silly.

Categories
sexing

RETINAL U-NET

leigaoyi

https://github.com/leigaoyi/retinal-Dense-Unet