For the MFRU exhibition, we presented a variety of robots. The following is some documentation on the specifications and setup instructions. We are leaving the robots with konS.
All Robots
Li-Po batteries need to be stored at 3.8V per cell. For the exhibition, they can be charged to 4.15V per cell, and run with a battery level monitor until they display 3.7V, at which point they should be swapped out. Future iterations of robotic projects will make use of splitter cables to allow hot-swapping batteries, for zero downtime.
We are leaving our ISDT D2 Mark 2 charger for maintaining and charging the Li-Po batteries.
At setup time, in a new location, the Raspberry Pi SD cards need to be updated to connect to the new Wi-Fi network. The simplest method is to place the SD card in a laptop and copy two files onto the boot partition: a wpa_supplicant.conf with the credentials and locale changed for the new network (see below), and a blank file called ssh, to allow remote login.
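The wpa_supplicant.conf is the standard Raspberry Pi OS headless one, something like this (the country code, ssid and psk here are placeholders to be replaced):

country=SI
ctrl_interface=DIR=/var/run/wpa_supplicant GROUP=netdev
update_config=1

network={
    ssid="NEW_NETWORK_NAME"
    psk="NEW_NETWORK_PASSWORD"
}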
Then, after starting up with the updated SD card, the robot IP addresses need to be determined, typically by scanning the subnet with `nmap -sP 192.168.xxx.0/24` (or a Windows client like Zenmap).
Usernames and passwords used are:
LiDARbot – pi/raspberry
Vacuumbot – pi/raspberry and chicken/chicken
Pinkbot – pi/raspberry
Gripperbot – pi/raspberry
Birdbot – daniel/daniel
Nipplebot – just arduino
Lightswitchbot – just arduino and analog timer
For now, it is advised to shut down robots by connecting to their IP address, typing sudo shutdown -H now, and waiting for the lights to turn off before unplugging. It’s not 100% necessary, but it reduces the chances that the filesystem (e.g. the apt cache) becomes corrupted and you need to reflash the SD card and start from scratch.
Starting from scratch involves reflashing the SD card using Raspberry Pi Imager, cloning the git repository, running pi_boot.sh and pip3 install -r requirements.txt, configuring config.py, and running create_service.sh to automate the startup.
LiDARbot
Raspberry Pi Zero W x 1
PCA9685 PWM controller x 1
RPLidar A1M8 x 1
FT5835M servo x 4
Powered by: Standard 5V Power bank [10Ah and 20Ah]
Startup Instructions:
– Plug in USB cables.
– Wait for service startup and go to URL.
– If Lidar chart is displaying, click ‘Turn on Brain’
Vacuumbot
Raspberry Pi 3b x 1
LM2596 stepdown converter x 1
RDS60 servo x 4
Powered by: 7.4V 4Ah Li-Po battery
NVIDIA Jetson NX x 1
Realsense D455 depth camera x 1
Powered by: 11.1V 4Ah Li-Po battery
Instructions:
– Plug the Jetson assembly connector into the 11.1V battery, and the RPi assembly connector into the 7.4V battery.
– Connect to the Jetson and run:
cd ~/jetson-server
python3 inference_server.py
– Go to the Jetson URL to view depth and object detection.
– Wait for Rpi service to start up.
– Connect to RPi URL, and click ‘Turn on Brain’
Pinkbot
Raspberry Pi Zero W x 1
PCA9685 PWM controller x 1
LM2596 stepdown converter x 1
RDS60 servo x 8
Ultrasonic sensors x 3
Powered by: 7.4V 6.8Ah Li-Po battery
Instructions:
– Plug in to Li-Po battery
– Wait for Rpi service to start up.
– Connect to RPi URL, and click ‘Turn On Brain’
Gripperbot
Raspberry Pi Zero W x 1
150W stepdown converter (to 7.4V) x 1
LM2596 stepdown converter (to 5V) x 1
RDS60 servo x 4
MGGR996 servo x 1
Powered by: 12V 60W power supply
Instructions:
– Plug in to wall
– Wait for Rpi service to start up.
– Connect to RPi URL, and click ‘Fidget to the Waves’
Birdbot
Raspberry Pi Zero W x 1
FT SM-85CL-C001 servo x 4
FE-URT-1 serial controller x 1
12V input step-down converter (to 5V) x 1
Ultrasonic sensor x 1
RPi camera v2.1 x 1
Powered by: 12V 60W power supply
Instructions:
– Plug in to wall
– Wait for Rpi service to start up.
– Connect to RPi URL, and click ‘Fidget to the Waves’
I soldered on some extra wires to the motor + and -, to power the motor separately.
Wasn’t getting any luck, but it turned out to be the MicroUSB cable (The OTG cable was ok). After swapping it out, I was able to run the simple_grabber app and confirm that data was coming out.
I debugged the Adafruit v1.29 issue too. So now I’m able to get the data in python, which will probably be nicer to work with, as I haven’t done proper C++ in like 20 years. But this Slamtec code would be the cleanest example to work with.
So I added in some C socket code and recompiled, so now the demo app takes a TCP connection and starts dumping data.
It was actually A LOT faster than the Python libraries. But I started getting ECONNREFUSED errors, which I thought might be because the Pi Zero W only has a single CPU, and the Python WSGI worker engine was eventlet, which only handles one worker for flask-socketio; running a socket server, a socket client, and socket-io all on a single CPU was creating some sort of resource contention. But I couldn’t solve it.
I found a C++ project with a Python wrapper, but it was compiled for 64-bit, and SWIG, the software I needed to recompile it for 32-bit, seemed a bit complicated.
So, back to Python.
Actually, back to JavaScript, to get some visuals in a browser. The Adafruit example is for pygame, but we’re over a network, so that won’t work. Rendering Matplotlib graphs is going to be too slow. I need to stream the data and render it on the front end.
Detour #1: NPM
Ok… so, I need to install Node.js to install this one, which for the Raspberry Pi Zero W means an ARMv6 build.
This is the most recent ARMv6 Node.js tarball:
wget https://nodejs.org/dist/latest-v11.x/node-v11.15.0-linux-armv6l.tar.gz
tar xzvf node-v11.15.0-linux-armv6l.tar.gz
cd node-v11.15.0-linux-armv6l
sudo cp -R * /usr/local/
sudo ldconfig
npm install --global yarn
sudo npm install --global yarn
npm install rplidar
npm ERR! serialport@4.0.1 install: `node-pre-gyp install --fallback-to-build`
Ok... never mind javascript for now.
Detour #2: Dash/Plotly
Let’s try this python code. https://github.com/Hyun-je/pyrplidar
Ok well it looks like it works maybe, but where is s/he getting that nice plot from? Not in the code. I want the plot.
So, theta and distance are just polar coordinates. So I need to plot polar coordinates.
PolarToCartesian.
Convert a polar coordinate (r,θ) to cartesian (x,y): x = r cos(θ), y = r sin(θ)
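As a minimal sketch (the function name and units are mine; the lidar reports the angle in degrees and the distance in millimetres):

import math

def polar_to_cartesian(angle_deg, distance_mm):
    # lidar gives angle in degrees and distance in mm; convert to x/y
    theta = math.radians(angle_deg)
    return distance_mm * math.cos(theta), distance_mm * math.sin(theta)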
pip3 install pandas
pip3 install dash
k let's try...
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 48 from C header, got 40 from PyObject
ok
pip3 install --upgrade numpy
(if your numpy version is < 1.20.0)
ok now bad marshal data (unknown type code)
sheesh, what garbage.
Posting issue to their github and going back to the plan.
Reply from the Plotly devs: pip3 won’t work, will need to try conda install, for ARMv6
Ok let’s see if we can install plotly again….
Going to try miniconda – they have an ARMv6 file here…
Damn. 2014. Python 2. Nope. Ok Plotly is not an option for RPi Zero W. I could swap to another RPi, but I don’t think the 1A output of the power bank can handle it, plus camera, plus lidar motor, and laser. (I am using the 2.1A output for the servos).
Solution #1: D3.js
Ok, Just noting this link, as it looks useful for the lidar robot, later.
“eventlet is the best performant option, with support for long-polling and WebSocket transports.”
apparently needs redis for message queueing…
pip install eventlet
pip install redis
Ok, and we need gunicorn, because eventlet is just for workers...
pip3 install gunicorn
gunicorn --worker-class eventlet -w 1 module:app
k, that throws an error.
I need to downgrade eventlet, or do some complicated thing.
pip install eventlet==0.30.2
gunicorn --bind 0.0.0.0 --worker-class eventlet -w 1 kmp8servo:app
(my service is called kmp8servo.py)
ok so do i need redis?
sudo apt-get install redis
ok it's already running now,
at /usr/bin/redis-server 127.0.0.1:6379
no, i don't really need redis. Could use sqlite, too. But let's use it anyway.
Ok amazing, gunicorn works. It's running on port 8000
Ok, after some work, socket-io is also working.
Received #0: Connected
Received #1: I'm connected!
Received #2: Server generated event
Received #3: Server generated event
Received #4: Server generated event
Received #5: Server generated event
Received #6: Server generated event
Received #7: Server generated event
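For reference, a stripped-down sketch of the shape of the server side at this point (not the actual kmp8servo.py; the event names and port are placeholders):

from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
# message_queue lets other processes (e.g. the lidar reader) push events through Redis
socketio = SocketIO(app, async_mode='eventlet', message_queue='redis://127.0.0.1:6379')

@socketio.on('connect')
def on_connect():
    socketio.emit('status', "I'm connected!")

if __name__ == '__main__':
    socketio.run(app, host='0.0.0.0', port=8000)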
So, I’m going to go with d3.js instead of P5js, just cause it’s got a zillion more users, and there’s plenty of polar coordinate code to look at, too.
Got it drawing the polar background… but I gotta change the scale a bit. The code uses a linear scale from 0 to 1, so I need to get my distances down to something between 0 and 1. Also need radians, instead of the degrees that the lidar is putting out.
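Server-side, that’s just a quick normalisation before emitting (a sketch; treating 12 metres as the A1’s maximum range is my assumption):

import math

MAX_RANGE_MM = 12000.0  # assumed max useful range of the RPLidar A1

def to_d3_coords(angle_deg, distance_mm):
    # the d3 radial scale on the front end expects r in [0, 1] and theta in radians
    r = min(distance_mm / MAX_RANGE_MM, 1.0)
    theta = math.radians(angle_deg)
    return theta, r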
ok finally. what an ordeal.
But we still need to get the Python lidar code working, or switch back to the C socket code I got working.
Ok, well, so I added D3 update code with transitions, and the javascript looks great.
But the C Slamtec SDK, and the Python RP Lidar wrappers are a source of pain.
I had the C sockets working briefly, but it stopped working, seemingly while I added more Python code between each socket read. I got frustrated and gave up.
The Adafruit library, with the fixes I made, seems to work now, but it’s in a very precarious state, where looking at it funny causes a bad descriptor field or a checksum error.
But I managed to get the brain turning on, with the lidar. I’m using Redis to track the variables, using the memory.py code from this K9 repo. Thanks.
I will come back to trying to fix the remaining python library issues, but for now, the robot is running, so, on to the next.
Unfortunately, the results of using the OpenCV/GStreamer example code to transmit over the network with H264 compression were even worse than the JPEG-over-HTTP attempt I’m trying to improve on. Much worse. That was surprising. It could be this wifi dongle though, which is very disappointing on the Jetson Nano. It appears as though the Jetson Nano tries to keep total wattage around 10W, but plugging in the Realsense camera and a wifi dongle pulls way more than that (all 4A @ 5V supplied by the barrel jack, i.e. 20W). It may mean that wireless robotics with the Realsense is not practical on the Jetson.
This required the gstreamer1.0-rtsp apt package to be installed.
Back to drawing board for getting the RealSense colour and depth transmitting to a different viewing machine, on the network (while still providing distance data for server side computation).
After reading up on IMUs: you get 3 axes from an accelerometer, 6 axes by adding a gyroscope, and 9 axes by adding a magnetometer. And there are some 10-axis ones, if it’s fancy, with a thermometer to correct inaccuracies, etc.
6 axes gives you relative orientation; 9 axes gives you absolute orientation (the magnetometer acts as a compass).
I happen to have a 6 axis one, from Aliexpress, from years ago. Never used it, but now I have a reason. It’s labelled GY-521. Here’s a video tutorial on putting it all together, with the tutorial link for reading.
“the 6-DoF, which was used to determine the linear velocity, angular velocity, position and orientation.” – this paper, “Pose Estimation of a Mobile Robot Based on Fusion of IMU Data and Vision Data Using an Extended Kalman Filter”
You need to take these folders from the github link and put them in your Arduino libs folder. The github also has some code for the Raspberry Pi, which I might get to next. Badabing badaboom, it actually worked the first time. (I touched the USB cable and needed to restart it, but that seems like something that can be prevented.)
Ok, so: accelerometer x/y/z, temperature (nice), gyroscope x/y/z. I watched these numbers as I moved it around, and at 9600 baud it’s really slow. It’s not going to help for real-time decision making.
Maybe we’ll come back to IMUs later. A bit complicated to visualise and make sense of the data, but with a visualisation, it would make odometry SLAM methods more robust.
We need the robot to be somewhat intelligent, and that means some rules, or processes based on sensor input. Originally I was doing everything in PyBullet, and that leads to techniques like PyBullet-Planning. But as we’re getting closer to a deadline, simpler ideas are winning out.
I came across this paper, and figures 1 and 2 give an idea.
WordPress apparently won’t show this image in original resolution, but you can look up close in the PDF.
Here’s a fuzzy controller: FUZZY LOGIC CONTROL FOR ROBOT MAZE TRAVERSAL: AN UNDERGRADUATE CASE STUDY
I like that idea too: fuzzify the sensor readings into concepts, apply the rules, and defuzzify the outputs back into adjustments.
This is probably starting to get too many acronyms for me, but Goal Reasoning is a pretty fascinating topic. Now that I see it, of course NASA has been working on this stuff for ages.
Excerpted below (Doychev, 2021):
The "C" Language Production System (CLIPS) is a portable, rule-based production system [Wyg89]. It was first developed for NASA as an expert system and uses forward chaining inference based on the Rete algorithm consisting of three building blocks [JCG]:
Fact List: The global memory of the agent. It is used as a container to store basic pieces of information about the world in the form of facts, which are usually of specific types. The fact list is constantly updated using the knowledge in the knowledge base.
Knowledge Base: It comprises heuristic knowledge in two forms:
• Procedural Knowledge: An operation that leads to a certain effect. These can, for example, modify the fact base. Functions carry procedural knowledge and can also be implemented in C++. They are mainly used for the utilization of the agent, such as communication to a behavior engine. An example for procedural knowledge would be a function that calls a robot-arm driver to grasp at a target location, or a fact base update reflecting a sensor reading.
• Rules: Rules play an important role in CLIPS. They can be compared to IF-THEN statements in procedural languages. They consist of several preconditions, that need to be satisfied by the current fact list for the rule to be activated, and effects in the form of procedural knowledge. When all its preconditions are satisfied, a rule is added to the agenda, which executes all the activated rules subsequently by firing the corresponding procedural knowledge.
Inference Engine: The main controlling block. It decides which rules should be executed and when. Based on the knowledge base and the fact base, it guides the execution of agenda and rules, and updates the fact base, if needed. This is performed until a stable state is reached, meaning, there are no more activated rules. The inference engine supports different conflict resolution strategies, as multiple rules can be active at a time. For example, rules are ordered by their salience, a numeric value where a higher value means higher priority. If rules with the same salience are active at a time, they are executed in the order of their activation.
CLIPS Executive: The CLIPS Executive (CX) is a CLIPS-based production system which serves as a high-level controller, managing all the high-level decision making. Its main tasks involve goal formation, goal reasoning, on-demand planner invocation, goal execution and monitoring, and world and agent memory (a shared database for multiple agents) information synchronization. In general, this is achieved by individual CLIPS structures (predefined facts, rules, etc.) that get added to the CLIPS environment.
It’s the Rete algorithm, so it’s a rule engine. It’s a cool algorithm. If you don’t know about rule engines, they are what you use when you start to have more than 50 ‘if’ statements.
Ok, that’s all I need to know. I’ve used KIE professionally. I don’t want to use Java in this project. There appear to be some simplified Python Rule Engines, and so I’ll check them out, when I have some sensor input.
I think I’m going to try this one. They snagged ‘rule-engine’ at PyPi, so they must be good.
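An untested sketch of what using it might look like, going from the project’s README as I remember it (so take the exact API with a pinch of salt):

import rule_engine

# facts would come from the ultrasonic sensors, e.g. {'F': 120, 'L': 25, 'R': 90}
too_close_left = rule_engine.Rule('L < 30')
if too_close_left.matches({'F': 120, 'L': 25, 'R': 90}):
    print('turn right')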
Ok, I’ve set up an Arduino with three ultrasonic distance sensors, and it’s connected to the Raspberry Pi. I should do a write-up on that. So I can poll the Arduino and get ‘forward left’, ‘forward’ and ‘forward right’ ultrasonic sensor distance back, in JSON.
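Polling it from the Pi looks something like this (untested sketch; the serial port and JSON keys are assumptions):

import json
import serial

ser = serial.Serial('/dev/ttyUSB0', 9600, timeout=1)

def read_distances():
    # expects one JSON line per reading, e.g. {"FL": 52, "F": 120, "FR": 47}
    line = ser.readline().decode('utf-8', errors='ignore').strip()
    reading = json.loads(line)
    return reading['FL'], reading['F'], reading['FR']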
I think for now, a good start would be having consequences of: forward, backwards, left, right, and stand still.
These are categories of motions. Motions have names, so we will just categorize a motion by whether its name contains one of these cardinal words (forward or walk, back, left, right, still).
To keep things interesting, the robot can pick motions from these categories at random. I was thinking of making scripts, to join together motions, but I’m going to come back to that later. Scripts would just be sequences of motions, so it’s not strictly necessary, since we’re going to use a rule engine now.
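A sketch of that categorise-and-pick-at-random idea (the names here are mine):

import random

CARDINALS = ["forward", "walk", "back", "left", "right", "still"]

def categories_for(motion_name):
    # e.g. "shuffle_left_2" -> {"LEFT"}
    return {c.upper() for c in CARDINALS if c in motion_name.lower()}

def pick_motion(category, motion_names):
    # pick, at random, any motion whose name falls into the given category
    candidates = [m for m in motion_names if category in categories_for(m)]
    return random.choice(candidates) if candidates else None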
Ok… after thinking about it, screw the rule engine. We only need like 10 rules, at the moment, and I just need to prioritize the rules, and reevaluate often. I’ll just write them out, with a basic prioritisation.
I also see an interesting hybrid ML / rule option from scikit-learn.
Anyway, too complicated for now. So this would all be in a loop.
TOO_CLOSE = 30
priority = 0

# High Priority '0'
if F < TOO_CLOSE:
    runMotionFromCategory("BACK")
    priority = 1
if L < TOO_CLOSE:
    runMotionFromCategory("BACK")
    runMotionFromCategory("RIGHT")
    priority = 1
if R < TOO_CLOSE:
    runMotionFromCategory("BACK")
    runMotionFromCategory("LEFT")
    priority = 1

# Medium Priority '1' (priority is still 0)
if priority == 0 and L < R and L < F:
    runMotionFromCategory("RIGHT")
    priority = 2
if priority == 0 and R < L and R < F:
    runMotionFromCategory("LEFT")
    priority = 2

# Low Priority '2' (priority is still 0)
if priority == 0 and L < F and R < F:
    runMotionFromCategory("WALK")
    priority = 3
Good enough. So I still want this to be part of the UI though, and so the threading, and being able to exit the loop will be important.
Basically, the problem is, how to implement a stop button, in HTTP / Flask. Need a global variable, basically, which can be changed. But a global variable in a web app? Sounds like a bad idea. We have session variables, but the thread that’s running the motion is in the past, and probably evaluating a different session map. Maybe not though. Will need to research and try a few things.
Yep… “Flask provides you with a special object that ensures it is only valid for the active request and that will return different values for each request.”
Ok, Flask has ‘g’ …?
from flask import g
user = getattr(flask.g, 'user', None)
user = flask.g.get('user', None)
Hmm ok that’s not right. It’s not sharable between requests, apparently. There’s Flask-Cache…. that might work? Ok I think this is it maybe.
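One simple way out is to keep the flag outside Flask entirely, e.g. in Redis, which is already running for socket-io. A sketch (the key name is made up):

import redis

r = redis.Redis(host='127.0.0.1', port=6379)

def start_brain_flag():
    r.set('brain_running', 1)

def stop_brain_flag():
    r.set('brain_running', 0)

def brain_should_run():
    # redis-py returns bytes, so compare against b'1'
    return r.get('brain_running') == b'1'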
Now, to run the brain. I don’t know how to reuse code with these inner functions. So it’s copy-paste time. The brain will need to be a single thread. So something like
@app.route('/runBrain', methods=['POST', 'GET'])
def runBrain():
    @copy_current_request_context
    def runBrainTask(data):
        @copy_current_request_context
        def runMotionFromCategory(category):
            ...
        ...
        if x: runMotionFromCategory("BACK")
        if y: runMotionFromCategory("LEFT")
    # (start runBrainTask thread)
Ok let’s try it…
Ok first need to fix the motions.
Alright. It’s working! Basic Planning working.
It can shuffle around now, without bumping into walls. Further testing will be needed. But a good basis for a robot.
Just a quick write-up on the incubator I made, for the fertilized eggs we picked up recently. They were R5 each (€0.28). The goal is to maintain a temperature of 37.5C.
The incubator consists of:
Arduino Nano
DS18B20 temperature sensor
4.7K resistor
Solid state relay module
12V power supply (for the Arduino, and relay)
A relatively inefficient 20W light, on a heatsink.
Some aluminium foil, in a mostly closed plastic container
The eggs were placed a bit above the heatsink, in an egg carton (hot air rises). The idea being that we only want convective heat, not radiative heat, since radiative heat is directional and will heat one side of the egg more than the other side.
The Arduino code was adapted from the example code from the OneWire library. The controller measures the temperature at the eggs, and turns on the light when the temperature is below 37.5C, and turns it off when it’s above. Using a separate temperature sensor, we confirmed that the temperature remains within a degree of the desired temperature.
There are better ways to do this, I’m sure, but this is what I came up with, on a day’s notice, with the parts available.
The gotchas encountered were that the Chinese Arduino Nano I used required selecting the ‘old bootloader’ option when uploading the sketch, and that the wiring colours of the DS18B20 were incorrectly labelled (as they were for this forum user).
#include <OneWire.h>

// OneWire DS18S20, DS18B20, DS1822 Temperature Example
//
// http://www.pjrc.com/teensy/td_libs_OneWire.html
//
// The DallasTemperature library can do all this work for you!
// https://github.com/milesburton/Arduino-Temperature-Control-Library

#define SENSOR 2
#define LIGHT 4

OneWire ds(SENSOR);  // on pin 2 (a 4.7K resistor is necessary)

float desired_temp = 37.5;
float light_status = LOW;

void control(float temperature){
  if (temperature >= desired_temp)
  {
    light_status = LOW;
    Serial.println("HEATER OFF");
  }
  else
  {
    light_status = HIGH;
    Serial.println("HEATER ON");
  }
  digitalWrite(LIGHT, light_status);
}

void setup(void) {
  Serial.begin(9600);
  pinMode(LIGHT, OUTPUT);
}

void loop(void) {
  byte i;
  byte present = 0;
  byte type_s;
  byte data[12];
  byte addr[8];
  float celsius, fahrenheit;

  if ( !ds.search(addr)) {
    Serial.println("No more addresses.");
    Serial.println();
    ds.reset_search();
    delay(250);
    return;
  }

  Serial.print("ROM =");
  for( i = 0; i < 8; i++) {
    Serial.write(' ');
    Serial.print(addr[i], HEX);
  }

  if (OneWire::crc8(addr, 7) != addr[7]) {
    Serial.println("CRC is not valid!");
    return;
  }
  Serial.println();

  // the first ROM byte indicates which chip
  switch (addr[0]) {
    case 0x10:
      Serial.println(" Chip = DS18S20");  // or old DS1820
      type_s = 1;
      break;
    case 0x28:
      Serial.println(" Chip = DS18B20");
      type_s = 0;
      break;
    case 0x22:
      Serial.println(" Chip = DS1822");
      type_s = 0;
      break;
    default:
      Serial.println("Device is not a DS18x20 family device.");
      return;
  }

  ds.reset();
  ds.select(addr);
  ds.write(0x44, 1);  // start conversion, with parasite power on at the end

  delay(1000);  // maybe 750ms is enough, maybe not
  // we might do a ds.depower() here, but the reset will take care of it.

  present = ds.reset();
  ds.select(addr);
  ds.write(0xBE);  // Read Scratchpad

  Serial.print(" Data = ");
  Serial.print(present, HEX);
  Serial.print(" ");
  for ( i = 0; i < 9; i++) {  // we need 9 bytes
    data[i] = ds.read();
    Serial.print(data[i], HEX);
    Serial.print(" ");
  }
  Serial.print(" CRC=");
  Serial.print(OneWire::crc8(data, 8), HEX);
  Serial.println();

  // Convert the data to actual temperature
  // because the result is a 16 bit signed integer, it should
  // be stored to an "int16_t" type, which is always 16 bits
  // even when compiled on a 32 bit processor.
  int16_t raw = (data[1] << 8) | data[0];
  if (type_s) {
    raw = raw << 3;  // 9 bit resolution default
    if (data[7] == 0x10) {
      // "count remain" gives full 12 bit resolution
      raw = (raw & 0xFFF0) + 12 - data[6];
    }
  } else {
    byte cfg = (data[4] & 0x60);
    // at lower res, the low bits are undefined, so let's zero them
    if (cfg == 0x00) raw = raw & ~7;       // 9 bit resolution, 93.75 ms
    else if (cfg == 0x20) raw = raw & ~3;  // 10 bit res, 187.5 ms
    else if (cfg == 0x40) raw = raw & ~1;  // 11 bit res, 375 ms
    //// default is 12 bit resolution, 750 ms conversion time
  }
  celsius = (float)raw / 16.0;
  fahrenheit = celsius * 1.8 + 32.0;
  Serial.print(" Temperature = ");
  Serial.print(celsius);
  Serial.print(" Celsius, ");
  Serial.print(fahrenheit);
  Serial.println(" Fahrenheit");

  control(celsius);
}
As the eggs reach maturity, we’ll get a ‘hatcher’ environment ready.
Here I’m continuing with the task of unsupervised detection of audio anomalies, hopefully for the purpose of detecting chicken stress vocalisations.
After much fussing around with the old Numenta NuPic codebase, I’m porting the older nupic.audio and nupic.critic code, over to the more recent htm.core.
These are the main parts:
Sparse Distributed Representation (SDR)
Encoders
Spatial Pooler (SP)
Temporal Memory (TM)
I’ve come across a very intricate implementation and documentation about understanding the important parts of the HTM model, way deep, like, how did I get here? I will try to implement the ‘critic’ code first. Or rather, I’ll try to port it from nupic to htm. After further investigation, there are a few options, and I’m going to try editing the hotgym example and shoving wav file frequency band scalars through it instead of power consumption data. I’m simplifying the investigation. But I need to make some progress.
I’m using this docker to get in, mapping my code and wav file folder in:
docker run -d -p 8888:8888 --name jupyter -v /media/chrx/0FEC49A4317DA4DA/sounds/:/home/jovyan/work 3rdman/htm.core-jupyter:latest
So I've got some code working that writes to 'nupic format' (.csv) and code that reads the amplitudes from the csv file, and then runs it through htm.core.
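Roughly, the shape of that pipeline, going from memory of the hotgym example (untested; the sizes and encoder resolution are arbitrary placeholders):

from htm.bindings.sdr import SDR
from htm.bindings.algorithms import SpatialPooler, TemporalMemory
from htm.encoders.rdse import RDSE, RDSE_Parameters

def run_band(band_amplitudes):
    # encode each scalar amplitude of one frequency band into an SDR,
    # pool it spatially, then let temporal memory report how surprising it was
    p = RDSE_Parameters()
    p.size = 1000
    p.sparsity = 0.02
    p.resolution = 0.25
    encoder = RDSE(p)

    sp = SpatialPooler(inputDimensions=(p.size,), columnDimensions=(1024,))
    tm = TemporalMemory(columnDimensions=(1024,))
    active = SDR(sp.getColumnDimensions())

    anomalies = []
    for amplitude in band_amplitudes:
        sp.compute(encoder.encode(amplitude), True, active)
        tm.compute(active, learn=True)
        anomalies.append(tm.anomaly)  # 0.0 = fully predicted, 1.0 = surprising
    return anomalies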
So it takes a while, and it's just for 1 band (of 10 bands). I see it also uses the first 1/4 or so of the time to know what it's dealing with. Probably need to run it through twice to get predictive results in the first 1/4.
Ok no, after a few weeks, I've come back to this point, and realise that the top graph is the important one. Prediction is what's important. The bottom graphs are the anomaly scores, used by the prediction.
The idea in nupic.critic, was to threshold changes in X bands. Let’s see the other graphs…
Ok Frequency bands 7, 8, 9 were all zero amplitude. So that’s the highest the frequencies went. Just gotta check what those frequencies are, again…
Opening 307.wav
Sample width (bytes): 2
Frame rate (sampling frequency): 48000
Number of frames: 20771840
Signal length: 20771840
Seconds: 432
Dimensions of periodogram: 4801 x 2163
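That header info presumably comes straight out of the wav header; Python’s wave module gives the same numbers (the periodogram dimensions come from the spectrogram step, not shown):

import wave

w = wave.open('307.wav', 'rb')
print('Sample width (bytes):', w.getsampwidth())
print('Frame rate (sampling frequency):', w.getframerate())
print('Number of frames:', w.getnframes())
print('Seconds:', w.getnframes() // w.getframerate())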
Ok with 10 buckets, 4801 would divide into
Frequency band 0: 0-480Hz
Frequency band 1: 480-960Hz
Frequency band 2: 960-1440Hz
Frequency band 3: 1440-1920Hz
Frequency band 4: 1920-2400Hz
Frequency band 5: 2400-2880Hz
Frequency band 6: 2880-3360Hz
Ok, what else. We could try segmenting the audio by band, so we can narrow in on the relevant frequency range, and then maybe just focus on that smaller range again, in higher detail.
Learning features with some labeled data, is probably the correct way to do chicken stress vocalisation detections.
Unsupervised anomaly detection might be totally off, in terms of what an anomaly is. It is probably best, to zoom in on the relevant bands and to demonstrate a minimal example of what a stressed chicken sounds like, vs a chilled chicken, and compare the spectrograms to see if there’s a tell-tale visualisable feature.
A score from 1 to 5 for example, is going to be anomalous in arbitrary ways, without labelled data. Maybe the chickens are usually stressed, and the anomalies are when they are unstressed, for example.
A change in timing in music might be defined, in some way. like 4 out of 7 bands exhibiting anomalous amplitudes. But that probably won’t help for this. It’s probably just going to come down to a very narrow band of interest. Possibly pointing it out on a spectrogram that’s zoomed in on the feature, and then feeding the htm with an encoding of that narrow band of relevant data.
I’ll continue here, with some notes on filtering. After much fuss, the sox app (apt-get install sox) does it, sort of. Still working on python version.
$ sox 307_0_50.wav filtered_50_0.wav sinc -n 32767 0-480
$ sox 307_0_50.wav filtered_50_1.wav sinc -n 32767 480-960
$ sox 307_0_50.wav filtered_50_2.wav sinc -n 32767 960-1440
$ sox 307_0_50.wav filtered_50_3.wav sinc -n 32767 1440-1920
$ sox 307_0_50.wav filtered_50_4.wav sinc -n 32767 1920-2400
$ sox 307_0_50.wav filtered_50_5.wav sinc -n 32767 2400-2880
$ sox 307_0_50.wav filtered_50_6.wav sinc -n 32767 2880-3360
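For the Python version, something like this should be roughly equivalent to one of those sox calls (untested sketch; assumes scipy and soundfile are installed):

import soundfile as sf
from scipy.signal import butter, sosfiltfilt

def bandpass(in_wav, out_wav, low_hz, high_hz, order=6):
    # band-pass the wav between low_hz and high_hz and write it back out
    data, rate = sf.read(in_wav)
    sos = butter(order, [low_hz, high_hz], btype='bandpass', fs=rate, output='sos')
    sf.write(out_wav, sosfiltfilt(sos, data, axis=0), rate)

bandpass('307_0_50.wav', 'filtered_50_1.wav', 480, 960)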
So, sox does seem to be working. The mel spectrogram is logarithmic, which is why it looks like this.
Visually, it looks like I'm interested in 2048 to 4096 Hz. That's where I can see the chirps.
Hmm. So I think the spectrogram is confusing everything.
So where does 4800 come from? 48 kHz. 48,000 Hz (48 kHz) is the sample rate “used for DVDs“.
Ah. Right. The spectrogram values represent buckets of 5 samples each, and the full range is to 24000…?
ok. So 2 x 24000. Maybe 2 channels? Anyway, full range is to 48000Hz. In that case, are the bands actually…
Frequency band 0: 0-4800Hz
Frequency band 1: 4800-9600Hz
Frequency band 2: 9600-14400Hz
Frequency band 3: 14400-19200Hz
Frequency band 4: 19200-24000Hz
Frequency band 5: 24000-28800Hz
Frequency band 6: 28800-33600Hz
Ok so no, it’s half the above: the spectrum only goes up to the Nyquist frequency, which is half the 48kHz sample rate, i.e. 24000Hz. So with 10 buckets:
Frequency band 0: 0-2400Hz
Frequency band 1: 2400-4800Hz
Frequency band 2: 4800-7200Hz
Frequency band 3: 7200-9600Hz
Frequency band 4: 9600-12000Hz
Frequency band 5: 12000-14400Hz
Frequency band 6: 14400-16800Hz
So why is the spectrogram maxing at 8192Hz? Must be spectrogram sampling related.
So the original signal is 0 to 24000Hz, and the spectrogram must be 8192Hz because… the spectrogram is made some way. I’ll try get back to this when I understand it.
Ok i don’t entirely understand the last two. But basically the mel spectrogram is logarithmic, so those high frequencies really don’t get much love on the mel spectrogram graph. Buggy maybe.
So now I’m plotting the ‘chirp density’ (basically volume).
In this scheme, we just proxy chirp volume density as a variable representing stress. We don’t know if it is a true proxy. As you can see, some recordings have more variation than others.
Some heuristic could be decided upon, for rating the stress from 1 to 5. The heuristic depends on how the program would be used. For example, if it were streaming audio, for an alert system, it might alert upon some duration of time spent above one standard deviation from the rolling mean. I’m not sure how the program would be used though.
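For example, the ‘above one standard deviation from the rolling mean for some duration’ idea could be as simple as this (the window sizes are arbitrary):

import pandas as pd

def stress_alert(chirp_density, window=60, min_run=10):
    s = pd.Series(chirp_density)
    rolling = s.rolling(window, min_periods=window)
    above = s > (rolling.mean() + rolling.std())
    # count consecutive samples spent above the threshold
    run_length = above.astype(int).groupby((~above).cumsum()).cumsum()
    return run_length >= min_run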
If the goal were to differentiate stressed and not stressed vocalisations, that would require labelled audio data.
Though I’m generally using Stable Baselines algorithms for training locomotion tasks, I am sometimes drawn back to evolutionary algorithms, and especially Map Elites, which has now been upgraded to incorporate a policy gradient.
The archiving of behaviours is what attracts me to Map Elites.
PGA Map Elites based on top of QDGym, which tracks Quality Diversity, is probably worth a look.
Beautiful. Since we’re currently just using a 256×256 view port in pybullet, this is quite a bit more advanced than required though. Learning game engines can also take a while. It took me about a month to learn Unity3d, with intermediate C# experience. Unreal Engine uses C++, so it’s a bit less accessible to beginners.
Continuing from our early notes on SLAM algorithms (Simultaneous Localisation and Mapping), and the similar but less map-building DSO algorithm, I came across a good project (“From cups to consciousness“) and article that reminded me that mapping the environment, or at least having some sense of depth, will be pretty crucial.
At the moment I’ve just got to the point of thinking to train a CNN on simulation data, and so there should also be some positioning of the robot as a model in its own virtual world. So it’s probably best to reexamine what’s already out there. Visual odometry. Optical Flow.
I found a good paper summarizing the 2019 options. The author’s github has some interesting scripts that might be useful. It reminds me that I should probably be using ROS and Gazebo, to some extent. The conclusion was roughly that Google Cartographer or GMapping (OpenSLAM) are generally beating the others (Karto, Hector). Seems like SLAM code is all a few years old. Google Cartographer had some support for ‘lifelong mapping‘, which sounded interesting: the robot goes around updating its map, a bit. It reminds me I saw ‘PonderNet‘ today, fresh from DeepMind, which from a quick look is, more or less, about scaling the amount of computation to the difficulty of the input.
Anyway, we are mostly interested in Monocular SLAM. So none of this applies, probably. I’m mostly interested at the moment, in using some prefab scenes like the AI2Thor environment in the Cups-RL example, and making some sort of SLAM in simulation.
Also interesting is RATSLAM and the recent update: LatentSLAM – The authors of this site, The Smart Robot, got my attention because of the CCNs. Cortical column networks.
“A common shortcoming of RatSLAM is its sensitivity to perceptual aliasing, in part due to the reliance on an engineered visual processing pipeline. We aim to reduce the effects of perceptual aliasing by replacing the perception module by a learned dynamics model. We create a generative model that is able to encode sensory observations into a latent code that can be used as a replacement to the visual input of the RatSLAM system”
Interesting, “The robot performed 1,143 delivery tasks to 11 different locations with only one delivery failure (from which it recovered), traveled a total distance of more than 40 km over 37 hours of active operation, and recharged autonomously a total of 23 times.“
I think DSO might be a good option, or the closed-loop version, LDSO; they look like the most straightforward, maybe.
After a weekend away with a computer vision professional, I found out about COLMAP, a structure-from-motion suite.
I saw a few more recent projects too, e.g. NeuralRecon, and
ooh, here’s a recent facebook one that sounds like it might work!
Consistent Depth … eh, their google colab is totally broken.
Anyhow, LDSO. Let’s try it.
In file included from /dmc/LDSO/include/internal/OptimizationBackend/AccumulatedTopHessian.h:10:0,
                 from /dmc/LDSO/include/internal/OptimizationBackend/EnergyFunctional.h:9,
                 from /dmc/LDSO/include/frontend/FeatureMatcher.h:10,
                 from /dmc/LDSO/include/frontend/FullSystem.h:18,
                 from /dmc/LDSO/src/Map.cc:4:
/dmc/LDSO/include/internal/OptimizationBackend/MatrixAccumulators.h:8:10: fatal error: SSE2NEON.h: No such file or directory
 #include "SSE2NEON.h"
          ^~~~
compilation terminated.
src/CMakeFiles/ldso.dir/build.make:182: recipe for target 'src/CMakeFiles/ldso.dir/Map.cc.o' failed
make[2]: *** [src/CMakeFiles/ldso.dir/Map.cc.o] Error 1
make[2]: *** Waiting for unfinished jobs….
CMakeFiles/Makefile2:85: recipe for target 'src/CMakeFiles/ldso.dir/all' failed
make[1]: *** [src/CMakeFiles/ldso.dir/all] Error 2
Makefile:83: recipe for target 'all' failed
make: *** [all] Error 2
Ok maybe not.
There’s a paper here reviewing ORBSLAM3 and LDSO, and they encounter lots of issues. But it’s a good paper for an overview of how the algorithms work. We want a point cloud so we can find the closest points, and not walk into them.
Calibration is an issue, rolling shutter cameras are an issue, IMU data can’t be synced meaningfully, it’s a bit of a mess, really.
Also, after reports that ORB-SLAM2 was only getting 5 fps on a Raspberry Pi, I got smart and looked for something specifically for the Jetson. I found a depth CNN for monocular vision on the forum, amazing.
Ok so after much fussing about, I found just what we need. I had an old copy of jetson-containers, and the slam code was added just 6 months ago. I might want to try the noetic one instead of melodic (both are still ROS 1, good old ROS).
git clone https://github.com/dusty-nv/jetson-containers.git
cd jetson-containers
chicken@jetson:~/jetson-containers$ ./scripts/docker_build_ros.sh --distro melodic --with-slam
Successfully built 2eb4d9c158b0
Successfully tagged ros:melodic-ros-base-l4t-r32.5.0
chicken@jetson:~/jetson-containers$ ./scripts/docker_test_ros.sh melodic
reading L4T version from /etc/nv_tegra_release
L4T BSP Version: L4T R32.5.0
l4t-base image: nvcr.io/nvidia/l4t-base:r32.5.0
testing container ros:melodic-ros-base-l4t-r32.5.0 => ros_version
xhost: unable to open display ""
xauth: file /tmp/.docker.xauth does not exist
sourcing /opt/ros/melodic/setup.bash
ROS_ROOT /opt/ros/melodic/share/ros
ROS_DISTRO melodic
getting ROS version -
melodic
done testing container ros:melodic-ros-base-l4t-r32.5.0 => ros_version
Well other than the X display, looking good.
Maybe I should just plug in a monitor. Ideally I wouldn’t have to, though. I used GStreamer the other time. Maybe we do that again.
This looks good too… https://github.com/dusty-nv/ros_deep_learning but let’s stay focused. I’m also thinking maybe we upgrade early, to noetic. Ugh it looks like a whole new bunch of build tools and things to relearn. I’m sure it’s amazing. Let’s do ROS1, for now.
Let’s try build that FCNN one again.
CMake Error at tx2_fcnn_node/Thirdparty/fcrn-inference/CMakeLists.txt:121 (find_package):
By not providing "FindOpenCV.cmake" in CMAKE_MODULE_PATH this project has
asked CMake to find a package configuration file provided by "OpenCV", but
CMake did not find one.
Could not find a package configuration file provided by "OpenCV" (requested
version 3.0.0) with any of the following names:
OpenCVConfig.cmake
opencv-config.cmake
Add the installation prefix of "OpenCV" to CMAKE_PREFIX_PATH or set
"OpenCV_DIR" to a directory containing one of the above files. If "OpenCV"
provides a separate development package or SDK, be sure it has been
installed.
-- Configuring incomplete, errors occurred!
Ok hold on…
"Builds additional container with VSLAM packages,
including ORBSLAM2, RTABMAP, ZED, and Realsense.
This only applies to foxy and galactic and implies
--with-pytorch as these containers use PyTorch."
Ok that hangs when it starts building the slam bits. Luckily, someone’s raised the bug, and though it’s not fixed, Dusty does have a docker already compiled.
So, after some digging, I think we can solve the X problem (i.e. where are we going to see this alleged SLAMming occur?) with an RTSP server. Previously I used GStreamer to send RTP over UDP. But this makes more sense, to run a server on the Jetson. There’s a plugin for GStreamer, so I’m trying to get the ‘dev’ version, so I can compile the test-launch.c program.
apt-get install libgstrtspserver-1.0-dev
Reading package lists... Done
Building dependency tree
Reading state information... Done
libgstrtspserver-1.0-dev is already the newest version (1.14.5-0ubuntu1~18.04.1).
ok... git clone https://github.com/GStreamer/gst-rtsp-server.git
root@jetson:/opt/gst-rtsp-server/examples# gcc test-launch.c -o test-launch $(pkg-config --cflags --libs gstreamer-1.0 gstreamer-rtsp-server-1.0)
test-launch.c: In function ‘main’:
test-launch.c:77:3: warning: implicit declaration of function ‘gst_rtsp_media_factory_set_enable_rtcp’; did you mean ‘gst_rtsp_media_factory_set_latency’? [-Wimplicit-function-declaration]
gst_rtsp_media_factory_set_enable_rtcp (factory, !disable_rtcp);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
gst_rtsp_media_factory_set_latency
/tmp/ccC1QgPA.o: In function `main':
test-launch.c:(.text+0x154): undefined reference to `gst_rtsp_media_factory_set_enable_rtcp'
collect2: error: ld returned 1 exit status
gst_rtsp_media_factory_set_enable_rtcp
Ok wait let’s reinstall gstreamer.
apt-get install libgstreamer1.0-dev libgstreamer-plugins-base1.0-dev libgstreamer-plugins-bad1.0-dev gstreamer1.0-plugins-base gstreamer1.0-plugins-good gstreamer1.0-plugins-bad gstreamer1.0-plugins-ugly gstreamer1.0-libav gstreamer1.0-doc gstreamer1.0-tools gstreamer1.0-x gstreamer1.0-alsa gstreamer1.0-gl gstreamer1.0-gtk3 gstreamer1.0-qt5 gstreamer1.0-pulseaudio
error...
Unpacking libgstreamer-plugins-bad1.0-dev:arm64 (1.14.5-0ubuntu1~18.04.1) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-Ec7eDq/62-libopencv-dev_3.2.0+dfsg-4ubuntu0.1_arm64.deb
E: Sub-process /usr/bin/dpkg returned an error code (1)
Ok then leave out that one...
apt --fix-broken install
and that fails on
Errors were encountered while processing:
/var/cache/apt/archives/libopencv-dev_3.2.0+dfsg-4ubuntu0.1_arm64.deb
It’s like a sign of being a good programmer, to solve this stuff. But damn. Every time. Suggestions continue, in the forums of those who came before. Let’s reload the docker.
Ok I took a break and got lucky. The test-launch.c code is different from what the admin had: the repo version is newer than the 1.14 library that’s installed, and the difference is the RTCP handling, which 1.14 doesn’t have. Diffing shows these are the offending lines:
#define DEFAULT_DISABLE_RTCP FALSE
static gboolean disable_rtcp = DEFAULT_DISABLE_RTCP;
{"disable-rtcp", '\0', 0, G_OPTION_ARG_NONE, &disable_rtcp,
"Whether RTCP should be disabled (default false)", NULL},
gst_rtsp_media_factory_set_enable_rtcp (factory, !disable_rtcp);
With those taken out, this now compiles:
gcc test.c -o test $(pkg-config --cflags --libs gstreamer-1.0 gstreamer-rtsp-server-1.0)
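For my own reference, test-launch takes a gst-launch style pipeline description as its argument; the usual example is the test video source (a real camera source would go in its place):

./test "( videotestsrc ! x264enc ! rtph264pay name=pay0 pt=96 )"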
So apparently now I can run this in VLC… when I open
rtsp://<jetson-ip>:8554/test
Um is that meant to happen?…. Yes!
Ok next, we want to see SLAM stuff happening. So, ideally, a video feed of the desktop, or something like that.
So here are the links I have open. Maybe I’ll get back to them later. Need to get back to ORBSLAM2 first, and see where we’re at, and what we need. Not quite /dev/video0 to PC client. More like, ORBSLAM2 to /dev/video0 to PC client. Or full screen desktop. One way or another.
Previously, libgstrtspserver-1.0-dev was already the newest version (1.14.5-0ubuntu1~18.04.1). Today we have:
E: Unable to locate package libgstrtspserver-1.0-dev
E: Couldn't find any package by glob 'libgstrtspserver-1.0-dev'
E: Couldn't find any package by regex 'libgstrtspserver-1.0-dev'
Did I maybe compile it outside of the docker? Hmm maybe. Why can’t I find it though? Let’s try the obvious… but also why does this take so long? Network is unreachable. Network is unreachable. Where have all the mirrors gone?
apt-get update
Ok, so, long story short, I made another Dockerfile to get GStreamer installed. It mostly required adding a key for the kitware apt repo.
Since 1.14, the use of libv4l2 has been disabled due to major bugs in the emulation layer. To enable usage of this library, set the environment variable GST_V4L2_USE_LIBV4L2=1
but it doesn’t want to work anyway. Ok RTSP is almost a dead end.
I might attach a CSI camera instead of V4L2 (USB camera) maybe. Seems less troublesome. But yeah let’s take a break. Let’s get back to depthnet and ROS2 and ORB-SLAM2, etc.
depthnet: error while loading shared libraries: /usr/lib/aarch64-linux-gnu/libnvinfer.so.8: file too short
Ok, let’s try ROS2.
(Sorry, this was supposed to be about SLAM, right?)
As a follow-up for this post…
I asked about mapping two argus (NVIDIA’s CSI camera driver) node topics, in order to fool their stereo_proc, on the github issues. No replies, cause they probably want to sell expensive stereo cameras, and I am asking how to do it with $15 Chinese cameras.
I looked at DustyNV’s Mono depth. Probably not going to work. It seems like you can get a good depth estimate for things in the scene, but everything around the edges reads as ‘close’. Not sure that’s practical enough for depth.
I looked at the NVIDIA DNN depth. Needs proper stereo cameras.
I looked at NVIDIA VPI Stereo Disparity pipeline It is the most promising yet, but the input either needs to come from calibrated cameras, or needs to be rectified on-the-fly using OpenCV. This seems like it might be possible in python, but it is not obvious yet how to do it in C++, which the rest of the code is in.
I tried calibration.
I removed the USB cameras.
I attached two RPi 2.1 CSI cameras, from older projects. Deep-dived into the ISAAC_ROS suite. Left ROS2 alone for a bit because it was just getting in the way. The one camera sensor had fuzzy lines going across, horizontally, occasionally, and calibration results were poor and fuzzy. Decided I needed new cameras.
IMX-219 was used by the github author, and I even printed out half of the holder, to hold the cameras 8cm apart.
I tried calibration using the ROS2 cameracalibrator, which is a wrapper for an OpenCV call, after starting up the camera driver node inside the Isaac ROS docker.
(Because of a bug, I also sometimes needed to remove --ros-args --remap.)
OpenCV was able to calibrate, via the ROS2 application, in both cases. So maybe I should just grab the outputs from that. We’ll do that again, now. But I think I need to print out a chessboard and just see how that goes first.
I couldn’t get more than a couple of matches using pictures of the chessboard on the screen, even with binary thresholding, in the author’s calibration notebooks.
Here’s what the NVIDIA VPI 1.2’s samples drew, for my chess boards:
Camera calibration seems to be a serious problem, in the IOT camera world. I want something approximating depth, and it is turning out that there’s some math involved.
Learning about epipolar geometry was not something I planned to do for this.
But this is like a major showstopper, so either, I must rectify, in real time, or I must calibrate.
“The reason for the noisy result is that the VPI algorithm expects the rectified image pairs as input. Please do the rectification first and then feed the rectified images into the stereo disparity estimator.”
So can we use this info? The nvidia post references this code below as the solution, perhaps, within the context of the code below. Let’s run it on the chessboard?
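I won’t paste that code here, but for my own notes, the general shape of on-the-fly rectification in OpenCV, once calibration has produced intrinsics (K, D) for each camera and the stereo extrinsics (R, T), is roughly this (untested sketch):

import cv2

def rectify_pair(frame_l, frame_r, K_l, D_l, K_r, D_r, R, T, image_size):
    # compute the rectification maps (in practice you'd cache these) and remap the frames
    R1, R2, P1, P2, Q, _, _ = cv2.stereoRectify(K_l, D_l, K_r, D_r, image_size, R, T)
    map_lx, map_ly = cv2.initUndistortRectifyMap(K_l, D_l, R1, P1, image_size, cv2.CV_32FC1)
    map_rx, map_ry = cv2.initUndistortRectifyMap(K_r, D_r, R2, P2, image_size, cv2.CV_32FC1)
    rect_l = cv2.remap(frame_l, map_lx, map_ly, cv2.INTER_LINEAR)
    rect_r = cv2.remap(frame_r, map_rx, map_ry, cv2.INTER_LINEAR)
    return rect_l, rect_r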