We’ve gone a totally different way, but here’s another interesting project from Erwin Coumans (of PyBullet fame, now on the Google Brain team): NeuralSim, which replaces parts of physics engines with neural networks.
I just found this GitHub from ETH Zürich. Not surprising that they have some of the most relevant datasets I’ve seen pertaining to making proprioceptive autonomous systems. I came across their Autonomous Systems Lab dataset site.
One of the projects, panoptic mapping, is pretty much the panoptic segmentation from earlier research, combined with volumetric point clouds. “A flexible submap-based framework towards spatio-temporally consistent volumetric mapping and scene understanding.”
We’ve spoken with Dr. Maksimiljan Brus, at the University of Maribor, and he’s sent us some WAV file recordings of a large group of chickens.
There seems to be a decent amount of work done, particularly at Georgia Tech, on categorizing chicken sounds to detect stress, bronchitis, etc. They’ve also done some experiments to see how chickens react to humans and robots. (It takes them about 3 weeks to get used to either.)
In researching the topic, I found a useful South African document on smallholder-scale chicken businesses. It covers everything. Very good resource, actually, and it puts into perspective the relative poverty in the communities where people sell chickens for a living: the profit margin in 2013 was about R12 per live chicken (less than 1 euro).
So anyway, I’m having a look at the sound files, to see what data and features I can extract. There are no labels, so supervised learning is off the table here. Anomaly detection doesn’t need labels; it can use moving-window statistics to notice when something is out of the ordinary. So that’s what I’m looking into.
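Just to pin down what I mean by moving-window statistics, here’s a minimal sketch. The window size and threshold are numbers I made up, and it assumes the audio has already been reduced to one value per frame (RMS energy, say):

import numpy as np

def rolling_zscore_anomalies(x, window=1024, threshold=4.0):
    # x: 1-D array of per-frame values (e.g. RMS energy over time).
    # Flag frames that deviate strongly from the trailing window's statistics.
    anomalies = []
    for i in range(window, len(x)):
        past = x[i - window:i]
        mu, sigma = past.mean(), past.std()
        if sigma > 0 and abs(x[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies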
I am personally interested in Numenta’s algorithms, such as HTM, which use a model of cortical columns and sparse encodings to predict and to detect anomalies. I looked into getting nupic.critic working, but NuPIC is so old now (written in Python 2) that it’s practically impossible to get running. There is a community fork, htm.core, updated to Python 3, but it’s missing parts of the NuPIC codebase that nupic.critic relies on. I’m able to convert the sound files to the NuPIC format, but am stuck for now when running the analysis.
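For the record, the conversion step is roughly the following. I’m assuming the usual three-row NuPIC CSV header (field names, field types, special flags), and the frame size and bin names are placeholders, not the exact script:

import csv
import numpy as np
from scipy.io import wavfile

def wav_to_nupic_csv(wav_path, csv_path, n_bins=10, frame_size=2048):
    # Very rough wav -> NuPIC-style CSV: FFT each frame, sum the magnitudes
    # into a few frequency bins, and write one row per frame.
    rate, data = wavfile.read(wav_path)
    if data.ndim > 1:
        data = data.mean(axis=1)                      # mix down to mono
    with open(csv_path, "w", newline="") as f:
        w = csv.writer(f)
        names = ["seconds"] + ["b%d" % i for i in range(n_bins)]
        w.writerow(names)                             # row 1: field names
        w.writerow(["float"] * len(names))            # row 2: field types
        w.writerow([""] * len(names))                 # row 3: special flags
        for start in range(0, len(data) - frame_size, frame_size):
            frame = data[start:start + frame_size]
            spectrum = np.abs(np.fft.rfft(frame))
            bins = [chunk.sum() for chunk in np.array_split(spectrum, n_bins)]
            w.writerow([start / rate] + bins)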
So let’s start at a more basic level and work our way up.
I downloaded Praat, an interesting sound analysis program used for some audio research. Not sure if it’s useful here. But it’s able to show various sound features. I’ll close it again, for now.
So, the first thing to do is going to be Mel spectrograms, and possibly Mel-Frequency Cepstral Coefficients (MFCCs). The Mel scale is roughly logarithmic in frequency, so it spaces things by perceived pitch rather than raw Hz: the jump from 250 Hz to 500 Hz takes up far more of the scale than the jump from 13,250 Hz to 13,500 Hz, which matches how we actually hear.
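For reference, one common version of the conversion (the HTK-style formula; librosa also supports a slightly different Slaney variant):

import numpy as np

def hz_to_mel(f_hz):
    # HTK-style mel conversion; one common variant, not the only one.
    return 2595.0 * np.log10(1.0 + f_hz / 700.0)

print(hz_to_mel(500) - hz_to_mel(250))      # ~263 mel: the low-frequency gap is big
print(hz_to_mel(13500) - hz_to_mel(13250))  # ~20 mel: the same 250 Hz up high is tiny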
Mel spectrograms let you bring visual tools to bear on audio. Also worth knowing what a ‘feature’ is in machine learning: a measurable property of the data.
Ok where to start? Maybe librosa and PyOD?
pip install librosa
Ok, and this Medium write-up on outlier detection with PyOD says:
Neural Networks
Neural networks can also be trained to identify anomalies.
Autoencoder (and variational autoencoder) network architectures can be trained to identify anomalies without labeled instances. Autoencoders learn to compress and reconstruct the information in data. Reconstruction errors are then used as anomaly scores.
More recently, several GAN architectures have been proposed for anomaly detection (e.g. MO_GAAL).
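A minimal sketch of that autoencoder route with PyOD. The layer sizes and contamination value are arbitrary, and the constructor arguments differ a bit between PyOD versions:

import numpy as np
from pyod.models.auto_encoder import AutoEncoder

# X: one row per audio frame, e.g. MFCCs or flattened mel-spectrogram columns
X = np.random.rand(1000, 64).astype(np.float32)   # placeholder data

clf = AutoEncoder(hidden_neurons=[32, 16, 16, 32], epochs=20, contamination=0.05)
clf.fit(X)

scores = clf.decision_scores_   # reconstruction error per row, used as anomaly score
labels = clf.labels_            # 0 = inlier, 1 = outlier (top 5% by score here)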
There are also the results of a group working on this sort of problem, here.
Here was an illustrative example of an anomaly in some machine sound.
And of course there are also more ‘traditional’ data-science algorithms. Here’s a Medium article overview, for a submission to a heart-murmur challenge. It mentions kapre, “Keras Audio Preprocessors – compute STFT, ISTFT, Melspectrogram, and others on GPU real-time.”
Here’s a useful flowchart from a paper about edge sound analysis on a Teensy, “Smart Audio Sensors” (SASs). The code “computes the FFT and Mel coefficients of a recorded audio frame.”
I haven’t mentioned it yet, but of course the FFT, the Fast Fourier Transform, which converts audio into frequency bands, is going to be a useful tool too. “The FFT’s importance derives from the fact that it has made working in the frequency domain equally computationally feasible as working in the temporal or spatial domain.” (Wikipedia)
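The basic recipe with numpy, just to see which frequency bins carry energy in a frame (the filename and frame size are placeholders):

import numpy as np
from scipy.io import wavfile

rate, data = wavfile.read("chickens_01.wav")    # placeholder filename
if data.ndim > 1:
    data = data.mean(axis=1)                    # mix down to mono

frame = data[:4096]                             # one short frame
spectrum = np.abs(np.fft.rfft(frame))           # magnitude per frequency bin
freqs = np.fft.rfftfreq(len(frame), d=1.0 / rate)

print(freqs[np.argmax(spectrum)])               # frequency of the loudest bin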
On the synthesis and possibly artistic end, there’s also MelGAN and the like.
Google’s also got ML pipelines running on Kubernetes? MLOps stuff.
Artistically speaking, it sounds like we want spectrograms. Someone implements one from scratch here, and there’s a link to a good YouTube video on relevant sound-analysis ideas: wide-band vs. narrow-band, overlapping windows, and so on. They’re explaining the STFT (Short-Time Fourier Transform), which is what’s used to make spectrograms.
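A sketch of the STFT with scipy, with the window length doing the wide-band vs. narrow-band trade-off (the sizes here are illustrative, and the filename is a placeholder): a long window gives fine frequency resolution but blurs time; a short window does the opposite.

import numpy as np
from scipy.io import wavfile
from scipy.signal import stft

rate, data = wavfile.read("chickens_01.wav")    # placeholder filename
if data.ndim > 1:
    data = data.mean(axis=1)

# narrow-band: long, heavily overlapping windows (sharp in frequency, blurry in time)
f_nb, t_nb, Z_nb = stft(data, fs=rate, nperseg=2048, noverlap=1536)

# wide-band: short windows (sharp in time, blurry in frequency)
f_wb, t_wb, Z_wb = stft(data, fs=rate, nperseg=256, noverlap=192)

spectrogram_db = 20 * np.log10(np.abs(Z_nb) + 1e-10)   # dB magnitudes for plotting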
Anyway. Good stuff. As always, I find the hardest part is finding your way back to your various dev environments. Ok, I logged into the Jupyter running in the docker on the Jetson (ifconfig to get the IP, then http://192.168.101.115:8888/lab, voilà).
Ok let’s see torchaudio’s colab… and pip install, ok… Here’s a summary of the colab.
Some ghostly Mel spectrogram stuff. Also interesting: ‘To recover a waveform from spectrogram, you can use GriffinLim.’
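A quick sketch of that round trip with torchaudio (the n_fft is arbitrary and the filename is a placeholder; I haven’t checked how it sounds on the chicken recordings):

import torchaudio
import torchaudio.transforms as T

waveform, sample_rate = torchaudio.load("chickens_01.wav")   # placeholder filename

n_fft = 1024
spec = T.Spectrogram(n_fft=n_fft, power=2)(waveform)   # power spectrogram, phase discarded
restored = T.GriffinLim(n_fft=n_fft, power=2)(spec)    # iteratively estimate a waveform back

torchaudio.save("restored.wav", restored, sample_rate)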
Ok let’s get our own dataset prepared. We need an anomaly detector. Let’s see…
———————— <LIBROSA INSTALLATION…> —————
Ok the librosa mel spectrogram is working, at least, so far. So these are the images for the 4 files Dr. Brus sent.
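The images come from roughly this kind of call, for the record (the filename and parameters here are placeholders rather than the exact script):

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load("chickens_01.wav", sr=None)    # keep the original sample rate

S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
S_db = librosa.power_to_db(S, ref=np.max)           # convert power to dB for display

librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="mel", fmax=8000)
plt.colorbar(format="%+2.0f dB")
plt.title("chickens_01.wav")
plt.savefig("chickens_01_mel.png")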
While looking for something like the STFT to make a spectrogram video, I came across this resource: Machine Hearing. Also this tome of ML resources.
Classification is maybe the most straightforward way to do this ML stuff: you attach labels to classes and train a neural network to associate them, i.e. to categorise. So it would be ideal if the data were pre-labelled, i.e. classified by chicken-stress-vocalisation experts. Like here is a sound set with metadata that lets you classify sounds with labels (with training).
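If labelled data did exist, the boring-but-solid baseline would look something like this; the clip names, labels and feature choice are all hypothetical:

import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def mfcc_features(path, n_mfcc=20):
    # Mean MFCC vector per clip: crude, but a common fixed-length feature.
    y, sr = librosa.load(path, sr=None)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return mfcc.mean(axis=1)

# hypothetical expert-labelled clips
clips = [("calm_001.wav", "calm"), ("stress_001.wav", "stressed")]   # etc.

X = np.array([mfcc_features(p) for p, _ in clips])
y = np.array([label for _, label in clips])

clf = RandomForestClassifier(n_estimators=200)
clf.fit(X, y)
print(clf.predict([mfcc_features("unknown_clip.wav")]))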
So we really do need to use an anomaly detection algorithm, because I listened to the chickens for a little bit, and I’m not getting the nuances.
Here’s a relevant paper, which learns classes for retroactive labelling. They record a machine making sounds, and then humans label the clips. They say 1NN (nearest neighbour, i.e. kNN with k=1) is hard to beat, but it’s memory-intensive. “Nearest centroid (NC) combined with DBA has been shown to be competitive with kNN at a much smaller computational cost”.
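In the unlabelled case, the same nearest-neighbour idea can be bent into an anomaly score: the distance from a new frame to its closest “normal” frame. A sketch with scikit-learn, on placeholder data, with a percentile cutoff you’d have to tune:

import numpy as np
from sklearn.neighbors import NearestNeighbors

# X_train: feature rows assumed "normal"; X_new: rows to score (placeholder data)
X_train = np.random.rand(5000, 20)
X_new = np.random.rand(100, 20)

nn = NearestNeighbors(n_neighbors=1).fit(X_train)
scores = nn.kneighbors(X_new)[0].ravel()        # distance to the closest normal frame

# calibrate a cutoff on the training data itself (column 0 is the self-distance)
self_dist = NearestNeighbors(n_neighbors=2).fit(X_train).kneighbors(X_train)[0]
threshold = np.percentile(self_dist[:, 1], 99)

anomalous = scores > threshold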
Ok, let’s hope this old link works, for a nupic docker.
sudo docker run -i -t numenta/nupic /bin/bash
Ok, amazing. Ok right, trying to install matplotlib inside the docker crashes on urllib3. I’ve been here before. Right, I asked on the GitHub issues, 14 days ago. I did get htm.core working, but it doesn’t have the nupic.data classes.
After bashing my head against the apparent impossibility of pip installing urllib3 and matplotlib in a Python 2.7 docker, I’ve decided I will have to port the older nupic.critic or nupic.audio code to htm.core.
I cleared up some harddrive space, and ran this docker:
docker run -d -p 8888:8888 --name jupyter 3rdman/htm.core-jupyter:latest
then get the token for the URL:
docker logs -f jupyter
There’s a lot to go through, and I’m a noob at HTM. So I will start a new article now, on HTM specifically, for this.
Interesting as a guideline for comparison with international efforts, and for perspective on the sort of money in this problem. “The industry could save between $1.5 billion and $2.5 billion each year.” – News Article.
The Egg-Tech Prize Phase II criteria form the basis for the merit-based review, outlined above.
Day and potential to utilize male eggs (up to 25 points).
Minimum: Functions on or before day 8 of incubation. Preference for solutions with reduced incubation time, with pre-incubation most preferred. Protocols involving short periods of incubation during egg storage (SPIDES) will be considered pre-incubation and given preference. Preference will be given to technologies that enable the use of male eggs in other applications.
Accuracy (up to 20 points).
Minimum: 98 percent accuracy. Preference will be given to technologies that work with all chicken breeds/colors commonly used in commercial production.
Economic Feasibility (up to 20 points).
Score for this criterion will consider economic feasibility based on a cost-benefit analysis and business plan that should include:
Direct costs:
Capital costs incurred by technology developer, per hatchery
Capital investment for equipment/structure modification by hatchery
Predicted annual maintenance costs
Predicted annual consumables costs
Predicted personnel training and labor requirements (hours)
Indirect costs:
Expected utilities requirements of technology
Potential revenue models
Lease, subscription, sales, other.
Other revenue streams for developer
Predicted revenues gained for hatchery in diverting eggs, energy savings, labor, cost-savings from not feeding male chicks (depending on country), etc.
Throughput and physical size (up to 15 points)
Potential for sexing at least 15,000 eggs per hour (more preferred). If multiple units will be used in combination to achieve the desired throughput, only one demonstration unit will be required but all units needed to meet the desired throughput must fit into existing hatchery structures, with reasonable and appropriate modifications.
Hatchability (up to 15 points)
Minimum: Does not reduce hatching rate by more than 1.5 percent from baseline.
Speed of test results (up to 5 points)
Results returned in less than 30 min if eggs are tested during incubation (allowable time for removal, testing and return to incubator).† If eggs are tested prior to incubation, with or without SPIDES, results must be available within 48 hours of testing. Accurate tracking and identification of eggs must be demonstrated.
†Longer times until test result will require placing eggs back into the incubator, in which case they must be removed again for sorting.
AdelaiDet is an open source toolbox for multiple instance-level recognition tasks on top of Detectron2. All instance-level recognition works from our group are open-sourced here.
To date, AdelaiDet implements the following algorithms:
One of the involved mini-companies released an open-source tool for tagging and organising pics for ML training purposes: https://voxel51.com/fiftyone/ (user guide: https://voxel51.com/docs/fiftyone/user_guide/index.html). The company has something to do with the University of Michigan, and the tool has been used for “tracking social distancing behaviors” (for COVID-19).
pip install pyyaml==5.1
Successfully built pyyaml
Installing collected packages: pyyaml
Attempting uninstall: pyyaml
Found existing installation: PyYAML 3.12
ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
pip3 install --ignore-installed PyYAML
Successfully installed PyYAML-5.1
Next error...
ModuleNotFoundError: No module named 'torchvision'
pip install torchvision
Next error...
AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx
ok
python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu
[08/17 20:53:11 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=None, opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl', 'MODEL.DEVICE', 'cpu'], output=None, video_input=None, webcam=True)
[08/17 20:53:12 fvcore.common.checkpoint]: Loading checkpoint from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:53:12 fvcore.common.file_io]: Downloading https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
[08/17 20:53:12 fvcore.common.download]: Downloading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
model_final_f10217.pkl: 178MB [01:26, 2.05MB/s]
[08/17 20:54:39 fvcore.common.download]: Successfully downloaded /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl. 177841981 bytes.
[08/17 20:54:39 fvcore.common.file_io]: URL https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl cached in /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:54:39 fvcore.common.checkpoint]: Reading a file from 'Detectron2 Model Zoo'
0it [00:00, ?it/s]/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
0it [00:06, ?it/s]
Traceback (most recent call last):
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'
Ok...
pip install opencv-python
Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (4.2.0.34)
Looks like 4.3.0 vs 4.2.0.34 kinda thing
sudo apt-get install libopencv-*
nope...
/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
def nonzero_tuple(x):
    """
    A 'as_tuple=True' version of torch.nonzero to support torchscript.
    because of https://github.com/pytorch/pytorch/issues/38718
    """
    if x.dim() == 0:
        return x.unsqueeze(0).nonzero().unbind(1)
    return x.nonzero(as_tuple=True).unbind(1)
AttributeError: 'tuple' object has no attribute 'unbind'
https://github.com/pytorch/pytorch/issues/38718
FFS. Why does nothing ever fucking work?
pytorch 1.6:
"putting 1.6.0 milestone for now; this isn't the worst, but it's a pretty bad user experience."
Yeah no shit.
let's try...
return x.nonzero(as_tuple=False).unbind(1)
Ok, next error, same thing, in:
/opt/detectron2/detectron2/modeling/roi_heads/fast_rcnn.py:111
Ok... back to this error (after adding as_tuple=False twice)
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'
Decided to check if maybe this is a conda vs pip thing. Like maybe I just need to install the conda version instead?
But it looks like GTK+ 2.x isn’t installed. Seems I installed OpenCV using pip, i.e. pip install opencv-contrib-python, and that build isn’t compiled with GTK+ 2.x support. I could also use Qt as the graphical interface.
“GTK supposedly uses more memory because GTK provides more functionality. Qt does less and uses less memory. If that is your logic, then you should also look at Aura and the many other user interface libraries providing less functionality.” (link)
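In any case, one way to sidestep the window problem entirely is to skip the demo script’s GUI and write the visualisation to disk instead. Here’s a sketch using detectron2’s DefaultPredictor (the input image path is a placeholder):

import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.DEVICE = "cpu"                      # no NVIDIA driver on this box

predictor = DefaultPredictor(cfg)

img = cv2.imread("frame.jpg")                 # placeholder input image
outputs = predictor(img)

v = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]))
vis = v.draw_instance_predictions(outputs["instances"].to("cpu"))
cv2.imwrite("frame_out.jpg", vis.get_image()[:, :, ::-1])   # no cv2.namedWindow needed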