Finally tried shining light through an egg, and discovered calcium deposits making things opaque. Hmm.
This led to finding out about all these abnormal eggs.
Here I’m continuing with the task of unsupervised detection of audio anomalies, hopefully for the purpose of detecting chicken stress vocalisations.
After much fussing around with the old Numenta NuPIC codebase, I’m porting the older nupic.audio and nupic.critic code over to the more recent htm.core.
These are the main parts:
I’ve come across a very intricate implementation and documentation of the important parts of the HTM model, deep enough that I wondered how I got here. I’ll try to implement the ‘critic’ code first, or rather, port it from nupic to htm. After further investigation there are a few options, and I’m going to try edit the hotgym example and shove wav file frequency band scalars through it instead of power consumption data. I’m simplifying the investigation, but I need to make some progress.
I’m using this docker to get in, mapping my code and wav file folder in:
docker run -d -p 8888:8888 --name jupyter -v /media/chrx/0FEC49A4317DA4DA/sounds/:/home/jovyan/work 3rdman/htm.core-jupyter:latest

So I’ve got some code working that writes to ‘nupic format’ (.csv), and code that reads the amplitudes from the CSV file and then runs them through htm.core. It takes a while, and it’s just for 1 band (of 10 bands). I see it also uses the first 1/4 or so of the time to learn what it’s dealing with, so I’d probably need to run it through twice to get predictive results in that first 1/4. Ok no, after a few weeks I’ve come back to this point, and realise that the top graph is the important one. Prediction is what’s important. The bottom graphs are the anomaly scores, used by the prediction.
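For the record, the htm.core side of that is roughly the following sketch, modelled loosely on the hotgym example. The parameters and the CSV column layout are guesses, not the exact nupic.critic settings.

import csv
from htm.bindings.algorithms import TemporalMemory
from htm.encoders.rdse import RDSE, RDSE_Parameters

# Read one band's amplitudes back out of the 'nupic format' CSV
# (three header rows: names, types, flags); the column layout is an assumption.
with open("307_band0.csv") as f:
    rows = list(csv.reader(f))[3:]
values = [float(r[1]) for r in rows]

# Encode each scalar into an SDR and feed it straight to a TemporalMemory
# (skipping the spatial pooler for brevity); parameter values are guesses.
p = RDSE_Parameters()
p.size = 1000
p.sparsity = 0.02
p.resolution = 0.25
enc = RDSE(p)

tm = TemporalMemory(columnDimensions=(p.size,), cellsPerColumn=8)

anomaly = []
for v in values:
    sdr = enc.encode(v)
    tm.compute(sdr, learn=True)
    anomaly.append(tm.anomaly)   # raw anomaly score per timestep

The anomaly list is what the bottom graphs are plotting; I think the prediction side would come from adding htm.core’s Predictor on top of this.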
The idea in nupic.critic was to threshold changes in X bands. Let’s see the other graphs…
Ok Frequency bands 7, 8, 9 were all zero amplitude. So that’s the highest the frequencies went. Just gotta check what those frequencies are, again…
Opening 307.wav
Sample width (bytes): 2
Frame rate (sampling frequency): 48000
Number of frames: 20771840
Signal length: 20771840
Seconds: 432
Dimensions of periodogram: 4801 x 2163

Ok, with 10 buckets, 4801 would divide into:
Frequency band 0: 0-480Hz
Frequency band 1: 480-960Hz
Frequency band 2: 960-1440Hz
Frequency band 3: 1440-1920Hz
Frequency band 4: 1920-2400Hz
Frequency band 5: 2400-2880Hz
Frequency band 6: 2880-3360Hz
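For reference, the stats above come out of something like this; a sketch assuming a mono 16-bit file, where a 9600-sample window with no overlap reproduces the 4801 x 2163 periodogram shape.

import wave
import numpy as np
from scipy.signal import spectrogram

with wave.open("307.wav", "rb") as w:
    print("Opening 307.wav")
    print("Sample width (bytes):", w.getsampwidth())
    print("Frame rate (sampling frequency):", w.getframerate())
    print("Number of frames:", w.getnframes())
    fs = w.getframerate()
    sig = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

print("Signal length:", len(sig))
print("Seconds:", len(sig) // fs)

# 9600-sample windows with no overlap -> 4801 frequency bins x ~2163 time frames
freqs, times, Sxx = spectrogram(sig, fs=fs, nperseg=9600, noverlap=0)
print("Dimensions of periodogram:", Sxx.shape[0], "x", Sxx.shape[1])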
Ok what else. We could try segment the audio by band, so we can narrow in on the relevant frequency range, and then maybe just focus on that smaller range, again, in higher detail.
Learning features with some labeled data, is probably the correct way to do chicken stress vocalisation detections.
Unsupervised anomaly detection might be totally off, in terms of what an anomaly is. It is probably best, to zoom in on the relevant bands and to demonstrate a minimal example of what a stressed chicken sounds like, vs a chilled chicken, and compare the spectrograms to see if there’s a tell-tale visualisable feature.
A score from 1 to 5 for example, is going to be anomalous in arbitrary ways, without labelled data. Maybe the chickens are usually stressed, and the anomalies are when they are unstressed, for example.
A change in timing in music might be defined in some way, like 4 out of 7 bands exhibiting anomalous amplitudes. But that probably won’t help for this. It’s probably just going to come down to a very narrow band of interest: possibly pointing it out on a spectrogram that’s zoomed in on the feature, and then feeding the HTM with an encoding of that narrow band of relevant data.
I’ll continue here with some notes on filtering. After much fuss, the sox app (apt-get install sox) does it, sort of. Still working on a Python version.
$ sox 307_0_50.wav filtered_50_0.wav sinc -n 32767 0-480
$ sox 307_0_50.wav filtered_50_1.wav sinc -n 32767 480-960
$ sox 307_0_50.wav filtered_50_2.wav sinc -n 32767 960-1440
$ sox 307_0_50.wav filtered_50_3.wav sinc -n 32767 1440-1920
$ sox 307_0_50.wav filtered_50_4.wav sinc -n 32767 1920-2400
$ sox 307_0_50.wav filtered_50_5.wav sinc -n 32767 2400-2880
$ sox 307_0_50.wav filtered_50_6.wav sinc -n 32767 2880-3360

So, sox does seem to be working. The mel spectrogram is logarithmic, which is why it looks like this. Visually, it looks like I’m interested in 2048 to 4096 Hz. That’s where I can see the chirps.
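For the Python version, something like a scipy band-pass gets close. This is a sketch: sox’s sinc -n 32767 is a very sharp FIR filter, so a Butterworth band-pass is only an approximation of it.

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def bandpass_wav(in_path, out_path, low_hz, high_hz, order=10):
    """Rough Python stand-in for `sox in.wav out.wav sinc low-high`."""
    fs, sig = wavfile.read(in_path)
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    filtered = sosfiltfilt(sos, sig.astype(np.float64), axis=0)
    wavfile.write(out_path, fs, filtered.astype(np.int16))

# Same bands as the sox commands above
for i, (lo, hi) in enumerate([(480 * j, 480 * (j + 1)) for j in range(7)]):
    if lo == 0:
        lo = 1  # butter() band edges must be > 0
    bandpass_wav("307_0_50.wav", f"filtered_50_{i}.wav", lo, hi)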
Hmm. So I think the spectrogram is confusing everything.
So where does 4800 come from? 48 kHz. 48,000 Hz (48 kHz) is the sample rate “used for DVDs“.
Ah. Right. The spectrogram bins are 5 Hz wide, and the full range goes to 24000…?
Sample width (bytes): 2
[    0.     5.    10.    15.    20.  …  23990. 23995. 24000.]

(the frequency axis of the periodogram: 5 Hz steps, from 0 up to 24000 Hz)
Ok. So 2 × 24000. Maybe 2 channels? Anyway, maybe the full range is to 48000Hz? In that case, are the bands actually…

Frequency band 0: 0-4800Hz
Frequency band 1: 4800-9600Hz
Frequency band 2: 9600-14400Hz
Frequency band 3: 14400-19200Hz
Frequency band 4: 19200-24000Hz
Frequency band 5: 24000-28800Hz
Frequency band 6: 28800-33600Hz

Ok so no, it’s half the above: the periodogram only goes up to the Nyquist frequency, half the 48 kHz sample rate, i.e. 24000Hz. So with 10 buckets:

Frequency band 0: 0-2400Hz
Frequency band 1: 2400-4800Hz
Frequency band 2: 4800-7200Hz
Frequency band 3: 7200-9600Hz
Frequency band 4: 9600-12000Hz
Frequency band 5: 12000-14400Hz
Frequency band 6: 14400-16800Hz
So why is the spectrogram maxing at 8192Hz? Must be spectrogram sampling related.
So the original signal is 0 to 24000Hz, and the spectrogram must be 8192Hz because… the spectrogram is made some way. I’ll try get back to this when I understand it.
sox 307_0_50.wav filtered_50_0.wav sinc -n 32767 0-2400
sox 307_0_50.wav filtered_50_1.wav sinc -n 32767 2400-4800
sox 307_0_50.wav filtered_50_2.wav sinc -n 32767 4800-7200
sox 307_0_50.wav filtered_50_3.wav sinc -n 32767 7200-9600
sox 307_0_50.wav filtered_50_4.wav sinc -n 32767 9600-12000
sox 307_0_50.wav filtered_50_5.wav sinc -n 32767 12000-14400
sox 307_0_50.wav filtered_50_6.wav sinc -n 32767 14400-16800

Ok, I get it now.
Ok, I don’t entirely understand the last two. But basically the mel spectrogram is logarithmic, so those high frequencies really don’t get much love on the mel spectrogram graph. Buggy, maybe.
But I can estimate now the chirp frequencies…
sox 307_0_50.wav filtered_bird.wav sinc -n 32767 1800-5200
Beautiful. So, now to ‘extract the features’…
So, the nupic.critic code with 1 bucket managed to get something resembling the spectrogram. Ignore the blue.
But it looks like maybe, we can even just threshold and count peaks. That might be it.
sox 307.wav filtered_307.wav sinc -n 32767 1800-5200
sox 3072.wav filtered_3072.wav sinc -n 32767 1800-5200
sox 237.wav filtered_237.wav sinc -n 32767 1800-5200
sox 98.wav filtered_98.wav sinc -n 32767 1800-5200
Let’s do the big files…
Ok looks good enough.
So now I’m plotting the ‘chirp density’ (basically volume).
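A sketch of what I mean by that; the window length and the peak threshold here are arbitrary choices.

import numpy as np
import matplotlib.pyplot as plt
from scipy.io import wavfile
from scipy.signal import find_peaks

fs, sig = wavfile.read("filtered_307.wav")   # the 1800-5200 Hz band-passed file
sig = sig.astype(np.float64)
if sig.ndim > 1:
    sig = sig.mean(axis=1)

win = fs // 2                                # half-second windows
n_win = len(sig) // win
rms = np.array([np.sqrt(np.mean(sig[i * win:(i + 1) * win] ** 2)) for i in range(n_win)])

# Alternative: count peaks above a threshold per window as a crude "chirp count"
peaks, _ = find_peaks(np.abs(sig), height=0.3 * np.abs(sig).max(), distance=fs // 100)
peaks = peaks[peaks < n_win * win]
chirps_per_win = np.bincount(peaks // win, minlength=n_win)

t = np.arange(n_win) * 0.5
plt.plot(t, rms, label="RMS (chirp density proxy)")
plt.plot(t, chirps_per_win, label="peaks per 0.5 s")
plt.xlabel("seconds"); plt.legend(); plt.show()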
In this scheme, we just proxy chirp volume density as a variable representing stress. We don’t know if it is a true proxy.
As you can see, some recordings have more variation than others.
Some heuristic could be decided upon, for rating the stress from 1 to 5. The heuristic depends on how the program would be used. For example, if it were streaming audio, for an alert system, it might alert upon some duration of time spent above one standard deviation from the rolling mean. I’m not sure how the program would be used though.
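As a sketch of that alert idea (all the window sizes and thresholds here are made up):

import numpy as np
import pandas as pd

def stress_alerts(density, window=120, min_duration=20):
    """density: one value per time step (e.g. per half second).
    Returns indices where an alert would fire."""
    s = pd.Series(density)
    mean = s.rolling(window, min_periods=1).mean()
    std = s.rolling(window, min_periods=1).std().fillna(0)
    above = s > (mean + std)                  # more than 1 std above the rolling mean
    # require `min_duration` consecutive steps above the threshold
    run = above.groupby((~above).cumsum()).cumsum()
    return np.where(run >= min_duration)[0]

# e.g. alerts = stress_alerts(rms)   # rms from the chirp-density sketch above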
If the goal were to differentiate stressed and not stressed vocalisations, that would require labelled audio data.
(Also, basically didn’t end up using HTM, lol)
I thought it’s probably worth noting some sci-fi ideas that I remember…
The Ameglian Major Cow (or Dish of the Day) was the cow in Hitchhiker’s Guide book 2 (The Restaurant at the End of the Universe) that had been bred to want to be eaten. When Zaphod Beeblebrox and co. order steak, the cow goes off to shoot itself, and tells the protagonist, Arthur, not to worry: “I’ll be very humane.”
The other thing was chairdogs, that show up in one of the later Dune books. The Tleilaxu are known for taking genetic engineering a bit too far, and one of their exports is a dog that’s been bred until it has become a chair, which massages you. Real ‘creature comforts’. It’s generally used for world building and character development, maybe insinuating that the characters with guilty-pleasure chairdogs, are getting soft.
Interesting, because the artificial selection or genetic engineering leading to these creations is questionable, but the final product is something that inverts or sidesteps morality.
The Ameglian Major Cow is a common thought experiment, as it poses a question to vegetarians: would they eat meat if it were intelligent and wanted to be eaten? It’s very hypothetical though.
Chairdogs are closer: if chickens of tomorrow, or other domesticated animals, are ultimately evolved or engineered into simplified protein vats, their ‘souls’ (i.e. CNS) removed, perhaps we’re left with something less problematic, despite the apparent abominating of nature.
Side note: Alex O’Connor, Cosmic Skeptic, has a lot to say on the philosophy of ethical veganism. Here he answers the question “Do Animals Have a ‘Right to Life’?” (tl;dw: no, but you should eat less meat).
We’ve got an egg in the gym environment now, so we need to collect some data for training the robot to go pick up an egg.
I’m going to have it save the rgba, depth and segmentation images to disk for Unet training. I left out the depth image for now. The pictures don’t look useful. But some papers are using the depth, so I might reconsider. Some weed bot paper uses 14-channel images with all sorts of extra domain specific data relevant to plants.
I wrote some code to take pics if the egg was in the viewport, and it took 1000 rgb and segmentation pictures or so. I need to change the colour of the egg for sure, and probably randomize all the textures a bit. But main thing is probably to make the segmentation layers with pixel colours 0,1,2, etc. so that it detects the egg and not so much the link in the foreground.
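The capture loop was roughly along these lines; this is a sketch, and the camera pose, the body id and the egg-visibility test are placeholders rather than my exact values.

import numpy as np
import imageio
import pybullet as p

EGG_UID = 2   # placeholder: the egg body's unique id in my scene

def capture(step, width=256, height=256):
    view = p.computeViewMatrix(cameraEyePosition=[0.3, 0.3, 0.3],
                               cameraTargetPosition=[0, 0, 0],
                               cameraUpVector=[0, 0, 1])
    proj = p.computeProjectionMatrixFOV(fov=60, aspect=1.0, nearVal=0.01, farVal=10)
    _, _, rgba, depth, seg = p.getCameraImage(width, height, view, proj,
                                              renderer=p.ER_BULLET_HARDWARE_OPENGL)
    rgba = np.reshape(rgba, (height, width, 4)).astype(np.uint8)
    seg = np.reshape(seg, (height, width)).astype(np.int32)

    if (seg == EGG_UID).any():   # only keep frames where the egg is visible
        imageio.imwrite(f"rgb_{step:05d}.jpg", rgba[:, :, :3])
        # background pixels are -1, which wraps to 255 when saved as uint8
        imageio.imwrite(f"seg_{step:05d}.png", seg.astype(np.uint8))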
So, sigmoid to softmax and so on. Switching to multi-class also raises the question of whether to switch to PyTorch and COCO panoptic segmentation based training. That will have to happen eventually, as I think all of the fastest implementations are currently in PyTorch and COCO based. Keras might work fine for multi-class or multiple binary classification, but it’s sort of the beginning attempt: something that works, more proof of concept than final implementation. But I think Keras will be good enough for these in-simulation 256×256 images.
Regarding multi-class segmentation, karolzak says “it’s just a matter of changing num_classes argument and you would need to shape your mask in a different way (layer per class??), so for multiclass segmentation you would need a mask of shape (width, height, num_classes)”.
I’ll keep logging my debugging though, if you’re reading this.
So I ran segmask_linkindex.py to see what it does and how to get more useful data. The code wasn’t running at first because the segmentation image is actually an array of arrays (a numpy array, I presume: rows and columns). So I added a second layer to the loop and printed out the pixel values. When I ran it in the one mode:
-1
-1
-1
83886081 obUid= 1 linkIndex= 4
83886081 obUid= 1 linkIndex= 4
1 obUid= 1 linkIndex= -1
1 obUid= 1 linkIndex= -1
16777217 obUid= 1 linkIndex= 0
16777217 obUid= 1 linkIndex= 0
-1
-1
-1

And in the other mode:

-1
-1
-1
1 obUid= 1 linkIndex= -1
1 obUid= 1 linkIndex= -1
1 obUid= 1 linkIndex= -1
-1
-1
-1
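Decoding those values: with pybullet’s object-and-link-index segmentation flag, the body id sits in the low 24 bits and the link index (plus one) in the high bits, which is what segmask_linkindex.py is unpacking, as far as I can tell.

def decode_seg_value(value):
    """value: one pixel from the segmentation buffer."""
    if value == -1:                    # background / nothing hit
        return None, None
    ob_uid = value & ((1 << 24) - 1)   # low 24 bits: body unique id
    link_index = (value >> 24) - 1     # high bits: link index + 1
    return ob_uid, link_index

print(decode_seg_value(83886081))   # -> (1, 4)
print(decode_seg_value(16777217))   # -> (1, 0)
print(decode_seg_value(1))          # -> (1, -1)  i.e. the base link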
Ok I see. Hmm. Well, the important thing is that this code is indeed for extracting the pixel information. I think it’s going to be best for the segmentation to use the simpler segmentation mask that doesn’t track the link info. Ok, so I had been using that code from the guy’s thesis project, and that was interpolating the numbers. When I look at the unique elements of the mask without interpolation, I’ve got…
[  0   2 255]
[  0   2 255]
[  0   2 255]
[  0   2 255]
[  0   2 255]
[  0   1   2 255]
[  0   1   2 255]
[  0   2 255]
[  0   2 255]

Ok, so I think:
255 is the sky
0 is the plane
2 is the robotable
1 is the egg
So yeah, I was just confused because the segmentation masks were all black and white. But if you look closely with a pixel picker tool, the pixel values are (0,0,0), (1,1,1), (2,2,2), (255,255,255), so I just couldn’t see it.
The interpolation kinda helps, to be honest.
As per OpenAI’s domain randomization helping with Sim2Real, we want to randomize some textures and some other things like that. I also want to throw in some random chickens. Maybe some cats and dogs. I’m afraid of transfer learning, at this stage, because a lot of it has to do with changing the structure of the final layer of the neural network, and that might be tough. Let’s just do chickens and eggs.
An excerpt from OpenAI:
Both techniques increase the computational requirements: dynamics randomization slows training down by a factor of 3x, while learning from images rather than states is about 5-10x slower.
Ok, that’s a bit more complex than I was thinking. I want to randomize textures and colours first.
I’ve downloaded and unzipped the ‘Describable Textures Dataset’.
And ok, it’s loading a random texture for the plane, and a random colour for the egg and chicken.
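The randomization itself is just pybullet’s loadTexture and changeVisualShape; a sketch, with placeholder paths and body ids.

import os
import random
import pybullet as p

DTD_DIR = "dtd/images"                      # unzipped Describable Textures Dataset (placeholder path)
PLANE_UID, EGG_UID, CHICKEN_UID = 0, 2, 3   # placeholder body ids

def randomize_scene():
    # random texture for the plane
    textures = [os.path.join(root, f)
                for root, _, files in os.walk(DTD_DIR)
                for f in files if f.endswith(".jpg")]
    tex = p.loadTexture(random.choice(textures))
    p.changeVisualShape(PLANE_UID, -1, textureUniqueId=tex)

    # random colour for the egg and the chicken
    for uid in (EGG_UID, CHICKEN_UID):
        p.changeVisualShape(uid, -1, rgbaColor=[random.random(),
                                                random.random(),
                                                random.random(), 1])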
Ok, next thing is the Simulation CNN.
Interpolation doesn’t work for this though, because it interpolates from what’s available in the image:
[  0  85 170 255]
[  0  63 127 191 255]
[  0  63 127 191 255]

I kind of need the basic UID segmentation:

[  0   1   2   3 255]

Ok, pity about the mask colours, but anyway.
Let’s train the UNet on the new dataset.
We’ll need to make karolzak’s changes.
I’ve saved 2000+ rgb.jpg and seg.png files and we’ve got [0,1,2,3,255] [plane, egg, robot, chicken, sky]
So num_classes=5
And
“for multiclass segmentation you would need a mask of shape (width, height, num_classes) “
What is y.shape?
(2001, 256, 256, 1)
which is 2001 files of 256 × 256 pixels and one class. So if I change that to 5…?

ValueError: cannot reshape array of size 131137536 into shape (2001,256,256,5)
Um… Ok I need to do more research. Brb.
So the keras_unet library is set up to input binary masks per class, and output binary masks per class.
I would rather use the ‘integer’ class output, and have it output a single array, with the class id per pixel. Similar to this question. In preparation for karolzak probably not knowing how to do this with his library, I’ve asked on stackoverflow for an elegant way to make the binary masks from a multi-class mask, in the meantime.
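In the meantime, going from the integer mask to the (width, height, num_classes) binary stack (and back) is only a couple of lines of numpy; the class values here are the [0, 1, 2, 3, 255] from my masks.

import numpy as np

CLASS_VALUES = [0, 1, 2, 3, 255]   # plane, egg, robot, chicken, sky

def to_binary_masks(mask):
    """mask: (H, W) integer mask -> (H, W, num_classes) float32 one-hot stack."""
    return np.stack([(mask == v) for v in CLASS_VALUES], axis=-1).astype(np.float32)

def to_integer_mask(binary):
    """(H, W, num_classes) -> (H, W) with the original class values."""
    return np.array(CLASS_VALUES, dtype=np.uint8)[binary.argmax(axis=-1)]

# y becomes (2001, 256, 256, 5) instead of (2001, 256, 256, 1):
# y = np.stack([to_binary_masks(m[..., 0]) for m in y])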
I coded it up using the library author’s suggested method, as he pointed out that the gains of the integer encoding method are minimal. I’ll check it out another time. I think it might still make sense for certain cases.
Ok that’s pretty awesome. We have 4 masks. Human, chicken, egg, robot. I left out plane and sky for now. That was just 2000 images of training, and I have 20000. I trained on another 2000 images, and it’s down to 0.008 validation loss, which is good enough!
So now I want to load the CNN model in the locomotion code, and feed it the images from the camera, and then have a reward function related to maximizing the egg pixels.
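The reward I have in mind is basically ‘fraction of camera pixels the UNet thinks are egg’. A sketch, assuming the Keras model from above and a hypothetical egg channel index:

import numpy as np

EGG_CHANNEL = 1   # assumed index of the egg mask in the UNet output

def egg_pixel_reward(model, rgb):
    """rgb: (256, 256, 3) uint8 camera image from pybullet."""
    x = rgb[np.newaxis].astype(np.float32) / 255.0
    pred = model.predict(x)[0]           # (256, 256, num_classes) masks
    egg_mask = pred[..., EGG_CHANNEL] > 0.5
    return egg_mask.mean()               # fraction of egg pixels, 0..1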
I also need to look at the pybullet-planning project and see what it consists of, as I imagine they’ve made some progress on the next steps. “built-in implementations of standard motion planners, including PRM, RRT, biRRT, A* etc.” – I haven’t even come across these acronyms yet! Ok, they are motion planning. Solvers of some sort. Hmm.
Distributed Evolutionary Algorithms in Python
An oldie but a goodie. Was thinking to just implement the locomotion using good old genetic programming.
You could probably generate a walking robot using a genetic algorithm that repeated the tree’s actions a number of times. Bet it would be faster than the whole neural network policy training replay buffer spiel.
https://deap.readthedocs.io/en/master/tutorials/advanced/gp.html
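A skeleton of how that might look in DEAP, evolving a short repeated sequence of joint targets; the fitness function, which would need to replay the sequence in the pybullet sim and return distance walked, is stubbed out.

import random
from deap import base, creator, tools, algorithms

N_JOINTS, SEQ_LEN = 12, 8   # assumption: 8 repeated keyframes of 12 joint targets

creator.create("FitnessMax", base.Fitness, weights=(1.0,))
creator.create("Individual", list, fitness=creator.FitnessMax)

toolbox = base.Toolbox()
toolbox.register("attr", random.uniform, -1.0, 1.0)
toolbox.register("individual", tools.initRepeat, creator.Individual,
                 toolbox.attr, n=N_JOINTS * SEQ_LEN)
toolbox.register("population", tools.initRepeat, list, toolbox.individual)

def evaluate(ind):
    # Stub: would replay the keyframe sequence in the pybullet sim a few times
    # and return the distance the robot walked.
    return (sum(ind),)   # placeholder fitness

toolbox.register("evaluate", evaluate)
toolbox.register("mate", tools.cxTwoPoint)
toolbox.register("mutate", tools.mutGaussian, mu=0.0, sigma=0.2, indpb=0.1)
toolbox.register("select", tools.selTournament, tournsize=3)

pop = toolbox.population(n=50)
pop, log = algorithms.eaSimple(pop, toolbox, cxpb=0.5, mutpb=0.2,
                               ngen=30, verbose=False)
best = tools.selBest(pop, 1)[0]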
Another PhD collab thing, on a European Research Council grant (2015-2020): https://www.resibots.eu/videos.html. Nice. They’re the ones who developed MAP-Elites: https://arxiv.org/abs/1504.04909
They had a paper published in Nature (https://members.loria.fr/JBMouret/nature_press.html), for their bots that fix themselves.
MAP-Elites is interesting. It categorises behaviours and tests local optima along some sort of chosen dimensions of variation. Haven’t read the paper yet. It is windy.
“It creates a map of high-performing solutions at each point in a space defined by dimensions of variation that a user gets to choose. This Multi-dimensional Archive of Phenotypic Elites (MAP-Elites) algorithm illuminates search spaces, allowing researchers to understand how interesting attributes of solutions combine to affect performance, either positively or, equally of interest, negatively. “
The TD methods were further compared to variants of NEAT and HyperNEAT (that is, OS, ONS, NS, OGN and GNS) with and without behavior transfer. The results demonstrated that the ONS variant of HyperNEAT performs much better (with respect to effectiveness and efficiency) than both TD methods and all variants of NEAT. Specific evolutionary search methods to direct NE, such as behavior diversity maintenance and the hybrid approach, work most effectively at balancing exploration versus exploitation in the search space, more so than TD methods.
Evolutionary search approaches investigated were objective-based search (OS), novelty search (NS), genotypic diversity search (GNS), hybrid of objective and novelty search and hybrid of objective based and genotypic diversity maintenance search (ONS and OGN, respectively). In this thesis, three methodological features were explored to ascertain an appropriate combination that enables the evolution of high quality solutions based on effectiveness (task performance) and efficiency (speed of adaptation) of evolved behaviors. These features are as follows: First, direct versus indirect encoding neuro-evolution methods for collective behavior evolution (that is, NEAT and HyperNEAT, respectively). Second, non-objective evolutionary search versus objective based search approach for guiding collective behavior evolution. Third, neuro-evolution with collective behavior transfer.
“with behavior transfer” referring more or less to crossover/mutate rather than starting from scratch with new individuals. Something like that.
https://www.food4rhino.com/app/octopus#lg=1&slide=7
and this bonkers shape searcher software https://youtu.be/SlyXJEO76BI https://www.food4rhino.com/app/octopus
https://www.mitpressjournals.org/doi/full/10.1162/ARTL_a_00210 this was such a good find, but the rest of their site wasn’t cooperating
http://cognet.mit.edu/journals/evolutionary-computation/28/1 it would be cool if I could view these PDFs. SA IP range banned.
https://github.com/PacktPublishing/Hands-on-Neuroevolution-with-Python.git
Copy pasta:
Hunter Heidenreich, Jan 10, 2019
https://towardsdatascience.com/hyperneat-powerful-indirect-neural-network-evolution-fba5c7c43b7b
Last week, I wrote an article about NEAT (NeuroEvolution of Augmenting Topologies) and we discussed a lot of the cool things that surrounded the algorithm. We also briefly touched upon how this older algorithm might even impact how we approach network building today, alluding to the fact that neural networks need not be built entirely by hand.
Today, we are going to dive into a different approach to neuroevolution, an extension of NEAT called HyperNEAT. NEAT, as you might remember, had a direct encoding for its network structure. This was so that networks could be more intuitively evolved, node by node and connection by connection. HyperNEAT drops this idea because in order to evolve a network like the brain (with billions of neurons), one would need a much faster way of evolving that structure.
HyperNEAT is a much more conceptually complex algorithm (in my opinion, at least) and even I am working on understanding the nuts and bolts of how it all works. Today, we will take a look under the hood and explore some of the components of this algorithm so that we might better understand what makes it so powerful and reason about future extensions in this age of deep learning.
Before diving into the paper and algorithm, I think it’s worth exploring a bit more the motivation behind HyperNEAT.
The full name of the paper is “A Hypercube-Based Indirect Encoding for Evolving Large-Scale Neural Networks”, which is quite the mouthful! But already, we can see two of the major points. It’s a hypercube-based indirect encoding. We’ll get into the hypercube part later, but already we know that it’s a move from direct encodings to indirect encodings (see my last blog on NEAT for a more detailed description of some of the differences between the two). Furthermore, we get the major reasoning behind it as well: For evolving big neural nets!
More than that, the creators of this algorithm highlight that if one were to look at the brain, they see a “network” with billions of nodes and trillions of connections. They see a network that uses repetition of structure, reusing a mapping of the same gene to generate the same physical structure multiple times. They also highlight that the human brain is constructed in a way so as to exploit physical properties of the world: symmetry (have mirrors of structures, two eyes for input for example) and locality (where nodes are in the structure influences their connections and functions).
Contrast this with what we know about neural networks, either built through an evolution procedure or constructed by hand and trained. Do any of these properties hold? Sure, if we force the network to have symmetry and locality, maybe… However, even then, take a dense, feed-forward network where all nodes in one layer are connected to all nodes in the next! And when looking at the networks constructed by the vanilla NEAT algorithm? They tend to be disorganized, sporadic, and not exhibit any of these nice regularities.
Enter in HyperNEAT! Utilizing an indirect encoding through something called connective Compositional Pattern Producing Networks (CPPNs), HyperNEAT attempts to exploit geometric properties to produce very large neural networks with these nice features that we might like to see in our evolved networks.
In the previous post, we discussed encodings and today we’ll dive deeper into the indirect encoding used for HyperNEAT. Now, indirect encodings are a lot more common than you might think. In fact, you have one inside yourself!
DNA is an indirect encoding because the phenotypic results (what we actually see) are orders of magnitude larger than the genotypic content (the genes in the DNA). If you look at a human genome, we’ll say it has about 30,000 genes coding for approximately 3 billion amino acids. Well, the brain has 3 trillion connections. Obviously, there is something indirect going on here!
Something borrowed from the ideas of biology is an encoding scheme called developmental encoding. This is the idea that all genes should be able to be reused at any point in time during the developmental process and at any location within the individual. Compositional Pattern Producing Networks (CPPNs) are an abstraction of this concept that have been shown to be able to create patterns for repeating structures in Cartesian space. See some structures that were produced with CPPNs here:
A phenotype can be described as a function of n dimensions, where n is the number of phenotypic traits. What we see is the result of some transformation from genetic encoding to the exhibited traits. By composing simple functions, complex patterns can actually be easily represented. Things like symmetry, repetition, asymmetry, and variation all easily fall out of an encoding structure like this depending on the types of networks that are produced.
We’ll go a little bit deeper into the specifics of how CPPNs are specifically used in this context, but hopefully this gives you the general feel for why and how they are important in the context of indirect encodings.
In HyperNEAT, a bunch of familiar properties reappear from the original NEAT paper. Things like complexification over time are important (we’ll start with simple and evolve complexity if and when it’s needed). Historical markings will be used so that we can properly line up encodings for any sort of crossover. Uniform starting populations will also be used so that there’s no wildcard, incompatible networks from the start.
The major difference in how NEAT is used in this paper and the previous? Instead of using the NEAT algorithm to evolve neural networks directly, HyperNEAT uses NEAT to evolve CPPNs. This means that more “activation” functions are used for CPPNs since things like Gaussians give rise to symmetry and trigonometric functions help with repetition of structure.
So now that we’ve talked about what a CPPN is and that we use the NEAT algorithm to evolve and adjust them, the question is how these are actually used in the overall HyperNEAT context.
First, we need to introduce the concept of a substrate. In the scope of HyperNEAT, a substrate is simply a geometric ordering of nodes. The simplest example could be a plane or a grid, where each discrete (x, y) point is a node. A connective CPPN will actually take two of these points and compute the weight between these two nodes. We could think of that as the following equation:
CPPN(x1, y1, x2, y2) = w
Where CPPN is an evolved CPPN, like the ones we’ve discussed in previous sections. We can see that in doing this, every single node will actually have some sort of weight connection between them (even allowing for recurrent connections). Connections can be positive or negative, and a minimum weight magnitude can also be defined so that any outputs below that threshold will result in no connection.
The geometric layout of nodes must be specified prior to the evolution of any CPPN. As a result, as the CPPN is evolved, the actual connection weights and network topology will result in a pattern that is geometric (all inputs are based on the positions of nodes).
In the case where the nodes are arranged on some sort of 2 dimensional plane or grid, the CPPN is a function of four dimensions and thus we can say it is being evolved on a four dimensional hypercube. This is where we get the name of the paper!
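To make the substrate idea concrete (this is my own toy sketch, not from the article): query a CPPN, here just a hand-written function standing in for an evolved one, at every pair of grid points to fill in the weight matrix.

import numpy as np

def toy_cppn(x1, y1, x2, y2):
    """Stand-in for an evolved CPPN: symmetric in x, periodic in y."""
    return np.exp(-((x1 - x2) ** 2)) * np.sin(3 * (y1 + y2))

# Substrate: a 5x5 grid of nodes in [-1, 1] x [-1, 1]
coords = [(x, y) for x in np.linspace(-1, 1, 5) for y in np.linspace(-1, 1, 5)]

weights = np.zeros((len(coords), len(coords)))
for i, (x1, y1) in enumerate(coords):
    for j, (x2, y2) in enumerate(coords):
        w = toy_cppn(x1, y1, x2, y2)                  # CPPN(x1, y1, x2, y2) = w
        weights[i, j] = w if abs(w) > 0.2 else 0.0    # minimum-magnitude threshold -> no connection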
All regularities that we’ve mentioned before can easily fall out of an encoding like this. Symmetry can occur by using symmetric functions over something like x1 and x2. This can be a function like a Gaussian function. Imperfect symmetry can occur when symmetry is used over things like both x and y, but only with respect to one axis.
Repetition also falls out, like we’ve mentioned before, with periodic functions such as sine, cosine, etc. And like with symmetry, variation against repetition can be introduced by inducing a periodic function over a non-repeating aspect of the substrate. Thus all four major regularities that were aimed for are able to develop from this encoding.
You may have guessed from the above that the configuration of the substrate is critical. And that makes a lot of sense. In biology, the structure of something is tied to its functionality. Therefore, in our own evolution schema, the structure of our nodes is tightly linked to the functionality and performance that may be seen on a particular task.
Here, we can see a couple of substrate configurations specifically outlined in the original paper:
I think it’s very important to look at the configuration that is a three dimensional cube and note how it simply adjusts our CPPN equation from four dimensional to six dimensional:
CPPN(x1, y1, z1, x2, y2, z2) = w
Also, the grid can be extended to the sandwich configuration by only allowing for nodes on one half to connect to the other half. This can be seen easily as an input/output configuration! The authors of the paper actually use this configuration to take in visual activation on the input half and use it to activate certain nodes on the output half.
The circular layout is also interesting, as geometry need not be a grid for a configuration. A radial geometry can be used instead, allowing for interesting behavioral properties to spawn out of the unique geometry that a circle can represent.
Inputs and outputs are laid out prior to the evolution of CPPNs. However, unlike a traditional neural network, our HyperNEAT algorithm is made aware of the geometry of the inputs and outputs and can learn to exploit and embrace the regularities of it. Locality and repetition of inputs and outputs can be easily exploited through this extra information that HyperNEAT receives.
Another powerful and unique property of HyperNEAT is the ability to scale the resolution of a substrate up and down. What does that mean? Well, let’s say you evolve a HyperNEAT network based on images of a certain size. The underlying geometry that was exploited to perform well at that size results in the same pattern when scaled to a new size. Except, no extra training is needed. It simply scales to another size!
I think with all that information about how this algorithm works, it’s worth summarizing the steps of it.
And there we have it! That’s the HyperNEAT algorithm. I encourage you to take a look at the paper if you wish to explore more of the details or wish to look at the performance on some of the experiments they did with the algorithm (I particularly enjoy their food gathering robot experiment).
What are the implications for the future? That’s something I’ve been thinking about recently as well. Is there a tie-in from HyperNEAT to training traditional deep networks today? Is this a better way to train deep networks? There’s another paper, Evolvable Substrate HyperNEAT, where the actual substrates are evolved as well, a paper I wish to explore in the future! But is there something hidden in that paper that bridges the gap between HyperNEAT and deep neural networks? Only time will tell and only we can answer that question!
Hopefully, this article was helpful! If I missed anything or if you have questions, let me know. I’m still learning and exploring a lot of this myself so I’d be more than happy to have a conversation about it on here or on Twitter.
And if you want to read more of what I’ve written, maybe check out:
Originally hosted at hunterheidenreich.com.