Categories
AI/ML deep institutes links neuro sim2real simulation

NeuralSim

We’ve gone a totally different way, but this is another interesting project from Erwin Coumans, who is on the Google Brain team and did PyBullet. NeuralSim replaces parts of physics engines with neural networks.

https://sites.google.com/usc.edu/neuralsim

https://github.com/google-research/tiny-differentiable-simulator

Categories
AI/ML Behaviour bio chicken_research control deep dev ears evolution highly_speculative neuro UI

Hierarchical Temporal Memory

Here I’m continuing with the task of unsupervised detection of audio anomalies, hopefully for the purpose of detecting chicken stress vocalisations.

After much fussing around with the old Numenta NuPIC codebase, I’m porting the older nupic.audio and nupic.critic code over to the more recent htm.core.

These are the main parts:

  • Sparse Distributed Representation (SDR)
  • Encoders
  • Spatial Pooler (SP)
  • Temporal Memory (TM)

I’ve come across a very intricate implementation and documentation about understanding the important parts of the HTM model. Way deep, like, how did I get here? I will try to implement the ‘critic’ code first. Or rather, I’ll try to port it from nupic to htm. After further investigation, there are a few options, and I’m going to try editing the hotgym example and shoving wav-file frequency-band scalars through it instead of power consumption data. I’m simplifying the investigation, but I need to make some progress.
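
Roughly, the loop I’m aiming for looks something like this (a sketch modelled on htm.core’s hotgym example; the file name, column layout and parameter values here are placeholder guesses, not tuned):

import csv
from htm.bindings.sdr import SDR
from htm.bindings.encoders import RDSE, RDSE_Parameters
from htm.bindings.algorithms import SpatialPooler, TemporalMemory

# Encode each scalar amplitude as a Sparse Distributed Representation
input_size = 1000
enc_params = RDSE_Parameters()
enc_params.size = input_size
enc_params.sparsity = 0.02
enc_params.resolution = 0.1   # depends on the amplitude range of the band
encoder = RDSE(enc_params)

sp = SpatialPooler(inputDimensions=(input_size,),
                   columnDimensions=(2048,),
                   potentialRadius=input_size,
                   potentialPct=0.85,
                   globalInhibition=True,
                   localAreaDensity=0.02)
tm = TemporalMemory(columnDimensions=(2048,), cellsPerColumn=8)

anomaly_scores = []
with open("band_0.csv") as f:            # hypothetical file: one amplitude per row
    for row in csv.reader(f):
        value = float(row[0])
        encoding = encoder.encode(value)             # scalar -> SDR
        active_columns = SDR(sp.getColumnDimensions())
        sp.compute(encoding, True, active_columns)   # SDR -> active columns
        tm.compute(active_columns, learn=True)       # sequence memory
        anomaly_scores.append(tm.anomaly)            # 0 = predicted, 1 = surprise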

I’m using this docker to get in, mapping my code and wav file folder in:

docker run -d -p 8888:8888 --name jupyter -v /media/chrx/0FEC49A4317DA4DA/sounds/:/home/jovyan/work 3rdman/htm.core-jupyter:latest



So I've got some code working that writes to 'nupic format' (.csv), and code that reads the amplitudes from the csv file and then runs them through htm.core.

So it takes a while, and it's just for 1 band (of 10 bands). I see it also uses the first 1/4 or so of the time to get a sense of what it's dealing with. I'll probably need to run it through twice to get predictive results in that first 1/4.

Ok no, after a few weeks, I've come back to this point, and realise that the top graph is the important one.  Prediction is what's important.  The bottom graphs are the anomaly scores, used by the prediction.  
Frequency Band 0

The idea in nupic.critic, was to threshold changes in X bands. Let’s see the other graphs…

Frequency band 0: 0-480Hz ?
Frequency band 2: 960-1440Hz ?
Frequency band 3: 1440-1920Hz ?
Frequency band 4: 1920-2400Hz ?
Frequency band 5: 2400-2880Hz ?
Frequency band 6: 2880-3360Hz ?

Ok Frequency bands 7, 8, 9 were all zero amplitude. So that’s the highest the frequencies went. Just gotta check what those frequencies are, again…

Opening 307.wav
Sample width (bytes): 2
Frame rate (sampling frequency): 48000
Number of frames: 20771840
Signal length: 20771840
Seconds: 432
Dimensions of periodogram: 4801 x 2163
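
For reference, here’s a sketch of the kind of call that produces a periodogram with that shape (nperseg=9600 and noverlap=0 are my guesses that happen to reproduce 4801 x 2163 for a 48kHz, 432-second file; the actual code may differ):

import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

rate, data = wavfile.read("307.wav")       # rate = 48000
if data.ndim > 1:                          # mix down if there are 2 channels
    data = data.mean(axis=1)

# 9600-sample windows -> 9600/2 + 1 = 4801 frequency bins,
# and 20771840 // 9600 = 2163 time frames
freqs, times, Sxx = spectrogram(data, fs=rate, nperseg=9600, noverlap=0)
print(Sxx.shape)                           # (4801, 2163)

# Split the 4801 bins into 10 buckets, giving one amplitude stream per band
bands = np.array_split(Sxx, 10, axis=0)
band_amplitudes = np.stack([b.mean(axis=0) for b in bands])   # shape (10, 2163)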

Ok, with 10 buckets, 4801 would divide into bands of about 480 each:
Frequency band 0: 0-480Hz
Frequency band 1: 480-960Hz
Frequency band 2: 960-1440Hz
Frequency band 3: 1440-1920Hz
Frequency band 4: 1920-2400Hz
Frequency band 5: 2400-2880Hz
Frequency band 6: 2880-3360Hz

Ok, what else. We could try segmenting the audio by band, so we can narrow in on the relevant frequency range, and then maybe focus on that smaller range again, in higher detail.

Learning features with some labeled data is probably the correct way to do chicken stress vocalisation detection.

Unsupervised anomaly detection might be totally off, in terms of what an anomaly is. It is probably best to zoom in on the relevant bands and to demonstrate a minimal example of what a stressed chicken sounds like vs. a chilled chicken, and compare the spectrograms to see if there’s a tell-tale, visualisable feature.

A score from 1 to 5, for example, is going to be anomalous in arbitrary ways without labelled data. Maybe the chickens are usually stressed, and the anomalies are when they are unstressed, for example.

A change in timing in music might be defined in some way, like 4 out of 7 bands exhibiting anomalous amplitudes. But that probably won’t help for this. It’s probably just going to come down to a very narrow band of interest: possibly pointing it out on a spectrogram that’s zoomed in on the feature, and then feeding the HTM with an encoding of that narrow band of relevant data.
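
For what it’s worth, that kind of rule is only a couple of lines once you have per-band anomaly scores (band_scores here is an assumed array of shape (n_bands, n_timesteps), values in 0..1):

import numpy as np

anomaly_threshold = 0.8     # per-band score needed to count as anomalous
min_bands = 4               # how many bands have to agree

anomalous = band_scores > anomaly_threshold             # (n_bands, n_timesteps) booleans
flagged_timesteps = anomalous.sum(axis=0) >= min_bands  # True where >= 4 bands fire together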


I’ll continue here with some notes on filtering. After much fuss, the sox app (apt-get install sox) does it, sort of. Still working on a Python version (there’s a scipy sketch after the sox commands below).

$ sox 307_0_50.wav filtered_50_0.wav sinc -n 32767 0-480
$ sox 307_0_50.wav filtered_50_1.wav sinc -n 32767 480-960
$ sox 307_0_50.wav filtered_50_2.wav sinc -n 32767 960-1440
$ sox 307_0_50.wav filtered_50_3.wav sinc -n 32767 1440-1920
$ sox 307_0_50.wav filtered_50_4.wav sinc -n 32767 1920-2400
$ sox 307_0_50.wav filtered_50_5.wav sinc -n 32767 2400-2880
$ sox 307_0_50.wav filtered_50_6.wav sinc -n 32767 2880-3360
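
For the Python version, something along these lines should give a similar band-pass (a scipy sketch, not what sox does internally; the filter order is arbitrary):

import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

def bandpass_wav(in_path, out_path, low_hz, high_hz, order=8):
    rate, data = wavfile.read(in_path)
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=rate, output="sos")
    filtered = sosfiltfilt(sos, data.astype(np.float64), axis=0)
    wavfile.write(out_path, rate, np.clip(filtered, -32768, 32767).astype(np.int16))

# band 0 can't start at 0Hz for a band-pass, so nudge the low edge up (or use a low-pass)
bandpass_wav("307_0_50.wav", "filtered_50_0.wav", 1, 480)
bandpass_wav("307_0_50.wav", "filtered_50_1.wav", 480, 960)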


So, sox does seem to be working.  The mel spectrogram is logarithmic, which is why it looks like this.

Visually, it looks like I'm interested in 2048 to 4096 Hz.  That's where I can see the chirps.

Hmm. So I think the spectrogram is confusing everything.

So where does 4800 come from? 48 kHz. 48,000 Hz (48 kHz) is the sample rate “used for DVDs“.

Ah. Right. The spectrogram’s frequency bins are spaced 5 apart, and the full range goes to 24000…?

Sample width (bytes): 2
[    0.     5.    10.    15.    20.   ...   23990. 23995. 24000.]

ok. So 2 x 24000. Maybe 2 channels? Anyway, full range is to 48000Hz. In that case, are the bands actually…

Frequency band 0: 0-4800Hz
Frequency band 1: 4800-9600Hz
Frequency band 2: 9600-14400Hz
Frequency band 3: 14400-19200Hz
Frequency band 4: 19200-24000Hz
Frequency band 5: 24000-28800Hz
Frequency band 6: 28800-33600Hz

Ok so no, it’s half the above: the frequency axis only goes up to the Nyquist frequency, 24000Hz (half the 48kHz sample rate), so with 10 buckets each band is 2400Hz wide.

Frequency band 0: 0-2400Hz
Frequency band 1: 2400-4800Hz
Frequency band 2: 4800-7200Hz
Frequency band 3: 7200-9600Hz
Frequency band 4: 9600-12000Hz
Frequency band 5: 12000-14400Hz
Frequency band 6: 14400-16800Hz

So why is the spectrogram maxing at 8192Hz? Must be spectrogram sampling related.

[Image: ol_hann_win, from a Berkeley document]

So the original signal is 0 to 24000Hz, and the spectrogram must max out at 8192Hz because of the way the spectrogram is made. I’ll try to get back to this when I understand it.

sox 307_0_50.wav filtered_50_0.wav sinc -n 32767 0-2400
sox 307_0_50.wav filtered_50_1.wav sinc -n 32767 2400-4800
sox 307_0_50.wav filtered_50_2.wav sinc -n 32767 4800-7200
sox 307_0_50.wav filtered_50_3.wav sinc -n 32767 7200-9600
sox 307_0_50.wav filtered_50_4.wav sinc -n 32767 9600-12000
sox 307_0_50.wav filtered_50_5.wav sinc -n 32767 12000-14400
sox 307_0_50.wav filtered_50_6.wav sinc -n 32767 14400-16800

Ok I get it now.

Ok, I don’t entirely understand the last two. But basically the mel spectrogram is logarithmic, so those high frequencies really don’t get much love on the mel spectrogram graph. Buggy, maybe.

But I can estimate now the chirp frequencies…

sox 307_0_50.wav filtered_bird.wav sinc -n 32767 1800-5200

Beautiful. So, now to ‘extract the features’…

So, the nupic.critic code with 1 bucket managed to get something resembling the spectrogram. Ignore the blue.

But it looks like maybe, we can even just threshold and count peaks. That might be it.
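
A sketch of the threshold-and-count idea, on the band-passed file (the threshold and minimum peak spacing are made-up placeholders):

import numpy as np
from scipy.io import wavfile
from scipy.signal import find_peaks

rate, data = wavfile.read("filtered_bird.wav")
if data.ndim > 1:
    data = data.mean(axis=1)
envelope = np.abs(data.astype(np.float64))

# peaks must clear an amplitude threshold and be at least 50ms apart
peaks, _ = find_peaks(envelope, height=0.3 * envelope.max(), distance=int(0.05 * rate))

seconds = len(data) / rate
print(f"{len(peaks)} peaks in {seconds:.0f}s = {len(peaks) / seconds:.2f} chirps/sec")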

sox 307.wav filtered_307.wav sinc -n 32767 1800-5200
sox 3072.wav filtered_3072.wav sinc -n 32767 1800-5200
sox 237.wav filtered_237.wav sinc -n 32767 1800-5200
sox 98.wav filtered_98.wav sinc -n 32767 1800-5200

Let’s do the big files…

Ok looks good enough.

So now I’m plotting the ‘chirp density’ (basically volume).

'98.wav'
'237.wav'
'307.wav'
'3072.wav'

In this scheme, we just proxy chirp volume density as a variable representing stress.  We don’t know if it is a true proxy.
As you can see, some recordings have more variation than others.  

Some heuristic could be decided upon, for rating the stress from 1 to 5.  The heuristic depends on how the program would be used.  For example, if it were streaming audio, for an alert system, it might alert upon some duration of time spent above one standard deviation from the rolling mean. I’m not sure how the program would be used though.
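
The ‘above one standard deviation from the rolling mean’ idea, as a pandas sketch (chirps_per_second is assumed to come from the peak counting above; the window lengths are arbitrary):

import pandas as pd

density = pd.Series(chirps_per_second)          # one value per second of audio
rolling_mean = density.rolling(window=60, min_periods=10).mean()
rolling_std = density.rolling(window=60, min_periods=10).std()

above = density > (rolling_mean + rolling_std)
alert = above.rolling(window=30).sum() >= 30    # 30 consecutive seconds above the line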

If the goal were to differentiate stressed and not stressed vocalisations, that would require labelled audio data.   

(Also, basically didn’t end up using HTM, lol)

Categories
Behaviour bio chicken research chicken_research deep dev ears institutes neuro The Chicken Experience

Stress Vocalisations

We’ve spoken with Dr. Maksimiljan Brus, at the University of Maribor, and he’s sent us some WAV file recordings of a large group of chickens.

There seems to be a decent amount of work done, particularly at Georgia Tech, regarding categorizing chicken sounds, to detect stress, or bronchitis, etc. They’ve also done some experiments to see how chickens react to humans and robots. (It takes them about 3 weeks to get used to either).

In researching the topic, there was a useful South African document related to smallholding size chicken businesses. It covers everything. Very good resource, actually, and puts into perspective the relative poverty in the communities where people sell chickens for a living. The profit margin per chicken in 2013 was about R12 per live chicken (less than 1 euro).

From: Production Guidelines for Small-Scale Broiler Enterprises, by K Ralivhesa, W van Averbeke & FK Siebrits.

So anyway, I’m having a look at the sound files, to see what data and features I can extract. There are no labels, so there won’t be any supervised learning here. Anomaly detection doesn’t need labels, and can use moving-window statistics to notice when something is out of the ordinary. So that’s what I’m looking into.

I am personally interested in Numenta’s algorithms, such as HTM, which use a model of cortical columns and sparse encodings to predict and detect anomalies. I looked into getting nupic.critic working, but NuPIC is so old now, written in Python 2, that it’s practically impossible to get running. There is a community fork, htm.core, updated to Python 3, but it’s missing parts of the nupic codebase that nupic.critic relies on. I’m able to convert the sound files to the nupic format, but am stuck, for now, on running the analysis.

So let’s start at a more basic level and work our way up.

I downloaded Praat, an interesting sound analysis program used for some audio research. Not sure if it’s useful here. But it’s able to show various sound features. I’ll close it again, for now.

So, the first thing to do is going to be Mel spectrograms, and possibly Mel Frequency Cepstral Coefficients (MFCCs). The Mel scale is roughly logarithmic: it spaces frequencies by how different they sound to us, so a small step like 250Hz to 500Hz gets about as much room as a step of a few kHz up around 13kHz.

Mel spectrograms let you use visual tools on audio. Also, worth knowing what a feature is, in machine learning. It’s a measurable property.

Ok where to start? Maybe librosa and PyOD?

pip install librosa
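
Something like this is the basic mel spectrogram recipe with librosa (the file name is one of the recordings; n_mels and fmax are common default-ish values, not tuned):

import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

y, sr = librosa.load("307.wav", sr=None)        # keep the native 48kHz rate
S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8192)
S_db = librosa.power_to_db(S, ref=np.max)       # convert power to dB for plotting

librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="mel", fmax=8192)
plt.colorbar(format="%+2.0f dB")
plt.title("Mel spectrogram")
plt.show()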

Ok, and this outlier-detection Medium writeup on PyOD says:

Neural Networks

Neural networks can also be trained to identify anomalies.

Autoencoder (and variational autoencoder) network architectures can be trained to identify anomalies without labeled instances. Autoencoders learn to compress and reconstruct the information in data. Reconstruction errors are then used as anomaly scores.

More recently, several GAN architectures have been proposed for anomaly detection (e.g. MO_GAAL).
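
The autoencoder idea, as a minimal PyTorch sketch (the architecture and frame source are illustrative only; you’d train it on ‘normal’ audio frames with MSE loss, then score new frames by reconstruction error):

import torch
import torch.nn as nn

class FrameAutoencoder(nn.Module):
    def __init__(self, n_features=128):          # e.g. 128 mel bands per frame
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(n_features, 32), nn.ReLU(),
                                     nn.Linear(32, 8))
        self.decoder = nn.Sequential(nn.Linear(8, 32), nn.ReLU(),
                                     nn.Linear(32, n_features))

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_scores(model, frames):
    """Per-frame mean squared reconstruction error, used as the anomaly score."""
    with torch.no_grad():
        recon = model(frames)
    return ((frames - recon) ** 2).mean(dim=1)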

There’s also the results of a group working on this sort of problem, here.

A relevant arxiv too:

Anomalous Sound Detection Based on Interpolation Deep Neural Network

And

Unsupervised Anomalous Sound Detection via Autoencoder Approach

What is this DCASE?

Hmm so there is a challenge for it currently. It’s big in Japan. Here’s the winning solution:

Amazon programmers win an Amazon competition on anomaly detection.

Here was an illustrative example of an anomaly, of some machine sound.

And of course, there are also more traditional (data-science) algorithms. Here’s a medium article overview, for a submission to a heart murmur challenge. It mentions kapre, “Keras Audio Preprocessors – compute STFT, ISTFT, Melspectrogram, and others on GPU real-time.”

And I found 'torchaudio'.

Here’s a useful flowchart from a paper about edge sound analysis on a Teensy. Smart Audio Sensors (SASs). The code “computes the FFT and Mel coefficients of a recorded audio frame.”

Smart Audio Sensors in the Internet of Things Edge for Anomaly Detection

I haven’t mentioned it, but of course FFT, Fast Fourier Transform, which converts audio to frequency bands, is going to be a useful tool, too. “The FFT’s importance derives from the fact that it has made working in the frequency domain equally computationally feasible as working in the temporal or spatial domain. ” – (wikipedia)
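
(For completeness, the numpy version of ‘audio to frequency bands’ is tiny; here on a 0.2-second frame of one of the recordings:)

import numpy as np
from scipy.io import wavfile

rate, data = wavfile.read("307.wav")
frame = data[:rate // 5]                           # a 0.2s frame
if frame.ndim > 1:
    frame = frame.mean(axis=1)
magnitudes = np.abs(np.fft.rfft(frame))            # one magnitude per frequency bin
freqs = np.fft.rfftfreq(len(frame), d=1.0 / rate)  # bin centre frequencies in Hz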

On the synthesis and possibly artistic end, there’s also MelGAN and the like.

Google’s got pipelines in Kubernetes? MLOps stuff.

Artistically speaking, it sounds like we want spectrograms. Someone implements one from scratch here, and there is a link to a good youtube video on relevant sound analysis ideas. Wide-band, vs. narrow-band, for example. Overlapping windows? They’re explaining STFT, which is used to make spectrograms.

There’s also something called Chirp Z transform.

Anyway. Good stuff. As always, I find the hardest part is finding your way back to your various dev environments. Ok, I logged into the Jupyter running in the docker on the Jetson (ifconfig to get the IP, and http://192.168.101.115:8888/lab, voila).

Ok let’s see torchaudio’s colab… and pip install, ok… Here’s a summary of the colab.

Some ghostly Mel spectrogram stuff. Also, interesting ‘To recover a waveform from spectrogram, you can use GriffinLim.’
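
The round trip looks roughly like this in torchaudio (a sketch; the n_fft value is arbitrary, and GriffinLim only estimates the phase, so the result is an approximation):

import torchaudio
import torchaudio.transforms as T

waveform, sr = torchaudio.load("307.wav")
spec = T.Spectrogram(n_fft=1024, power=2.0)(waveform)       # power spectrogram, no phase
reconstructed = T.GriffinLim(n_fft=1024, power=2.0)(spec)   # iterative phase estimate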

Ok let’s get our own dataset prepared. We need an anomaly detector. Let’s see…

———————— <LIBROSA INSTALLATION…> —————

Ok the librosa mel spectrogram is working, at least, so far. So these are the images for the 4 files Dr. Brus sent.

While looking for something like STFT to make a spectrogram video, I came across this resource: Machine Hearing. Also this tome of ML resources.

Classification is maybe the best way to do this ML stuff. Then you can add labels to classes and train a neural network to associate labels and categorise. So it would be ideal if the data were pre-labelled, i.e. classified by chicken stress vocalisation experts. Like here is a soundset with metadata that lets you classify sounds with labels (with training).

So we really do need to use an anomaly detection algorithm, because I listened to the chickens for a little bit, and I’m not getting the nuances.

Here’s a relevant paper, which learns classes, for retroactive labelling. They’re recording a machine making sounds, and then humans label it. They say 1NN (nearest neighbour, i.e. kNN with k=1) is hard to beat, but it’s memory intensive. “Nearest centroid (NC) combined with DBA has been shown to be competitive with kNN at a much smaller computational cost”.

Here’s pyAudioAnalysis

  • Perform unsupervised segmentation (e.g. speaker diarization) and extract audio thumbnails
  • Train and use audio regression models (example application: emotion recognition)
  • Apply dimensionality reduction to visualize audio data and content similarities

Here’s a cool visualiser, in tensorboard.

Ideally, I would like to use NuPIC.

https://numenta.com/resources/biological-and-machine-intelligence/

Ok, let’s hope this old link works, for a nupic docker.

sudo docker run -i -t numenta/nupic /bin/bash

Ok, amazing. Ok, right: trying to install matplotlib inside the docker crashes, on urllib3. I’ve been here before. I asked on the github issues 14 days ago. I got htm.core working, but it doesn’t have the nupic.data classes.

After bashing my head against the apparent impossibility of pip installing urllib3 and matplotlib in a Python 2.7 docker, I’ve decided I will have to port the older nupic.critic or nupic.audio code to htm.core.

I cleared up some harddrive space, and ran this docker:

docker run -d -p 8888:8888 --name jupyter 3rdman/htm.core-jupyter:latest

then get the token for the URL:
docker logs -f jupyter

There’s a lot to go through, and I’m a noob at HTM. So I will start a new article now, on HTM specifically, for this.

Categories
neuro

Reading Thoughts

Elon Musk’s Neuralink now has a robot that can implant on the order of 1000 electrodes into the neocortex, but Yukiyasu Kamitani has been trying to read the brain without invasive surgery for a while now.

“In their Sónar+D Talk, neuroscientist Yukiyasu Kamitani and multidisciplinary artist Daito Manabe explain the processes behind their groundbreaking collaborative show Dissonant Imaginary, whereby AI is used to decode brain visualisation processes.”

He shows pictures to people while they’re in an fMRI machine, measures the blood-flow response, stores something like a pixel map, and tries to recreate the images: https://www.youtube.com/watch?v=pV-PX1UNXmo

(Functional magnetic resonance imaging or functional MRI (fMRI) measures brain activity by detecting changes associated with blood flow. This technique relies on the fact that cerebral blood flow and neuronal activation are coupled.) – https://en.wikipedia.org/wiki/Functional_magnetic_resonance_imaging

Categories
AI/ML neuro

OpenCog

Eventually we’ll need AGI. So Ben Goertzel started OpenCog, which is about cognitive synergy.

OpenCog is a more complicated, richer hybrid approach, which incorporates deep learning along with a lot of other stuff

Looks like a well developed framework https://wiki.opencog.org/w/Hands_On_With_OpenCog

Categories
AI/ML dev envs neuro simulation

OpenAI Gym MultiNEAT

ok also, just saw this: https://gym.openai.com/evaluations/eval_a0YXWDc4SKeJjyTH7IrHBg/

it doesn’t work apparently, but could be salvaged into something,

possibly written by this guy https://blog.otoro.net/

https://attentionagent.github.io/ : “there is no conscious perception of the visual world without attention to it”

# Using ES-HyperNEAT to try to solve the Bipedal walker.
# This attempt was not successful. Adjustment of hyperparameters is likely needed.

# A neural network is trained using NeuroEvolution of Augmenting Topologies
# The idea is from the paper: "Evolving Neural Networks through Augmenting Topologies"
# This gist is using MultiNEAT (http://multineat.com/)

import logging
import numpy as np
import pickle

import gym

import MultiNEAT as NEAT

# NEAT setup
params = NEAT.Parameters()
params.PopulationSize = 200;

params.DynamicCompatibility = True;
params.CompatTreshold = 2.0;
params.YoungAgeTreshold = 15;
params.SpeciesMaxStagnation = 100;
params.OldAgeTreshold = 35;
params.MinSpecies = 5;
params.MaxSpecies = 10;
params.RouletteWheelSelection = False;

params.MutateRemLinkProb = 0.02;
params.RecurrentProb = 0;
params.OverallMutationRate = 0.15;
params.MutateAddLinkProb = 0.08;
params.MutateAddNeuronProb = 0.01;
params.MutateWeightsProb = 0.90;
params.MaxWeight = 8.0;
params.WeightMutationMaxPower = 0.2;
params.WeightReplacementMaxPower = 1.0;

params.MutateActivationAProb = 0.0;
params.ActivationAMutationMaxPower = 0.5;
params.MinActivationA = 0.05;
params.MaxActivationA = 6.0;

params.MutateNeuronActivationTypeProb = 0.03;

params.ActivationFunction_SignedSigmoid_Prob = 0.0;
params.ActivationFunction_UnsignedSigmoid_Prob = 0.0;
params.ActivationFunction_Tanh_Prob = 1.0;
params.ActivationFunction_TanhCubic_Prob = 0.0;
params.ActivationFunction_SignedStep_Prob = 1.0;
params.ActivationFunction_UnsignedStep_Prob = 0.0;
params.ActivationFunction_SignedGauss_Prob = 1.0;
params.ActivationFunction_UnsignedGauss_Prob = 0.0;
params.ActivationFunction_Abs_Prob = 0.0;
params.ActivationFunction_SignedSine_Prob = 1.0;
params.ActivationFunction_UnsignedSine_Prob = 0.0;
params.ActivationFunction_Linear_Prob = 1.0;

params.DivisionThreshold = 0.5;
params.VarianceThreshold = 0.03;
params.BandThreshold = 0.3;
params.InitialDepth = 2;
params.MaxDepth = 3;
params.IterationLevel = 1;
params.Leo = False;
params.GeometrySeed = False;
params.LeoSeed = False;
params.LeoThreshold = 0.3;
params.CPPN_Bias = -1.0;
params.Qtree_X = 0.0;
params.Qtree_Y = 0.0;
params.Width = 1.;
params.Height = 1.;
params.Elitism = 0.1;

rng = NEAT.RNG()
rng.TimeSeed()

# 14 + 10 = 24 input coordinates, matching BipedalWalker's 24-dimensional observation
input_coords = []

for i in range(0, 14):
    input_coords.append((-1. + (2. * i / 13.), -1., 0.))

for i in range(0, 10):
    input_coords.append((-1. + (2. * i / 9.), -0.5, 0.))


substrate = NEAT.Substrate(input_coords,
                           [],
                           [(-1., 1., 0.), (-0.5, 1., 0.), (0.5, 1., 0.), (1., 1., 0.)])

substrate.m_allow_input_hidden_links = False;
substrate.m_allow_input_output_links = False;
substrate.m_allow_hidden_hidden_links = False;
substrate.m_allow_hidden_output_links = False;
substrate.m_allow_output_hidden_links = False;
substrate.m_allow_output_output_links = False;
substrate.m_allow_looped_hidden_links = True;
substrate.m_allow_looped_output_links = False;

substrate.m_allow_input_hidden_links = True;
substrate.m_allow_input_output_links = False;
substrate.m_allow_hidden_output_links = True;
substrate.m_allow_hidden_hidden_links = True;

substrate.m_hidden_nodes_activation = NEAT.ActivationFunction.SIGNED_SIGMOID;
substrate.m_output_nodes_activation = NEAT.ActivationFunction.UNSIGNED_SIGMOID;

substrate.m_with_distance = False;

substrate.m_max_weight_and_bias = 8.0;


def trainNetwork(env, seed):
    # Training parameters
    generationSize = 50
    episode_count = 10
    max_steps = 1000
    # Max reward for environments that reward 1 for each successful step (e.g. CartPole-v0)
    max_reward = episode_count * max_steps

    def evaluate(genome):
        net = NEAT.NeuralNetwork()
        genome.BuildESHyperNEATPhenotype(net, substrate, params)

        cum_reward = 0

        for i in range(episode_count):
            ob = env.reset()
            net.Flush()

            for j in range(max_steps):
                # get next action
                net.Input(ob)
                net.Activate()
                o = net.Output()
                action = np.clip(o,-1,1)
                ob, reward, done, _ = env.step(action)
                cum_reward += reward
                if done:
                    break

        return cum_reward

    # Create initial genome
    g = NEAT.Genome(0, 24, 0, 4, False, 
                    NEAT.ActivationFunction.TANH, NEAT.ActivationFunction.TANH, 0, params)
    pop = NEAT.Population(g, params, True, 1.0, seed)

    current_best = None

    for generation in range(generationSize):
        for i_episode, genome in enumerate(NEAT.GetGenomeList(pop)):
            reward = evaluate(genome)

            if reward == max_reward:
                return pickle.dumps(genome)

            genome.SetFitness(reward)

        print('Generation: {}, max fitness: {}'.format(generation,
                            max((x.GetFitness() for x in NEAT.GetGenomeList(pop)))))
        current_best = pickle.dumps(pop.GetBestGenome())
        pop.Epoch()


    return current_best

env_name = "BipedalWalker"

if __name__ == '__main__':
    # Test the algorithm multiple times
    for test_case in range(0, 1):
        # setup logger, environment and monitor
        logger = logging.getLogger()
        logger.setLevel(logging.INFO)
        env = gym.make("%s-v2" % env_name)
        outdir = "/tmp/neat-%s-results-%d" % (env_name, test_case)
        env.monitor.start(outdir, force=True)

        # Train network
        learned = trainNetwork(env, test_case)

        # Test trained network on 1000 episodes
        learned_genome = pickle.loads(learned)
        net = NEAT.NeuralNetwork()
        learned_genome.BuildESHyperNEATPhenotype( net,substrate, params)

        episode_count = 1000
        max_steps = 1000

        for i in range(episode_count):
            ob = env.reset()
            net.Flush()

            for j in range(max_steps):
                # get next action
                net.Input(ob)
                net.Activate()
                o = net.Output()
                action = np.clip(o,-1,1)
                ob, reward, done, _ = env.step(action)
                if done:
                    break


        # Dump result info to disk
        env.close()
Categories
AI/ML evolution neuro

HyperNEAT

https://github.com/PacktPublishing/Hands-on-Neuroevolution-with-Python.git

Copy pasta:

HyperNEAT: Powerful, Indirect Neural Network Evolution

Hunter Heidenreich, Jan 10, 2019 · 9 min read

https://towardsdatascience.com/hyperneat-powerful-indirect-neural-network-evolution-fba5c7c43b7b

Expanding NeuroEvolution

Last week, I wrote an article about NEAT (NeuroEvolution of Augmenting Topologies) and we discussed a lot of the cool things that surrounded the algorithm. We also briefly touched upon how this older algorithm might even impact how we approach network building today, alluding to the fact that neural networks need not be built entirely by hand.

Today, we are going to dive into a different approach to neuroevolution, an extension of NEAT called HyperNEAT. NEAT, as you might remember, had a direct encoding for its network structure. This was so that networks could be more intuitively evolved, node by node and connection by connection. HyperNEAT drops this idea because in order to evolve a network like the brain (with billions of neurons), one would need a much faster way of evolving that structure.

HyperNEAT is a much more conceptually complex algorithm (in my opinion, at least) and even I am working on understanding the nuts and bolts of how it all works. Today, we will take a look under the hood and explore some of the components of this algorithm so that we might better understand what makes it so powerful and reason about future extensions in this age of deep learning.

HyperNEAT

Motivation

Before diving into the paper and algorithm, I think it’s worth exploring a bit more the motivation behind HyperNEAT.

The full name of the paper is “A Hypercube-Based Indirect Encoding for Evolving Large-Scale Neural Networks”, which is quite the mouthful! But already, we can see two of the major points. It’s a hypercube-based indirect encoding. We’ll get into the hypercube part later, but already we know that it’s a move from direct encodings to indirect encodings (see my last blog on NEAT for a more detailed description of some of the differences between the two). Furthermore, we get the major reasoning behind it as well: For evolving big neural nets!

More than that, the creators of this algorithm highlight that if one were to look at the brain, they see a “network” with billions of nodes and trillions of connections. They see a network that uses repetition of structure, reusing a mapping of the same gene to generate the same physical structure multiple times. They also highlight that the human brain is constructed in a way so as to exploit physical properties of the world: symmetry (have mirrors of structures, two eyes for input for example) and locality (where nodes are in the structure influences their connections and functions).

Contrast this with what we know about neural networks, either built through an evolution procedure or constructed by hand and trained. Do any of these properties hold? Sure, if we force the network to have symmetry and locality, maybe… However, even then, take a dense, feed-forward network where all nodes in one layer are connected to all nodes in the next! And when looking at the networks constructed by the vanilla NEAT algorithm? They tend to be disorganized, sporadic, and don’t exhibit any of these nice regularities.

Enter in HyperNEAT! Utilizing an indirect encoding through something called connective Compositional Pattern Producing Networks (CPPNs), HyperNEAT attempts to exploit geometric properties to produce very large neural networks with these nice features that we might like to see in our evolved networks.

What’s a Compositional Pattern Producing Network?

In the previous post, we discussed encodings and today we’ll dive deeper into the indirect encoding used for HyperNEAT. Now, indirect encodings are a lot more common than you might think. In fact, you have one inside yourself!

DNA is an indirect encoding because the phenotypic results (what we actually see) are orders of magnitude larger than the genotypic content (the genes in the DNA). If you look at a human genome, we’ll say it has about 30,000 genes encoded in approximately 3 billion base pairs. Well, the brain has trillions of connections. Obviously, there is something indirect going on here!

Something borrowed from the ideas of biology is an encoding scheme called developmental encoding. This is the idea that all genes should be able to be reused at any point in time during the developmental process and at any location within the individual. Compositional Pattern Producing Networks (CPPNs) are an abstraction of this concept that have been shown to be able to create patterns for repeating structures in Cartesian space. See some structures that were produced with CPPNs here:

Pure CPPNs

A phenotype can be described as a function of n dimensions, where n is the number of phenotypic traits. What we see is the result of some transformation from genetic encoding to the exhibited traits. By composing simple functions, complex patterns can actually be easily represented. Things like symmetry, repetition, asymmetry, and variation all easily fall out of an encoding structure like this depending on the types of networks that are produced.

We’ll go a little bit deeper into the specifics of how CPPNs are specifically used in this context, but hopefully this gives you the general feel for why and how they are important in the context of indirect encodings.

Tie In to NEAT

In HyperNEAT, a bunch of familiar properties reappear from the original NEAT paper. Things like complexification over time are important (we’ll start simple and evolve complexity if and when it’s needed). Historical markings will be used so that we can properly line up encodings for any sort of crossover. Uniform starting populations will also be used so that there are no wildcard, incompatible networks from the start.

The major difference in how NEAT is used in this paper and the previous? Instead of using the NEAT algorithm to evolve neural networks directly, HyperNEAT uses NEAT to evolve CPPNs. This means that more “activation” functions are used for CPPNs since things like Gaussians give rise to symmetry and trigonometric functions help with repetition of structure.

The Algorithm

So now that we’ve talked about what a CPPN is and that we use the NEAT algorithm to evolve and adjust them, the question is: how are these actually used in the overall HyperNEAT context?

First, we need to introduce the concept of a substrate. In the scope of HyperNEAT, a substrate is simply a geometric ordering of nodes. The simplest example could be a plane or a grid, where each discrete (x, y) point is a node. A connective CPPN will take two of these points and compute the weight of the connection between them. We could think of that as the following equation:

CPPN(x1, y1, x2, y2) = w

Where CPPN is an evolved CPPN, like that of what we’ve discussed in previous sections. We can see that in doing this, every single node will actually have some sort of weight connection between them (even allowing for recurrent connections). Connections can be positive or negative, and a minimum weight magnitude can also be defined so that any outputs below that threshold will result in no connection.
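
A toy sketch of that (mine, not from the paper): pick a grid of substrate points, query a stand-in CPPN for every pair, and zero out the small weights:

import numpy as np

def toy_cppn(x1, y1, x2, y2):
    # stand-in for an evolved CPPN: any composition of simple functions
    return np.sin(3 * x1) * np.cos(3 * x2) + np.exp(-((y1 - y2) ** 2))

grid = [(x, y) for x in np.linspace(-1, 1, 5) for y in np.linspace(-1, 1, 5)]
min_magnitude = 0.2          # outputs below this mean "no connection"

weights = np.zeros((len(grid), len(grid)))
for i, (x1, y1) in enumerate(grid):
    for j, (x2, y2) in enumerate(grid):
        w = toy_cppn(x1, y1, x2, y2)
        weights[i, j] = w if abs(w) >= min_magnitude else 0.0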

The geometric layout of nodes must be specified prior to the evolution of any CPPN. As a result, as the CPPN is evolved, the actual connection weights and network topology will result in a pattern that is geometric (all inputs are based on the positions of nodes).

In the case where the nodes are arranged on some sort of 2 dimensional plane or grid, the CPPN is a function of four dimensions and thus we can say it is being evolved on a four dimensional hypercube. This is where we get the name of the paper!

Regularities in the Produced Patterns

All regularities that we’ve mentioned before can easily fall out of an encoding like this. Symmetry can occur by using symmetric functions over something like x1 and x2. This can be a function like a Gaussian function. Imperfect symmetry can occur when symmetry is used over things like both x and y, but only with respect to one axis.

Repetition also falls out, like we’ve mentioned before, with periodic functions such as sine, cosine, etc. And like with symmetry, variation against repetition can be introduced by inducing a periodic function over a non-repeating aspect of the substrate. Thus all four major regularities that were aimed for are able to develop from this encoding.

Substrate Configuration

You may have guessed from the above that the configuration of the substrate is critical. And that makes a lot of sense. In biology, the structure of something is tied to its functionality. Therefore, in our own evolution schema, the structure of our nodes is tightly linked to the functionality and performance that may be seen on a particular task.

Here, we can see a couple of substrate configurations specifically outlined in the original paper:

I think it’s very important to look at the configuration that is a three dimensional cube and note how it simply adjusts our CPPN equation from four dimensional to six dimensional:

CPPN(x1, y1, z1, x2, y2, z2) = w

Also, the grid can be extended to the sandwich configuration by only allowing for nodes on one half to connect to the other half. This can be seen easily as an input/output configuration! The authors of the paper actually use this configuration to take in visual activation on the input half and use it to activate certain nodes on the output half.

The circular layout is also interesting, as geometry need not be a grid for a configuration. A radial geometry can be used instead, allowing for interesting behavioral properties to spawn out of the unique geometry that a circle can represent.

Input and Output Layout

Inputs and outputs are laid out prior to the evolution of CPPNs. However, unlike a traditional neural network, our HyperNEAT algorithm is made aware of the geometry of the inputs and outputs and can learn to exploit and embrace the regularities of it. Locality and repetition of inputs and outputs can be easily exploited through this extra information that HyperNEAT receives.

Substrate Resolution

Another powerful and unique property of HyperNEAT is the ability to scale the resolution of a substrate up and down. What does that mean? Well, let’s say you evolve a HyperNEAT network based on images of a certain size. The underlying geometry that was exploited to perform well at that size results in the same pattern when scaled to a new size. Except, no extra training is needed. It simply scales to another size!
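
In the toy sketch above, that just means querying the same CPPN on a denser grid: the weight pattern is the same underlying function, sampled at a higher resolution, with no retraining.

import numpy as np

def build_weights(cppn, resolution):
    grid = [(x, y) for x in np.linspace(-1, 1, resolution)
                   for y in np.linspace(-1, 1, resolution)]
    n = len(grid)
    weights = np.zeros((n, n))
    for i, (x1, y1) in enumerate(grid):
        for j, (x2, y2) in enumerate(grid):
            weights[i, j] = cppn(x1, y1, x2, y2)
    return weights

small = build_weights(toy_cppn, 5)     # 25-node substrate
large = build_weights(toy_cppn, 50)    # 2500-node substrate, same underlying pattern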

Summarization of the Algorithm

I think with all that information about how this algorithm works, it’s worth summarizing the steps of it.

  • 1. Choose a substrate configuration (the layout of nodes and where input/output is located)
  • 2. Create a uniform, minimal initial population of connective CPPNs
  • 3. Repeat until a solution is found:
  • 4. For each CPPN:
  • (a) Generate connections for the neural network using the CPPN
  • (b) Evaluate the performance of the neural network
  • 5. Reproduce CPPNs using the NEAT algorithm

Conclusion

And there we have it! That’s the HyperNEAT algorithm. I encourage you to take a look at the paper if you wish to explore more of the details or wish to look at the performance on some of the experiments they did with the algorithm (I particularly enjoy their food gathering robot experiment).

What are the implications for the future? That’s something I’ve been thinking about recently as well. Is there a tie-in from HyperNEAT to training traditional deep networks today? Is this a better way to train deep networks? There’s another paper on Evolvable-Substrate HyperNEAT, where the actual substrates are evolved as well, a paper I wish to explore in the future! But is there something hidden in that paper that bridges the gap between HyperNEAT and deep neural networks? Only time will tell and only we can answer that question!

Hopefully, this article was helpful! If I missed anything or if you have questions, let me know. I’m still learning and exploring a lot of this myself so I’d be more than happy to have a conversation about it on here or on Twitter.


Originally hosted at hunterheidenreich.com.

Categories
AI/ML institutes neuro

Numenta

https://grokstream.com/grok-complete-aiops-solutions/
Categories
AI/ML envs neuro robots simulation Uncategorized

NeuroRobotics

https://neurorobotics.net/

Looks like a sweet project. Should investigate.

There’s a whole OpenSim project for human skeletons and muscles:

https://simtk-confluence.stanford.edu:8443/display/OpenSim/Musculoskeletal+Models

https://simtk.org/projects/nmblmodels