Categories
CNNs Vision

Visualize CNNs

https://github.com/fg91/visualizing-cnn-feature-maps


“There are two main ways to try to understand how a neural network recognizes a certain pattern. If you want to know what kind of pattern significantly activates a certain feature map you could 1) either try to find images in a dataset that result in a high average activation of this feature map or you could 2) try to generate such a pattern by optimizing the pixel values in a random image. The latter idea was proposed by Erhan et al. 2009.”

from: https://towardsdatascience.com/how-to-visualize-convolutional-features-in-40-lines-of-code-70b7d87b0030


# imports needed for the snippet below; the article also relies on fastai 0.7-era helpers
# (vgg16, set_trainable, tfms_from_model, V) and on its SaveFeatures hook (sketched further down)
import cv2
import numpy as np
import torch
import matplotlib.pyplot as plt

class FilterVisualizer():
    def __init__(self, size=56, upscaling_steps=12, upscaling_factor=1.2):
        self.size, self.upscaling_steps, self.upscaling_factor = size, upscaling_steps, upscaling_factor
        self.model = vgg16(pre=True).cuda().eval()
        set_trainable(self.model, False)

    def visualize(self, layer, filter, lr=0.1, opt_steps=20, blur=None):
        sz = self.size
        img = np.uint8(np.random.uniform(150, 180, (sz, sz, 3)))/255  # generate random image
        activations = SaveFeatures(list(self.model.children())[layer])  # register hook

        for _ in range(self.upscaling_steps):  # scale the image up upscaling_steps times
            train_tfms, val_tfms = tfms_from_model(vgg16, sz)
            img_var = V(val_tfms(img)[None], requires_grad=True)  # convert image to Variable that requires grad
            optimizer = torch.optim.Adam([img_var], lr=lr, weight_decay=1e-6)
            for n in range(opt_steps):  # optimize pixel values for opt_steps times
                optimizer.zero_grad()
                self.model(img_var)
                loss = -activations.features[0, filter].mean()
                loss.backward()
                optimizer.step()
            img = val_tfms.denorm(img_var.data.cpu().numpy()[0].transpose(1,2,0))
            self.output = img
            sz = int(self.upscaling_factor * sz)  # calculate new image size
            img = cv2.resize(img, (sz, sz), interpolation = cv2.INTER_CUBIC)  # scale image up
            if blur is not None: img = cv2.blur(img,(blur,blur))  # blur image to reduce high frequency patterns
        self.save(layer, filter)
        activations.close()
        
    def save(self, layer, filter):
        plt.imsave("layer_"+str(layer)+"_filter_"+str(filter)+".jpg", np.clip(self.output, 0, 1))

and use it like this:

layer = 40
filter = 265
FV = FilterVisualizer(size=56, upscaling_steps=12, upscaling_factor=1.2)
FV.visualize(layer, filter, blur=5)
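
SaveFeatures isn’t defined in the snippet above; it comes from the source article. A minimal sketch of such a forward-hook helper, reconstructed from how it is called (attribute names assumed):

class SaveFeatures():
    """Minimal forward-hook helper, assumed from its usage above."""
    def __init__(self, module):
        # stash the layer's output every time the model runs a forward pass
        self.hook = module.register_forward_hook(self.hook_fn)
    def hook_fn(self, module, input, output):
        # keep the activations attached to the graph so -features[0, filter].mean() can backprop
        self.features = output
    def close(self):
        self.hook.remove()
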
Categories
chicken_research CNNs deep

Bird audio CNNs

https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13103

A 2017 contest for bird-call detection neural nets. The best entries fed mel spectrograms of the audio into convolutional neural networks.

The winning algorithm’s code is available: https://jobim.ofai.at/gitlab/gr/bird_audio_detection_challenge_2017/tree/master
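
For context on the mel-spectrogram front end: it can be computed in a few lines with librosa. A minimal sketch (the file name and parameter values are placeholders, not the winning entry’s settings):

# Minimal sketch: turn an audio clip into a log-mel spectrogram "image" for a CNN.
import librosa
import numpy as np

y, sr = librosa.load("bird_clip.wav", sr=22050)            # load mono audio
mel = librosa.feature.melspectrogram(y=y, sr=sr,
                                     n_fft=1024, hop_length=512, n_mels=80)
log_mel = librosa.power_to_db(mel, ref=np.max)             # dB scale, CNN-friendly
# log_mel is a 2D array (n_mels x frames) that can be treated like a grayscale image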

Categories
Gripper

Robot Grasping

https://twimlai.com/dex-net-and-the-third-wave-of-robot-learning/

According to Ken, there are three fundamental elements of uncertainty that make robot grasping extremely difficult:

Perception. Understanding the precise geometry of where everything is in a scene can be a complex task. There have been developments in depth sensors like LIDAR, “but they still don’t completely solve this problem because if there’s anything reflective or transparent on the surface, that causes the light to react in unpredictable ways, it doesn’t register as a correct position of where that surface really is.” Adding additional sensors doesn’t help much because they often create contradictions, “[the agent] doesn’t know what to trust” in order to act correctly. Perception is especially important in grasping because “a millimeter or less can make the difference between holding something and dropping it.”

Control. The robot has to maintain control of its grasp, meaning, “The robot has to now get its gripper to the precise position in space, consistent with what it believes is happening from its sensors.” If the gripper moves slightly or holds the object too tightly, it can drop or break.

Physics. This has to do with choosing the right place to grasp the object; friction and mass are significant unknowns. To demonstrate how difficult this is, Ken gives the example of pushing a pencil across the table with your finger. We can estimate the pencil’s center of mass, but we ultimately do not know the frictional properties at play. It’s almost impossible to predict the trajectory because even “one microscopic grain of sand, anything under there is going to cause it to behave extremely differently.”

https://berkeleyautomation.github.io/dex-net/

The first wave is the “classic physics” approach which prioritizes traditional understandings of physics in terms of forces, torques, friction, mass — all that good stuff. The second wave is the more modern, “data-driven approaches that say: ‘Forget about the physics, let’s just learn it from observation purely’” and assume the physics will be learned naturally in the process.

Then there’s what Ken advocates for, which is the third wave of robot learning that combines the two fields of thought. The goal is to synthesize the knowledge from both perspectives to optimize performance. However, “figuring out where that combination is is the challenge. And that’s really the story of Dex-Net.”

Categories
Vision

Monocular SLAM

For drawing a map of a place (Simultaneous Localization and Mapping); monocular means a single camera.

https://vision.in.tum.de/research/vslam/lsdslam

https://ubilang.wordpress.com/2016/05/07/orb-slam-vs-lsd-slam/

and point clouds

http://pointclouds.org/blog/tocs/alexandrov/index.php

https://github.com/PointCloudLibrary/pcl

https://github.com/raulmur/ORB_SLAM2

https://github.com/tum-vision/lsd_slam
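
ORB-SLAM’s front end is built on ORB keypoints, which OpenCV exposes directly. A minimal sketch of just that feature-detection-and-matching piece (file names are placeholders; this is not the SLAM pipeline itself):

# Sketch: ORB keypoint detection and matching between two frames with OpenCV.
import cv2

img1 = cv2.imread("frame1.png", cv2.IMREAD_GRAYSCALE)   # placeholder filenames
img2 = cv2.imread("frame2.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create(nfeatures=1000)
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# brute-force Hamming matcher is the usual pairing for binary ORB descriptors
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)

out = cv2.drawMatches(img1, kp1, img2, kp2, matches[:50], None)
cv2.imwrite("matches.png", out)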

Categories
Vision

ImageHub and ImageNodes

I set up https://github.com/jeffbass/imagenode, https://github.com/jeffbass/imagehub, and https://github.com/jeffbass/imagezmq from earlier.

I needed these (the latest OpenCV has a bug):

pip3 install pyyaml numpy virtualenv zmq imutils psutil picamera
pip3 install opencv-contrib-python==4.1.0.25

On the imagehub side, it finds /root/imagenode.yaml and sets up a folder.

Then on the imagenode side, it looks for a directory structure with imagenode/, imagezmq/, and imagenode.yaml in the parent folder. You replace the contents with the YAML examples in the tests folder.

Then, when it detects motion in the blue box, it takes pictures, which arrive in ~/imagehub_data/images/2020-04-12#

So this is a good start for various applications.
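
Under the hood, imagenode and imagehub talk over imagezmq’s REQ/REP image transport. A rough sketch of that send/receive pattern, following the imagezmq README style (the hub address here is a placeholder; sender and hub run on different machines):

# --- sender side (e.g. the Raspberry Pi running imagenode) ---
import socket
import cv2
import imagezmq

sender = imagezmq.ImageSender(connect_to="tcp://192.168.1.100:5555")
node_name = socket.gethostname()
cap = cv2.VideoCapture(0)
ok, frame = cap.read()
if ok:
    sender.send_image(node_name, frame)  # REQ/REP: blocks until the hub replies

# --- hub side (the machine running imagehub), as comments since it is a separate process ---
# import imagezmq
# image_hub = imagezmq.ImageHub()               # listens on tcp://*:5555 by default
# node_name, frame = image_hub.recv_image()     # returns (sender name, OpenCV image)
# image_hub.send_reply(b"OK")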

Categories
links

Links interlude

https://www.nature.com/articles/nature14422 and MAP-Elites algorithm: https://www.researchgate.net/publication/277323278_Robots_that_can_adapt_like_animals

https://github.com/benureau/recode/tree/master/cully2015

https://github.com/resibots/ite_v2

http://picbreeder.org/

http://www.cs.uvm.edu/~jbongard/papers/2014_PLOSCompBio_Auerbach.pdf

very cool: http://eplex.cs.ucf.edu/ecal13/demo/PCA.html?uid=192750

https://www.creativemachineslab.com/

http://eplex.cs.ucf.edu/

https://www.shadowrobot.com/products/modular-grasper/

https://www.coroma-project.eu/

Categories
AI/ML dev evolution

ONS variant of HyperNEAT

https://open.uct.ac.za/bitstream/handle/11427/27910/thesis_sci_2018_didi_sabre_z.pdf?sequence=1&isAllowed=y

The TD methods were further compared to variants of NEAT and HyperNEAT (that is, OS, ONS, NS, OGN and GNS) with and without behavior transfer. The results demonstrated that the ONS variant of HyperNEAT performs much better (with respect to effectiveness and efficiency) than both TD methods and all variants of NEAT. Specific evolutionary search methods to direct NE, such as behavior diversity maintenance and the hybrid approach, work most effectively at balancing exploration versus exploitation in the search space, more so than TD methods.

Evolutionary search approaches investigated were objective-based search (OS), novelty search (NS), genotypic diversity search (GNS), hybrid of objective and novelty search, and hybrid of objective-based and genotypic diversity maintenance search (ONS and OGN, respectively). In this thesis, three methodological features were explored to ascertain an appropriate combination that enables the evolution of high quality solutions based on effectiveness (task performance) and efficiency (speed of adaptation) of evolved behaviors. These features are as follows: first, direct versus indirect encoding neuro-evolution methods for collective behavior evolution (that is, NEAT and HyperNEAT, respectively); second, non-objective evolutionary search versus objective-based search approaches for guiding collective behavior evolution; third, neuro-evolution with collective behavior transfer.

“With behavior transfer” refers, more or less, to crossing over/mutating existing individuals rather than starting from scratch with new ones. Something like that.

https://github.com/sdidi/KeepawaySim

Categories
AI/ML CNNs Vision

Self Attention

https://attentionagent.github.io/ (“there is no conscious perception of the visual world without attention to it”)

http://papers.nips.cc/paper/8302-stand-alone-self-attention-in-vision-models

and on the difference between self-attention models and conv nets:

https://openreview.net/forum?id=HJlnC1rKPB

https://github.com/epfml/attention-cnn
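
For a reminder of the core operation these papers analyze, here is a bare single-head scaled dot-product self-attention over a flattened grid of features (plain numpy; the shapes are arbitrary):

# Minimal single-head scaled dot-product self-attention, plain numpy.
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
n, d = 64, 32                          # n positions (e.g. flattened patches), d channels
X = rng.normal(size=(n, d))

Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

attn = softmax(Q @ K.T / np.sqrt(d))   # (n, n) attention weights over all positions
out = attn @ V                         # each position becomes a weighted mix of all values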

Categories
AI/ML dev envs neuro simulation

OpenAI Gym MultiNEAT

Ok also, just saw this: https://gym.openai.com/evaluations/eval_a0YXWDc4SKeJjyTH7IrHBg/

It doesn’t work apparently, but could be salvaged into something. Possibly written by this guy: https://blog.otoro.net/


# Using ES-HyperNEAT to try to solve the Bipedal walker.
# This attempt was not successful. Adjustment of hyperparameters is likely needed.

# A neural network is trained using NeuroEvolution of Augmenting Topologies
# The idea is from the paper: "Evolving Neural Networks through Augmenting Topologies"
# This gist is using MultiNEAT (http://multineat.com/)

import logging
import numpy as np
import pickle

import gym

import MultiNEAT as NEAT

# NEAT setup
params = NEAT.Parameters()
params.PopulationSize = 200

params.DynamicCompatibility = True
params.CompatTreshold = 2.0
params.YoungAgeTreshold = 15
params.SpeciesMaxStagnation = 100
params.OldAgeTreshold = 35
params.MinSpecies = 5
params.MaxSpecies = 10
params.RouletteWheelSelection = False

params.MutateRemLinkProb = 0.02
params.RecurrentProb = 0
params.OverallMutationRate = 0.15
params.MutateAddLinkProb = 0.08
params.MutateAddNeuronProb = 0.01
params.MutateWeightsProb = 0.90
params.MaxWeight = 8.0
params.WeightMutationMaxPower = 0.2
params.WeightReplacementMaxPower = 1.0

params.MutateActivationAProb = 0.0
params.ActivationAMutationMaxPower = 0.5
params.MinActivationA = 0.05
params.MaxActivationA = 6.0

params.MutateNeuronActivationTypeProb = 0.03

params.ActivationFunction_SignedSigmoid_Prob = 0.0
params.ActivationFunction_UnsignedSigmoid_Prob = 0.0
params.ActivationFunction_Tanh_Prob = 1.0
params.ActivationFunction_TanhCubic_Prob = 0.0
params.ActivationFunction_SignedStep_Prob = 1.0
params.ActivationFunction_UnsignedStep_Prob = 0.0
params.ActivationFunction_SignedGauss_Prob = 1.0
params.ActivationFunction_UnsignedGauss_Prob = 0.0
params.ActivationFunction_Abs_Prob = 0.0
params.ActivationFunction_SignedSine_Prob = 1.0
params.ActivationFunction_UnsignedSine_Prob = 0.0
params.ActivationFunction_Linear_Prob = 1.0

params.DivisionThreshold = 0.5
params.VarianceThreshold = 0.03
params.BandThreshold = 0.3
params.InitialDepth = 2
params.MaxDepth = 3
params.IterationLevel = 1
params.Leo = False
params.GeometrySeed = False
params.LeoSeed = False
params.LeoThreshold = 0.3
params.CPPN_Bias = -1.0
params.Qtree_X = 0.0
params.Qtree_Y = 0.0
params.Width = 1.
params.Height = 1.
params.Elitism = 0.1

rng = NEAT.RNG()
rng.TimeSeed()

# substrate input coordinates: 24 inputs (matching BipedalWalker's observation size)
# laid out on two horizontal rows; outputs are 4 coordinates on the top row
input_coords = []

for i in range(0, 14):
    input_coords.append((-1. + (2. * i / 13.), -1., 0.))

for i in range(0, 10):
    input_coords.append((-1. + (2. * i / 9), -0.5, 0))

substrate = NEAT.Substrate(input_coords,
                           [],  # no fixed hidden nodes; ES-HyperNEAT places them itself
                           [(-1., 1., 0.), (-0.5, 1., 0.), (0.5, 1., 0.), (1., 1., 0.)])

substrate.m_allow_input_hidden_links = False
substrate.m_allow_input_output_links = False
substrate.m_allow_hidden_hidden_links = False
substrate.m_allow_hidden_output_links = False
substrate.m_allow_output_hidden_links = False
substrate.m_allow_output_output_links = False
substrate.m_allow_looped_hidden_links = True
substrate.m_allow_looped_output_links = False

# note: these override some of the flags set just above
substrate.m_allow_input_hidden_links = True
substrate.m_allow_input_output_links = False
substrate.m_allow_hidden_output_links = True
substrate.m_allow_hidden_hidden_links = True

substrate.m_hidden_nodes_activation = NEAT.ActivationFunction.SIGNED_SIGMOID
substrate.m_output_nodes_activation = NEAT.ActivationFunction.UNSIGNED_SIGMOID

substrate.m_with_distance = False

substrate.m_max_weight_and_bias = 8.0


def trainNetwork(env, seed):
    # Training parameters
    generationSize = 50
    episode_count = 10
    max_steps = 1000
    # Max reward for environments that reward 1 for each successful step (e.g. CartPole-v0)
    max_reward = episode_count * max_steps

    def evaluate(genome):
        net = NEAT.NeuralNetwork()
        genome.BuildESHyperNEATPhenotype(net, substrate, params)

        cum_reward = 0

        for i in range(episode_count):
            ob = env.reset()
            net.Flush()

            for j in range(max_steps):
                # get next action
                net.Input(ob)
                net.Activate()
                o = net.Output()
                action = np.clip(o,-1,1)
                ob, reward, done, _ = env.step(action)
                cum_reward += reward
                if done:
                    break

        return cum_reward

    # Create initial genome
    g = NEAT.Genome(0, 24, 0, 4, False, 
                    NEAT.ActivationFunction.TANH, NEAT.ActivationFunction.TANH, 0, params)
    pop = NEAT.Population(g, params, True, 1.0, seed)

    current_best = None

    for generation in range(generationSize):
        for i_episode, genome in enumerate(NEAT.GetGenomeList(pop)):
            reward = evaluate(genome)

            if reward == max_reward:
                return pickle.dumps(genome)

            genome.SetFitness(reward)

        print('Generation: {}, max fitness: {}'.format(generation,
                            max((x.GetFitness() for x in NEAT.GetGenomeList(pop)))))
        current_best = pickle.dumps(pop.GetBestGenome())
        pop.Epoch()


    return current_best

env_name = "BipedalWalker"

if __name__ == '__main__':
    # Test the algorithm multiple times
    for test_case in range(0, 1):
        # setup logger, environment and monitor
        logger = logging.getLogger()
        logger.setLevel(logging.INFO)
        env = gym.make("%s-v2" % env_name)
        outdir = "/tmp/neat-%s-results-%d" % (env_name, test_case)
        env.monitor.start(outdir, force=True)  # old gym Monitor API; newer gym uses gym.wrappers.Monitor

        # Train network
        learned = trainNetwork(env, test_case)

        # Test trained network on 1000 episodes
        learned_genome = pickle.loads(learned)
        net = NEAT.NeuralNetwork()
        learned_genome.BuildESHyperNEATPhenotype(net, substrate, params)

        episode_count = 1000
        max_steps = 1000

        for i in range(episode_count):
            ob = env.reset()
            net.Flush()

            for j in range(max_steps):
                # get next action
                net.Input(ob)
                net.Activate()
                o = net.Output()
                action = np.clip(o,-1,1)
                ob, reward, done, _ = env.step(action)
                if done:
                    break


        # Dump result info to disk
        env.monitor.close()
Categories
AI/ML evolution

HyperNEAT art

https://www.food4rhino.com/app/octopus#lg=1&slide=7

and this bonkers shape searcher software https://youtu.be/SlyXJEO76BI https://www.food4rhino.com/app/octopus