Categories
CNNs Vision

Visualize CNNs

https://github.com/fg91/visualizing-cnn-feature-maps


“There are two main ways to try to understand how a neural network recognizes a certain pattern. If you want to know what kind of pattern significantly activates a certain feature map, you could 1) try to find images in a dataset that result in a high average activation of this feature map, or you could 2) try to generate such a pattern by optimizing the pixel values in a random image. The latter idea was proposed by Erhan et al. 2009.”

from: https://towardsdatascience.com/how-to-visualize-convolutional-features-in-40-lines-of-code-70b7d87b0030


# Depends on the old fastai v0.7 API (vgg16, tfms_from_model, V, set_trainable)
# and the SaveFeatures forward-hook helper from the article.
import cv2
import numpy as np
import torch
import matplotlib.pyplot as plt
from fastai.conv_learner import *  # fastai v0.7, not the current fastai 2.x

class FilterVisualizer():
    def __init__(self, size=56, upscaling_steps=12, upscaling_factor=1.2):
        self.size, self.upscaling_steps, self.upscaling_factor = size, upscaling_steps, upscaling_factor
        self.model = vgg16(pre=True).cuda().eval()
        set_trainable(self.model, False)

    def visualize(self, layer, filter, lr=0.1, opt_steps=20, blur=None):
        sz = self.size
        img = np.uint8(np.random.uniform(150, 180, (sz, sz, 3)))/255  # generate random image
        activations = SaveFeatures(list(self.model.children())[layer])  # register hook

        for _ in range(self.upscaling_steps):  # scale the image up upscaling_steps times
            train_tfms, val_tfms = tfms_from_model(vgg16, sz)
            img_var = V(val_tfms(img)[None], requires_grad=True)  # convert image to Variable that requires grad
            optimizer = torch.optim.Adam([img_var], lr=lr, weight_decay=1e-6)
            for n in range(opt_steps):  # optimize pixel values for opt_steps times
                optimizer.zero_grad()
                self.model(img_var)
                loss = -activations.features[0, filter].mean()
                loss.backward()
                optimizer.step()
            img = val_tfms.denorm(img_var.data.cpu().numpy()[0].transpose(1,2,0))
            self.output = img
            sz = int(self.upscaling_factor * sz)  # calculate new image size
            img = cv2.resize(img, (sz, sz), interpolation = cv2.INTER_CUBIC)  # scale image up
            if blur is not None: img = cv2.blur(img,(blur,blur))  # blur image to reduce high frequency patterns
        self.save(layer, filter)
        activations.close()
        
    def save(self, layer, filter):
        plt.imsave("layer_"+str(layer)+"_filter_"+str(filter)+".jpg", np.clip(self.output, 0, 1))
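The class above relies on a `SaveFeatures` helper that isn't shown; in the article it is a small wrapper around a PyTorch forward hook. A minimal sketch (the name and usage match the code above; the article's exact implementation may differ slightly):

```python
class SaveFeatures:
    """Store the output of a layer on every forward pass via a hook."""
    def __init__(self, module):
        # register_forward_hook works on any torch.nn.Module
        self.hook = module.register_forward_hook(self.hook_fn)

    def hook_fn(self, module, input, output):
        # stash the layer's activations each time the model runs forward
        self.features = output

    def close(self):
        # remove the hook when done to avoid leaking references
        self.hook.remove()
```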

and use it like this:

layer = 40
filter = 265
FV = FilterVisualizer(size=56, upscaling_steps=12, upscaling_factor=1.2)
FV.visualize(layer, filter, blur=5)
Categories
chicken_research CNNs deep

Bird audio CNNs

https://besjournals.onlinelibrary.wiley.com/doi/full/10.1111/2041-210X.13103

The 2017 Bird Audio Detection Challenge: a contest for detecting bird calls with neural networks. The best entries fed mel spectrograms of the audio into convolutional neural networks.

The winning algorithm’s code is available: https://jobim.ofai.at/gitlab/gr/bird_audio_detection_challenge_2017/tree/master
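The mel-spectrogram front end those entries used can be sketched in plain NumPy (a simplified version assuming a Hann window and a power spectrogram; real pipelines typically use librosa or torchaudio instead):

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale."""
    def hz_to_mel(f): return 2595 * np.log10(1 + f / 700)
    def mel_to_hz(m): return 700 * (10 ** (m / 2595) - 1)
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        if c > l: fb[i - 1, l:c] = (np.arange(l, c) - l) / (c - l)  # rising edge
        if r > c: fb[i - 1, c:r] = (r - np.arange(c, r)) / (r - c)  # falling edge
    return fb

def mel_spectrogram(signal, sr=22050, n_fft=1024, hop=512, n_mels=64):
    """Windowed power spectrogram projected onto a mel filterbank."""
    n_frames = 1 + (len(signal) - n_fft) // hop
    window = np.hanning(n_fft)
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return mel_filterbank(n_mels, n_fft, sr) @ power.T  # (n_mels, n_frames)
```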

Categories
AI/ML CNNs Vision

Self Attention

https://attentionagent.github.io/

“there is no conscious perception of the visual world without attention to it”

http://papers.nips.cc/paper/8302-stand-alone-self-attention-in-vision-models

And on the difference between self-attention and convolutional layers:

https://openreview.net/forum?id=HJlnC1rKPB

https://github.com/epfml/attention-cnn
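The core contrast these papers explore: a convolution mixes each position with fixed, location-independent weights, while self-attention computes its mixing weights from the content itself. A minimal single-head sketch in NumPy (all names here are illustrative, not from any of the linked codebases):

```python
import numpy as np

def self_attention(x, Wq, Wk, Wv):
    """Single-head self-attention over a sequence of n tokens.

    x: (n, d) array, e.g. flattened image patches; Wq/Wk/Wv: (d, d_k)
    projections. Unlike a convolution's fixed kernel, the (n, n) mixing
    weights are recomputed from the input content every forward pass.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    scores = q @ k.T / np.sqrt(k.shape[1])        # scaled dot-product similarity
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over keys
    return weights @ v                            # content-weighted mixture
```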