
YAMNet

After changing the normalisation code (because the resampled 16000 Hz audio was too soft), we have recordings from the microphone being classified by YAMNet on the Jetson.

Pretty cool. It’s detecting chicken sounds. I had to renormalize the recording volume between -1 and 1, as everything was originally detected as ‘Silence’.
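
The renormalisation itself is just a peak rescale; something like this (a sketch, assuming 'waveform' is the resampled float array):

import numpy as np

def normalise_waveform(waveform):
    # Rescale audio into the [-1, 1] range YAMNet expects
    waveform = waveform.astype(np.float32)
    peak = np.max(np.abs(waveform))
    if peak > 0:
        waveform = waveform / peak
    return waveform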

Currently I’m saving 5-second WAV files and processing them in Jupyter. But it’s not interactive in a real-time way, and the model would need further training to detect distress or other, more useful metrics.
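
The Jupyter side boils down to something like this (a sketch using the standard TF Hub YAMNet model; the filename is just an example):

import csv
import numpy as np
import librosa
import tensorflow_hub as hub

# Load the published YAMNet model from TF Hub
yamnet = hub.load('https://tfhub.dev/google/yamnet/1')

# The class names ship with the model as a CSV file
class_map = yamnet.class_map_path().numpy().decode('utf-8')
with open(class_map) as f:
    class_names = [row['display_name'] for row in csv.DictReader(f)]

# Load a 5 second recording as 16 kHz mono, renormalised to [-1, 1]
waveform, _ = librosa.load('recording.wav', sr=16000, mono=True)
peak = np.max(np.abs(waveform))
if peak > 0:
    waveform = waveform / peak

# scores has shape [frames, 521]: one score per class per ~0.48 s frame
scores, embeddings, spectrogram = yamnet(waveform)
top = np.argmax(scores.numpy().mean(axis=0))
print('Detected:', class_names[top])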

There are training instructions; however, there also appears to be a tutorial.

We’re unlikely to have time to implement the transfer learning for this project, to continue the chicken stress vocalisation work, but it definitely looks like the way to go about it.
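
For the record, the recipe in that tutorial is to freeze YAMNet and train a small classifier head on its 1024-dimensional embeddings. A rough sketch of the idea (the stressed / not-stressed classes are our use case, not the tutorial's, and 'embedding_dataset' is a placeholder for the labelled embeddings):

import tensorflow as tf
import tensorflow_hub as hub

yamnet = hub.load('https://tfhub.dev/google/yamnet/1')

def clip_embedding(waveform):
    # YAMNet returns one 1024-D embedding per ~0.48 s frame; average them
    _, embeddings, _ = yamnet(waveform)
    return tf.reduce_mean(embeddings, axis=0)

# Small trainable head on top of the frozen embeddings
classifier = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(1024,)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(2),   # stressed / not stressed
])
classifier.compile(
    optimizer='adam',
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'])

# classifier.fit(embedding_dataset, epochs=20)   # placeholder dataset of (embedding, label) pairs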

There are also some papers that used the VGG-11 architecture for this purpose, chopping recordings into overlapping 1-second segments for training and matching each segment to a label (stressed / not stressed). Note: if downloading the dataset, use the G-Drive link, not the figshare link, which is truncated.
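
Chopping a recording into overlapping windows is simple enough; a sketch (the 50% overlap is my assumption, the papers may use a different hop):

import numpy as np

def segment_waveform(waveform, sr=16000, window_s=1.0, hop_s=0.5):
    # Split audio into overlapping 1 second segments for training
    window = int(window_s * sr)
    hop = int(hop_s * sr)
    segments = [waveform[start:start + window]
                for start in range(0, len(waveform) - window + 1, hop)]
    return np.stack(segments) if segments else np.empty((0, window))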

Let’s try to get YAMNet running in real time.

After following the installation procedure for my ReSpeaker 2-Mic HAT, I’ve set up a Dockerfile with TF2 and the audio libraries, including librosa, in order to try out this real-time version. Getting it right was a real pain because of breaking changes in the ‘numba’ package.

FROM nvcr.io/nvidia/l4t-tensorflow:r32.6.1-tf2.5-py3

RUN apt-get update && apt-get install -y curl build-essential
RUN apt-get update && apt-get install -y libffi6 libffi-dev

RUN pip3 install -U Cython
RUN pip3 install -U pillow
RUN pip3 install -U numpy
RUN pip3 install -U scipy
RUN pip3 install -U matplotlib
RUN pip3 install -U PyWavelets
RUN pip3 install -U kiwisolver

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
            alsa-base \
            libasound2-dev \
            alsa-utils \
            portaudio19-dev \
            libsndfile1 \
            unzip \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean


RUN pip3 install soundfile pyaudio wave
RUN pip3 install tensorflow_hub
RUN pip3 install packaging
RUN pip3 install pyzmq==17.0.0
RUN pip3 install jupyterlab

RUN apt-get update && apt-get install -y libblas-dev \
        liblapack-dev \
        libatlas-base-dev \
        gfortran \
        protobuf-compiler \
        libprotoc-dev \
        llvm-9 \
        llvm-9-dev

RUN export LLVM_CONFIG=/usr/lib/llvm-9/bin/llvm-config && pip3 install llvmlite==0.36.0
RUN pip3 install --upgrade pip
RUN python3 -m pip install --user -U numba==0.53.1
RUN python3 -m pip install --user -U librosa==0.9.2
#otherwise matplotlib can't draw to gui
RUN apt-get update && apt-get install -y python3-tk


RUN jupyter lab --generate-config

RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"

EXPOSE 6006
EXPOSE 8888

CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
        echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
        echo "JupterLab logging location:  /var/log/jupyter.log  (inside the container)" && \
        /bin/bash
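
To build the image, from the directory containing the Dockerfile (the tag just needs to match whatever name you use in the run command below):

sudo docker build -t nano_tf2_yamnet .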


I'm running it with

sudo docker run -it --rm --runtime nvidia --network host --privileged=true -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /home/chicken/:/home/chicken nano_tf2_yamnet
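
Inside the container, the real-time loop is basically: grab a second of audio from the microphone with pyaudio, renormalise it, and push it through YAMNet. A sketch of that (the one-second chunk size and the default input device are assumptions; the ReSpeaker may need an explicit input_device_index):

import csv
import numpy as np
import pyaudio
import tensorflow_hub as hub

RATE = 16000      # YAMNet expects 16 kHz mono
CHUNK = RATE      # classify one second of audio at a time

yamnet = hub.load('https://tfhub.dev/google/yamnet/1')
with open(yamnet.class_map_path().numpy().decode('utf-8')) as f:
    class_names = [row['display_name'] for row in csv.DictReader(f)]

p = pyaudio.PyAudio()
stream = p.open(format=pyaudio.paInt16, channels=1, rate=RATE,
                input=True, frames_per_buffer=CHUNK)

try:
    while True:
        data = stream.read(CHUNK, exception_on_overflow=False)
        waveform = np.frombuffer(data, dtype=np.int16).astype(np.float32)
        peak = np.max(np.abs(waveform))
        if peak > 0:
            waveform = waveform / peak          # renormalise to [-1, 1]
        scores, _, _ = yamnet(waveform)
        top = np.argmax(scores.numpy().mean(axis=0))
        print(class_names[top])
finally:
    stream.stop_stream()
    stream.close()
    p.terminate()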