After changing the normalisation code (the resampled 16 kHz audio was too soft), we have recordings from the microphone being classified by YAMNet on the Jetson.
Pretty cool: it’s detecting chicken sounds. I had to renormalise the recording volume to between -1 and 1, as everything was originally classified as ‘Silence’.
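The renormalisation step can be sketched as simple peak scaling; `normalize_audio` is a hypothetical helper name, but the idea matches what fixed the ‘Silence’ problem — YAMNet expects mono float samples in [-1, 1], and quiet input scores as silence:

```python
import numpy as np

def normalize_audio(samples):
    """Scale samples to peak at 1.0, so the waveform fills the
    [-1, 1] range YAMNet expects (quiet input reads as 'Silence')."""
    samples = np.asarray(samples, dtype=np.float32)
    peak = np.max(np.abs(samples))
    if peak == 0:
        return samples  # all-zero clip; nothing to scale
    return samples / peak
```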
Currently I’m saving 5-second WAV files and processing them in Jupyter. But it’s not really interactive in a real-time way, and the model would need further training to detect distress or other, more useful metrics.
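The per-file processing looks roughly like the sketch below (file name and the `top_classes` helper are mine, not from the original notebook). YAMNet returns per-frame scores, so to label a whole 5-second clip you average over time and take the highest-scoring classes; the model-loading and file-reading lines are commented out since they need the network and a recording:

```python
import numpy as np
# import soundfile as sf
# import tensorflow_hub as hub

def top_classes(scores, class_names, k=3):
    """Average YAMNet's per-frame class scores over time and
    return the top-k (label, mean score) pairs for the clip."""
    mean = scores.mean(axis=0)
    idx = np.argsort(mean)[::-1][:k]
    return [(class_names[i], float(mean[i])) for i in idx]

# yamnet = hub.load('https://tfhub.dev/google/yamnet/1')   # downloads the model
# wav, sr = sf.read('clip.wav', dtype='float32')           # hypothetical 5 s recording, 16 kHz mono
# scores, embeddings, spectrogram = yamnet(wav)
# print(top_classes(scores.numpy(), class_names))          # class_names from the model's CSV
```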
Training instructions are thin; however, there does appear to be a tutorial.
We’re unlikely to have time to implement the transfer learning and continue the chicken stress vocalisation work for this project, but it definitely looks like the way to go about it.
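For the record, the transfer-learning approach would look something like this sketch: keep YAMNet frozen, pool its 1024-dim embeddings into one vector per clip, and train only a small classifier head on top. The layer sizes and two-class (stressed / not stressed) head are assumptions, not a tested recipe:

```python
import tensorflow as tf

def extract_embedding(yamnet, wav_16khz_mono):
    """Average YAMNet's per-frame 1024-dim embeddings into one clip vector.
    `yamnet` is the hub model, e.g. hub.load('https://tfhub.dev/google/yamnet/1')."""
    scores, embeddings, spectrogram = yamnet(wav_16khz_mono)
    return tf.reduce_mean(embeddings, axis=0)

def build_classifier_head():
    """Small trainable head over frozen YAMNet embeddings (sizes are guesses)."""
    inputs = tf.keras.Input(shape=(1024,))
    x = tf.keras.layers.Dense(256, activation='relu')(inputs)
    outputs = tf.keras.layers.Dense(2)(x)  # stressed / not stressed logits
    model = tf.keras.Model(inputs, outputs)
    model.compile(
        optimizer='adam',
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    return model
```

Training would then just be `head.fit(embeddings, labels)` over the labelled segments.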
There are also some papers that used the VGG-11 architecture for this purpose, chopping recordings into overlapping 1-second segments for training, with each segment matched to a label (stressed / not stressed). Note: if downloading the dataset, use the G-Drive link, not the figshare link, which is truncated.
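The segmentation itself is straightforward; here is a minimal sketch assuming 16 kHz audio and 50% overlap (the papers say overlapping 1-second segments but the hop size here is my assumption):

```python
import numpy as np

def segment_overlapping(samples, sr=16000, seg_s=1.0, hop_s=0.5):
    """Chop a recording into overlapping fixed-length windows for training.
    hop_s=0.5 gives 50% overlap between consecutive 1 s segments (assumed)."""
    seg = int(seg_s * sr)
    hop = int(hop_s * sr)
    return [samples[i:i + seg]
            for i in range(0, len(samples) - seg + 1, hop)]
```

Each returned window would then inherit the clip's label (stressed / not stressed).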
Let’s try to get YAMNet running in real time.
After following the installation procedure for my ReSpeaker 2-Mic HAT, I’ve set up a Dockerfile with TF2 and the audio libraries, including librosa, in order to try out this real-time version. Getting this right was a real pain because of breaking changes in the numba package.
FROM nvcr.io/nvidia/l4t-tensorflow:r32.6.1-tf2.5-py3

RUN apt-get update && apt-get install -y curl build-essential
RUN apt-get update && apt-get install -y libffi6 libffi-dev

RUN pip3 install -U Cython
RUN pip3 install -U pillow
RUN pip3 install -U numpy
RUN pip3 install -U scipy
RUN pip3 install -U matplotlib
RUN pip3 install -U PyWavelets
RUN pip3 install -U kiwisolver

RUN apt-get update && \
    apt-get install -y --no-install-recommends \
    alsa-base \
    libasound2-dev \
    alsa-utils \
    portaudio19-dev \
    libsndfile1 \
    unzip \
    && rm -rf /var/lib/apt/lists/* \
    && apt-get clean

RUN pip3 install soundfile pyaudio wave
RUN pip3 install tensorflow_hub
RUN pip3 install packaging
RUN pip3 install pyzmq==17.0.0
RUN pip3 install jupyterlab

RUN apt-get update && apt-get install -y libblas-dev \
    liblapack-dev \
    libatlas-base-dev \
    gfortran \
    protobuf-compiler \
    libprotoc-dev \
    llvm-9 \
    llvm-9-dev

RUN export LLVM_CONFIG=/usr/lib/llvm-9/bin/llvm-config && pip3 install llvmlite==0.36.0
RUN pip3 install --upgrade pip
RUN python3 -m pip install --user -U numba==0.53.1
RUN python3 -m pip install --user -U librosa==0.9.2

# otherwise matplotlib can't draw to gui
RUN apt-get update && apt-get install -y python3-tk

RUN jupyter lab --generate-config
RUN python3 -c "from notebook.auth.security import set_password; set_password('nvidia', '/root/.jupyter/jupyter_notebook_config.json')"

EXPOSE 6006
EXPOSE 8888

CMD /bin/bash -c "jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &> /var/log/jupyter.log" & \
    echo "allow 10 sec for JupyterLab to start @ http://$(hostname -I | cut -d' ' -f1):8888 (password nvidia)" && \
    echo "JupyterLab logging location: /var/log/jupyter.log (inside the container)" && \
    /bin/bash

I'm running it with:

sudo docker run -it --rm --runtime nvidia --network host --privileged=true -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix -v /home/chicken/:/home/chicken nano_tf2_yamnet
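Inside that container, the real-time loop would look roughly like the sketch below. The chunk size and 1-second sliding window are assumptions, and the pyaudio capture loop is commented out since it needs the ReSpeaker mic; only the int16-to-float conversion is concrete:

```python
import numpy as np
# import pyaudio  # used by the capture loop below; needs the mic

RATE = 16000      # YAMNet's expected sample rate
CHUNK = 1024      # frames per pyaudio read (assumed)
WINDOW = RATE     # feed the model ~1 s of audio at a time

def int16_bytes_to_float(buf):
    """Convert a raw int16 pyaudio buffer to float32 in [-1, 1]."""
    return np.frombuffer(buf, dtype=np.int16).astype(np.float32) / 32768.0

# Sketch of the capture loop (not run here; needs a microphone):
# pa = pyaudio.PyAudio()
# stream = pa.open(format=pyaudio.paInt16, channels=1, rate=RATE,
#                  input=True, frames_per_buffer=CHUNK)
# ring = np.zeros(WINDOW, dtype=np.float32)
# while True:
#     chunk = int16_bytes_to_float(stream.read(CHUNK))
#     ring = np.concatenate([ring[len(chunk):], chunk])   # slide the 1 s window
#     scores, embeddings, spectrogram = yamnet(ring)      # classify latest second
```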