Categories
AI/ML CNNs dev Locomotion OpenCV robots UI Vision

Realsense Depth and TensorRT object detection

A seemingly straightforward idea for robot control involves using depth, and object detection, to form a rough model of the environment.

After failed attempts to create our own stereo camera using two monocular cameras, we eventually decided to buy a commercial product instead, The Intel depth camera, D455.

After a first round of running a COCO trained MobileSSDv2 object detection network in Tensorflow 2 Lite, on the colour images obtained from the realsense camera, on the Jetson Nano, the results were just barely acceptable (~2 FPS) for a localhost stream, and totally unacceptable (~0.25 FPS) served as JPEG over HTTP, to a browser on the network.

Looking at the options, the only feasible solution was to redo the network using TensorRT, the NVIDIA-specific, quantized (16 bit on the Nano, 8 bit on the NX/AGX) neural network framework. The other solution involved investigating options other than simple JPEG compression over HTTP, such as RTSP and WebRTC.

The difficult part was setting up the environment. We used the NVIDIA detectnet code, adapted to take the realsense camera images as input, and to display the distance to the objects. An outdated example was found at CAVEDU robotics blog/github. Fixed up below.

#!/usr/bin/python3



import jetson_inference
import jetson_utils
import argparse
import sys
import os
import cv2
import re
import numpy as np
import io
import time
import json
import random
import pyrealsense2 as rs
from jetson_inference import detectNet
from jetson_utils import videoSource, videoOutput, logUsage, cudaFromNumpy, cudaAllocMapped, cudaConvertColor

parser = argparse.ArgumentParser(description="Locate objects in a live camera stream using an object detection DNN.",
formatter_class=argparse.RawTextHelpFormatter, epilog=jetson_utils.logUsage())
parser.add_argument("--network", type=str, default="ssd-mobilenet-v2",
help="pre-trained model to load (see below for options)")
parser.add_argument("--threshold", type=float, default=0.5,
help="minimum detection threshold to use")
parser.add_argument("--width", type=int, default=640,
help="set width for image")
parser.add_argument("--height", type=int, default=480,
help="set height for image")
opt = parser.parse_known_args()[0]

# load the object detection network
net = detectNet(opt.network, sys.argv, opt.threshold)

# Configure depth and color streams
pipeline = rs.pipeline()
config = rs.config()
config.enable_stream(rs.stream.depth, opt.width, opt.height, rs.format.z16, 30)
config.enable_stream(rs.stream.color, opt.width, opt.height, rs.format.bgr8, 30)
# Start streaming
pipeline.start(config)


press_key = 0
while (press_key==0):
	# Wait for a coherent pair of frames: depth and color
	frames = pipeline.wait_for_frames()
	depth_frame = frames.get_depth_frame()
	color_frame = frames.get_color_frame()
	if not depth_frame or not color_frame:
		continue
	# Convert images to numpy arrays
	depth_image = np.asanyarray(depth_frame.get_data())
	show_img = np.asanyarray(color_frame.get_data())
	
	# convert to CUDA (cv2 images are numpy arrays, in BGR format)
	bgr_img = cudaFromNumpy(show_img, isBGR=True)
	# convert from BGR -> RGB
	img = cudaAllocMapped(width=bgr_img.width,height=bgr_img.height,format='rgb8')
	cudaConvertColor(bgr_img, img)

	# detect objects in the image (with overlay)
	detections = net.Detect(img)

	for num in range(len(detections)) :
		score = round(detections[num].Confidence,2)
		box_top=int(detections[num].Top)
		box_left=int(detections[num].Left)
		box_bottom=int(detections[num].Bottom)
		box_right=int(detections[num].Right)
		box_center=detections[num].Center
		label_name = net.GetClassDesc(detections[num].ClassID)

		point_distance=0.0
		for i in range (10):
			point_distance = point_distance + depth_frame.get_distance(int(box_center[0]),int(box_center[1]))

		point_distance = np.round(point_distance / 10, 3)
		distance_text = str(point_distance) + 'm'
		cv2.rectangle(show_img,(box_left,box_top),(box_right,box_bottom),(255,0,0),2)
		cv2.line(show_img,
			(int(box_center[0])-10, int(box_center[1])),
			(int(box_center[0]+10), int(box_center[1])),
			(0, 255, 255), 3)
		cv2.line(show_img,
			(int(box_center[0]), int(box_center[1]-10)),
			(int(box_center[0]), int(box_center[1]+10)),
			(0, 255, 255), 3)
		cv2.putText(show_img,
			label_name + ' ' + distance_text,
			(box_left+5,box_top+20),cv2.FONT_HERSHEY_SIMPLEX,0.4,
			(0,255,255),1,cv2.LINE_AA)

	cv2.putText(show_img,
		"{:.0f} FPS".format(net.GetNetworkFPS()),
		(int(opt.width*0.8), int(opt.height*0.1)),
		cv2.FONT_HERSHEY_SIMPLEX,1,
		(0,255,255),2,cv2.LINE_AA)


	display = cv2.resize(show_img,(int(opt.width*1.5),int(opt.height*1.5)))
	cv2.imshow('Detecting...',display)
	keyValue=cv2.waitKey(1)
	if keyValue & 0xFF == ord('q'):
		press_key=1


cv2.destroyAllWindows()
pipeline.stop()

Assuming you have a good cmake version and cuda is available (if nvcc doesn’t work, you need to configure linker paths, check this link)… (if you have a cmake version around ~ 3.22-3.24 or so, you need an older one)… prerequisite sudo apt-get install libssl-dev also required.

The hard part was actually setting up the Realsense python bindings.

Clone the repo…

git clone https://github.com/IntelRealSense/librealsense.git

The trick being, to request the python bindings, and cuda, during the cmake phase. Note that often, none of this works. Some tips include…

sudo apt-get install xorg-dev libglu1-mesa-dev

and changing PYTHON to Python

mkdir build
cd build
cmake ../ -DBUILD_PYTHON_BINDINGS:bool=true -DPYTHON_EXECUTABLE=/usr/bin/python3 -DCMAKE_BUILD_TYPE=release -DBUILD_EXAMPLES=true -DBUILD_GRAPHICAL_EXAMPLES=true -DBUILD_WITH_CUDA:bool=true

The above worked on Jetpack 4.6.1, while the below worked on Jetpack 5.0.2

cmake ../ -DBUILD_PYTHON_BINDINGS:bool=true -DPython_EXECUTABLE=/usr/bin/python3.8 -DCMAKE_BUILD_TYPE=release -DBUILD_EXAMPLES=true -DBUILD_GRAPHICAL_EXAMPLES=true -DBUILD_WITH_CUDA:bool=true -DPYTHON_INCLUDE_DIRS=/usr/include/python3.8 -DPython_LIBRARIES=/usr/lib/aarch64-linux-gnu/libpython3.8.so

(and sudo make install)
Update the python path

export PYTHONPATH=$PYTHONPATH:/usr/local/lib
(or a specific python if you have more than one)
if installed in /usr/lib, change accordingly

Check that the folder is in the correct location (it isn't, after following official instructions).

./usr/local/lib/python3.6/dist-packages/pyrealsense2/

Check that the shared object files (.so) are in the right place: 

chicken@chicken:/usr/local/lib$ ls
cmake       libjetson-inference.so  librealsense2-gl.so.2.50    librealsense2.so.2.50    pkgconfig
libfw.a     libjetson-utils.so      librealsense2-gl.so.2.50.0  librealsense2.so.2.50.0  python2.7
libglfw3.a  librealsense2-gl.so     librealsense2.so            librealsense-file.a      python3.6


If it can't find 'pipeline', it means you need to copy the missing __init__.py file.

sudo cp ./home/chicken/librealsense/wrappers/python/pyrealsense2/__init__.py ./usr/local/lib/python3.6/dist-packages/pyrealsense2/

Some extra things to do, 
sudo cp 99-realsense-libusb.rules  /etc/udev/rules.d/

Eventually, I am able to run the inference on the realsense camera, at an apparent 25 FPS, on the localhost, drawing to an OpenGL window.

I also developed a Dockerfile, for the purpose, which benefits from an updated Pytorch version, but various issues were encountered, making a bare-metal install far simpler, ultimately. Note that building jetson-inference, and Realsense SDK on the Nano require increasing your swap size, beyond the 2GB standard. Otherwise, the Jetson freezes once memory paging leads to swap death.

Anyway, since the objective is remote human viewing, (while providing depth information for the robot to use), the next step will require some more tests, to find a suitable option.

The main blocker is the power usage limitations on the Jetson Nano. I can’t seem to run Wifi and the camera at the same time. According to the tegrastats utility, the POM_5V_IN usage goes over the provided 4A, under basic usage. There are notes saying that 3A can be provided to 2 of the 5V GPIO pins, in order to get 6A total input. That might end up being necessary.

Initial investigation into serving RTSP resulted in inferior, and compressed results compared to a simple python server streaming image by image. The next investigation will be into WebRTC options, which are supposedly the current state of the art, for browser based video streaming. I tried aiortc, and momo, so far, both failed on the Nano.

I’ve decided to try on the Xavier NX, too, just to replicate the experiment, and see how things change. The Xavier has some higher wattage settings, and the wifi is internal, so worth a try. Also, upgraded to Jetpack 5.0.2, which was a gamble. Thought surely it would be better than upgrading to a 5.0.1 dev preview, but none of their official products support 5.0.2 yet, so there will likely be much pain involved. On the plus side, python 3.8 is standard, so some libraries are back on the menu.

On the Xavier, we’re getting 80 FPS, compared to 25 FPS on the Nano. Quite an upgrade. Also, able to run wifi and realsense at the same time.

Looks like a success. Getting multiple frames per second with about a second of lag over the network.

Categories
dev envs Linux OpenCV

Installing OpenCV on Jetson

I have the Jetson NX, and I tried the last couple of days to install OpenCV, and am still fighting it. But we’re going to give it a few rounds.

Let’s see, so I used DustyNV’s Dockerfile with an OpenCV setup for 4.4 or 4.5.

But the build dies, or is still missing libraries. There’s a bunch of them, and as I’m learning, everything is a linker issue. Everything. sudo ldconfig.

Here’s a 2019 quora answer to “What is sudo ldconfig in linux?”

“ldconfig updates the cache for the linker in a UNIX environment with libraries found in the paths specified in “/etc/ld.so.conf”. sudo executes it with superuser rights so that it can write to “/etc/ld.so.cache”.

You usually use this if you get errors about some dynamically linked libraries not being found when starting a program although they are actually present on the system. You might need to add their paths to “/etc/ld.so.conf” first, though.” – Marcel Noe

So taking my own advice, let’s see:

chicken@chicken:/etc/ld.so.conf.d$ find | xargs cat $1
cat: .: Is a directory
/opt/nvidia/vpi1/lib64
/usr/local/cuda-10.2/targets/aarch64-linux/lib
# Multiarch support
/usr/local/lib/aarch64-linux-gnu
/lib/aarch64-linux-gnu
/usr/lib/aarch64-linux-gnu
/usr/lib/aarch64-linux-gnu/libfakeroot
# libc default configuration
/usr/local/lib
/usr/lib/aarch64-linux-gnu/tegra
/usr/lib/aarch64-linux-gnu/fakechroot
/usr/lib/aarch64-linux-gnu/tegra-egl
/usr/lib/aarch64-linux-gnu/tegra

Ok. On our host (Jetson), let’s see if we can install it, or access it. It’s Jetpack 4.6.1 so it should have it installed already.

ImportError: libblas.so.3: cannot open shared object file: No such file or directory

cd /usr/lib/aarch64-linux-gnu/
ls -l liblas.so*

libblas.so -> /etc/alternatives/libblas.so-aarch64-linux-gnu

cd /etc/alternatives
ls -l liblas.so*

libblas.so.3-aarch64-linux-gnu -> /usr/lib/aarch64-linux-gnu/atlas/libblas.so.3
libblas.so-aarch64-linux-gnu -> /usr/lib/aarch64-linux-gnu/atlas/libblas.so

Let’s see that location…

chicken@chicken:/usr/lib/aarch64-linux-gnu/atlas$ ls -l
total 22652
libblas.a
libblas.so -> libblas.so.3.10.3
libblas.so.3 -> libblas.so.3.10.3
libblas.so.3.10.3
liblapack.a
liblapack.so -> liblapack.so.3.10.3
liblapack.so.3 -> liblapack.so.3.10.3
liblapack.so.3.10.3

And those are shared objects.

So why do we get ‘libblas.so.3: cannot open shared object file: No such file or directory?’

So let’s try

sudo apt-get install libopenblas-dev liblapack-dev libatlas-base-dev gfortran


Sounds promising. Ha it worked.

chicken@chicken:/usr/lib/aarch64-linux-gnu/atlas$ python3
Python 3.6.9 (default, Dec  8 2021, 21:08:43) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> 

Now let’s try fix DustyNV’s Dockerfile. Oops right, it takes forever to build things, or even to download and install them. So try not to change things early on in the install. So besides, Dusty’s setup already has these being installed. So it’s not that it’s not there. It’s some linking issue.

Ok I start up the NV docker and try import cv2, but

admin@chicken:/workspaces/isaac_ros-dev$ python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/cv2/__init__.py", line 96, in <module>
    bootstrap()
  File "/usr/local/lib/python3.6/dist-packages/cv2/__init__.py", line 86, in bootstrap
    import cv2
ImportError: libtesseract.so.4: cannot open shared object file: No such file or directory

So where is that file?

admin@chicken:/usr/lib/aarch64-linux-gnu$ find | grep tess | xargs ls -l $1
-rw-r--r-- 1 root root 6892600 Apr  7  2018 ./libtesseract.a
lrwxrwxrwx 1 root root      21 Apr  7  2018 ./libtesseract.so -> libtesseract.so.4.0.0
-rw-r--r-- 1 root root     481 Apr  7  2018 ./pkgconfig/tesseract.pc

Well indeed, that plainly doesn’t exist?

admin@chicken:/$ sudo apt-get install libtesseract-dev
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  libarchive13 librhash0 libuv1
Use 'sudo apt autoremove' to remove them.
The following additional packages will be installed:
  libleptonica-dev
The following NEW packages will be installed:
  libleptonica-dev libtesseract-dev
0 upgraded, 2 newly installed, 0 to remove and 131 not upgraded.
Need to get 2,666 kB of archives.
After this operation, 14.1 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://ports.ubuntu.com/ubuntu-ports bionic/universe arm64 libleptonica-dev arm64 1.75.3-3 [1,251 kB]
Get:2 http://ports.ubuntu.com/ubuntu-ports bionic/universe arm64 libtesseract-dev arm64 4.00~git2288-10f4998a-2 [1,415 kB]
Fetched 2,666 kB in 3s (842 kB/s)           
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package libleptonica-dev.
dpkg: warning: files list file for package 'libcufft-10-2' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'cuda-cudart-10-2' missing; assuming package has no files currently installed
(Reading database ... 97997 files and directories currently installed.)
Preparing to unpack .../libleptonica-dev_1.75.3-3_arm64.deb ...
Unpacking libleptonica-dev (1.75.3-3) ...
Selecting previously unselected package libtesseract-dev.
Preparing to unpack .../libtesseract-dev_4.00~git2288-10f4998a-2_arm64.deb ...
Unpacking libtesseract-dev (4.00~git2288-10f4998a-2) ...
Setting up libleptonica-dev (1.75.3-3) ...
Setting up libtesseract-dev (4.00~git2288-10f4998a-2) ...

ImportError: libtesseract.so.4: cannot open shared object file: No such file or directory

admin@chicken:/$ echo $LD_LIBRARY_PATH 
/opt/ros/foxy/install/opt/yaml_cpp_vendor/lib:/opt/ros/foxy/install/lib:/usr/lib/aarch64-linux-gnu/tegra-egl:/usr/local/cuda-10.2/targets/aarch64-linux/lib:/usr/lib/aarch64-linux-gnu/tegra:/opt/nvidia/vpi1/lib64:/usr/local/cuda-10.2/targets/aarch64-linux/lib::/opt/tritonserver/lib

So this sounds like a linker issue to me. We tell the linker where things are, it finds them.

admin@chicken:/$ export LD_LIBRARY_PATH=/usr/lib/aarch64-linux-gnu/atlas/:/usr/lib/aarch64-linux-gnu:/usr/lib/aarch64-linux-gnu/lapack:$LD_LIBRARY_PATH

admin@chicken:/$ sudo ldconfig

Ok, would be bad to have too much hope now. Let’s see… no, of course it didn’t work.

So let’s see what libleptonica and libtesseract-dev set up

./usr/lib/aarch64-linux-gnu/libtesseract.so
./usr/lib/aarch64-linux-gnu/libtesseract.a
./usr/lib/aarch64-linux-gnu/pkgconfig/tesseract.pc

And it wants 

admin@chicken:/$ ls -l ./usr/lib/aarch64-linux-gnu/libtesseract.so
lrwxrwxrwx 1 root root 21 Apr  7  2018 ./usr/lib/aarch64-linux-gnu/libtesseract.so -> libtesseract.so.4.0.0

and yeah it's installed.  

Start-Date: 2022-04-15  14:03:03
Commandline: apt-get install libtesseract-dev
Requested-By: admin (1000)
Install: libleptonica-dev:arm64 (1.75.3-3, automatic), libtesseract-dev:arm64 (4.00~git2288-10f4998a-2)
End-Date: 2022-04-15  14:03:06

This guy has a smart idea, to install them, which is pretty clever. But I tried that already, and tesseract’s build failed, of course. Then it complains about undefined references to jpeg,png,TIFF,zlib,etc. Hmm. All that shit is installed.

/usr/lib/gcc/aarch64-linux-gnu/8/../../../aarch64-linux-gnu/liblept.a(libversions.o): In function `getImagelibVersions':
(.text+0x98): undefined reference to `jpeg_std_error'
(.text+0x158): undefined reference to `png_get_libpng_ver'
(.text+0x184): undefined reference to `TIFFGetVersion'
(.text+0x1f0): undefined reference to `zlibVersion'
(.text+0x21c): undefined reference to `WebPGetEncoderVersion'
(.text+0x26c): undefined reference to `opj_version'

But so here’s the evidence: cv2 is looking for libtesseract.so.4, which doesn’t exist at all. And even if we symlinked it to point to the libtesseract.so file, that just links to libtesseract.so.4.0.0 which is empty.

Ah. Ok I had to sudo apt-get install libtesseract-dev on the Jetson host, not inside the docker!!. Hmm. Right. Cause I’m sharing most of the libs on the host anyway. It’s gotta be on the host.

admin@chicken:/usr/lib/aarch64-linux-gnu$ ls -l *tess*
-rw-r--r-- 1 root root 6892600 Apr  7  2018 libtesseract.a
lrwxrwxrwx 1 root root      21 Apr  7  2018 libtesseract.so -> libtesseract.so.4.0.0
lrwxrwxrwx 1 root root      21 Apr  7  2018 libtesseract.so.4 -> libtesseract.so.4.0.0
-rw-r--r-- 1 root root 3083888 Apr  7  2018 libtesseract.so.4.0.0



admin@chicken:/usr/lib/aarch64-linux-gnu$ python3
Python 3.6.9 (default, Jan 26 2021, 15:33:00) 
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> exit()


Success.

So now back to earlier, we were trying to run jupyter lab, to try run the camera calibration code again. I added the installation to the dockerfile. So this command starts it up at http://chicken:8888/lab (or name of your computer).

jupyter lab --ip 0.0.0.0 --port 8888 --allow-root &
Needed matplotlib, so just did quick macro install:

%pip install matplotlib

ModuleNotFoundError: No module named 'matplotlib'

Note: you may need to restart the kernel to use updated packages.

K... restart kernel.  Kernel -> Restart.

Ok now I’m going to try calibrate the stereo cameras, since OpenCV is back.

Seems after some successes, the cameras are not creating capture sessions anymore, even from the host. Let’s reboot.

Alright, new topic. Calibrating stereo cameras.

Categories
AI/ML CNNs dev institutes OpenCV Vision

Detectron2

Ran through the nice working jupyter notebook https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=OpLg_MAQGPUT and produced this video

It is the Mask R-CNN algorithm from matterport, ported over by facebook labs, and better maintained. It was forked and fixed up for tourists.

We can train it on the robot eye view camera, maybe train it on google images of copyleft chickens and eggs.

I think this looks great, for endowing the robot with a basic “recognition” of the features of classes it’s been exposed to.

https://github.com/facebookresearch/detectron2/tree/master/projects

https://detectron2.readthedocs.io/tutorials/extend.html

Seems I was oblivious to Facebook AI but of course they hire very smart people. I’d sell my soul for $240k/yr too. It is super nice to get a working Jupyter Notebook. Thank you. https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/

Here are the other FB project using detectron2, copy pasted:

Projects by Facebook

Note that these are research projects, and therefore may not have the same level of support or stability as detectron2.

External Projects

External projects in the community that use detectron2:

Also, more generally, https://ai.facebook.com/research/#recent-projects

Errors encountered while attempting to install https://detectron2.readthedocs.io/tutorials/getting_started.html

File "demo.py", line 8, in
import tqdm
ImportError: No module named tqdm

pip3 uninstall tqdm
pip3 install tqdm

Ok so…

python3 -m pip install -e .

python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

Requires pyyaml>=5.1

ok

pip install pyyaml==5.1
 Successfully built pyyaml
Installing collected packages: pyyaml
Attempting uninstall: pyyaml
Found existing installation: PyYAML 3.12
ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

pip3 install --ignore-installed PyYAML
Successfully installed PyYAML-5.1

Next error...

ModuleNotFoundError: No module named 'torchvision'

pip install torchvision

Next error...

AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx


ok

python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu


[08/17 20:53:11 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=None, opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl', 'MODEL.DEVICE', 'cpu'], output=None, video_input=None, webcam=True)
[08/17 20:53:12 fvcore.common.checkpoint]: Loading checkpoint from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:53:12 fvcore.common.file_io]: Downloading https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
[08/17 20:53:12 fvcore.common.download]: Downloading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
model_final_f10217.pkl: 178MB [01:26, 2.05MB/s]
[08/17 20:54:39 fvcore.common.download]: Successfully downloaded /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl. 177841981 bytes.
[08/17 20:54:39 fvcore.common.file_io]: URL https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl cached in /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:54:39 fvcore.common.checkpoint]: Reading a file from 'Detectron2 Model Zoo'
0it [00:00, ?it/s]/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
0it [00:06, ?it/s]
Traceback (most recent call last):
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'


Ok...

pip install opencv-python

Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (4.2.0.34)

Looks like 4.3.0 vs 4.2.0.34 kinda thing


sudo apt-get install libopencv-*


nope...

/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)


def nonzero_tuple(x):
"""
A 'as_tuple=True' version of torch.nonzero to support torchscript.
because of https://github.com/pytorch/pytorch/issues/38718
"""
if x.dim() == 0:
return x.unsqueeze(0).nonzero().unbind(1)
return x.nonzero(as_tuple=True).unbind(1)

AttributeError: 'tuple' object has no attribute 'unbind'


https://github.com/pytorch/pytorch/issues/38718

FFS. Why does nothing ever fucking work ?
pytorch 1.6:
"putting 1.6.0 milestone for now; this isn't the worst, but it's a pretty bad user experience."

Yeah no shit.

let's try...

return x.nonzero(as_tuple=False).unbind(1)

Ok next error same

/opt/detectron2/detectron2/modeling/roi_heads/fast_rcnn.py:111


Ok... back to this error (after adding as_tuple=False twice)


 File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'

Decided to check if maybe this is a conda vs pip thing. Like maybe I just need to install the conda version instead?

But it looks like a GTK+ 2.x isn’t installed. Seems I installed it using pip, i.e. pip install opencv-contrib-python and that isn’t built with gtk+2.x. I can also use qt as the graphical interface.

GTK supposedly uses more memory because GTK provides more functionality. Qt does less and uses less memory. If that is your logic, then you should also look at Aura and the many other user interface libraries providing less functionality.” (link )

https://stackoverflow.com/questions/14655969/opencv-error-the-function-is-not-implemented

https://askubuntu.com/questions/913241/error-in-executing-opencv-in-ubuntu

So let’s make a whole new Chapter, because we’re installing OpenCV again! (Why? Because I want to try run the detectron2 demo.py file.)

pip3 uninstall opencv-python
pip3 uninstall opencv-contrib-python 

(or sudo apt-get remove ___)

and afterwards build the opencv package from source code from github.

git clone https://github.com/opencv/opencv.git

cd ~/opencv

mkdir release

cd release

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_GTK=ON -D WITH_OPENGL=ON ..

make

sudo make install

ok… pls…

python3 demo.py –config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml –webcam –opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu

sweet jaysus finally.

Here’s an image of the network from a medium article on RCNN: https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd

Image for post
Categories
AI/ML CNNs OpenCV Vision

Mask R-CNN

Image for post

Paper: https://arxiv.org/pdf/1703.06870.pdf

FB really likes detecting things. I went with their PyTorch version. The matterport version didn’t work out of the box, so went with FB’s code to try image segmentation.

Caffe2 version: https://github.com/facebookresearch/Detectron

PyTorch version: https://github.com/facebookresearch/Detectron2

Matterport’s version: https://github.com/matterport/Mask_RCNN

Deep Learning based Image Segmentation with OpenCV: https://www.pyimagesearch.com/2018/11/26/instance-segmentation-with-opencv/

https://engineering.matterport.com/splash-of-color-instance-segmentation-with-mask-r-cnn-and-tensorflow-7c761e238b46

Image for post

Also Watershed algorithm is available in OpenCV:

Watershed: http://www.cmm.mines-paristech.fr/~beucher/wtshed.html

Result

Segmenting an image by the watershed transformation is therefore a two-step process:

* Finding the markers and the segmentation criterion (the criterion or function which will be used to split the regions – it is most often the contrast or gradient, but not necessarily).

* Performing a marker-controlled watershed with these two elements.

https://opencv-python-tutroals.readthedocs.io/en/latest/py_tutorials/py_imgproc/py_watershed/py_watershed.html

Categories
OpenCV Vision

Installing OpenCV on computer

OpenCV will be needed, for the robot to make sense of the camera input. We’ll have the camera on a raspberry pi of some sort.

pip install opencv-contrib-python

Someone made a useful website on OpenCV here https://www.pyimagesearch.com/

root@chrx:/opt/imagezmq/tests# python3
Python 3.6.9 (default, Nov  7 2019, 10:44:02) 
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
>>> cv2.__version
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'cv2.cv2' has no attribute '__version'
>>> cv2.__version__
'4.2.0'

Just gotta make sure it uses python3, not python2.

We’ll run it on the raspberry pi (https://www.pyimagesearch.com/2018/09/19/pip-install-opencv/) eventually, but for now, just testing core features on GalliumOS. It’s also required by imagezmq (https://github.com/jeffbass/imagezmq)

The robot has to use its sensors to find chickens, eggs, and decide whether to walk or turn. Stimuli to trigger actions.

pip install zmq

Might also need ‘pip install imutils’

Copying imagezmq.py to the tests folder cause I don’t know shit about python and how to import it from the other folder. So here’s the server:

python3 test_1_receive_images.py

import sys
import cv2
import imagezmq

image_hub = imagezmq.ImageHub()
while True:  # press Ctrl-C to stop image display program
    image_name, image = image_hub.recv_image()
    cv2.imshow(image_name, image)
    cv2.waitKey(1)  # wait until a key is pressed
    image_hub.send_reply(b'OK')

And the client program:

python3 test_1_send_images.py


import sys
import time
import numpy as np
import cv2
import imagezmq

# Create 2 different test images to send
# A green square on a black background
# A red square on a black background

sender = imagezmq.ImageSender()
i = 0
image_window_name = 'From Sender'
while True:  # press Ctrl-C to stop image sending program
    # Increment a counter and print it's value to console
    i = i + 1
    print('Sending ' + str(i))

    # Create a simple image
    image = np.zeros((400, 400, 3), dtype='uint8')
    green = (0, 255, 0)
    cv2.rectangle(image, (50, 50), (300, 300), green, 5)

    # Add counter value to the image and send it to the queue
    cv2.putText(image, str(i), (100, 150), cv2.FONT_HERSHEY_SIMPLEX, 2, (0, 255, 255), 4)
    sender.send_image(image_window_name, image)
    time.sleep(1)

Cool it sent pics to my screen. Heh “Hershey Simplex” sounds more like a virus than a font.