
EfficientDet

Google also has a new detection algorithm, and it looks much faster than Mask R-CNN: https://ai.googleblog.com/2020/04/efficientdet-towards-scalable-and.html

EfficientNet: Improving Accuracy and Efficiency through AutoML and Model Scaling

Of course the state of the art keeps improving, but this looks like a stepping stone.

EfficientDet

github: https://github.com/google/automl/tree/master/efficientdet

arxiv: https://arxiv.org/pdf/1911.09070.pdf

This is the S.O.T.A. The G.O.A.T. of 2020. So we’ll try it out:

https://heartbeat.fritz.ai/end-to-end-object-detection-using-efficientdet-on-raspberry-pi-3-part-2-bb5133646630

https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087

https://gilberttanner.com/categories/object-detection

The TF2 Model Zoo introduces new SOTA models such as CenterNet, ExtremeNet, and EfficientDet.

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md


Things and Stuff

INTRODUCTION: THINGS AND STUFF
Ask someone what vision is for and you may get an answer about recognizing objects. Few people will tell you that vision is about recognizing materials. Yet materials are just as important as objects are. Our world involves steel and glass, paper and plastic, food and drink, leather and lace, ice and snow, not to mention blood, sweat, and tears. Nonetheless, if you peruse the scientific literature in human and machine vision, you will find a great deal of attention paid to the problem of recognizing objects, and very little to the problem of recognizing materials. Why should this be?


Perhaps it is related to the general preference we have for talking about things rather than stuff.

In detectron2, the term “thing” is used for instance-level tasks, and “stuff” is used for semantic segmentation tasks. Both are used in panoptic segmentation.
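A quick way to see the split (a sketch; it assumes detectron2's builtin COCO panoptic metadata, which registers on import):

from detectron2.data import MetadataCatalog

meta = MetadataCatalog.get("coco_2017_val_panoptic_separated")
print(meta.thing_classes[:5])  # instance-level category names ("things")
print(meta.stuff_classes[:5])  # semantic-segmentation category names ("stuff")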


Training, Validation, Testing

I’ve annotated 20 egg images using the VGG Image Annotator (VIA).

Now I’m going to try to train a detectron2 Mask R-CNN. Seems I need to call register_coco_instances:

def register_coco_instances(name, metadata, json_file, image_root):

Args:
    name (str): the name that identifies a dataset, e.g. "coco_2014_train".
    metadata (dict): extra metadata associated with this dataset. You can leave it as an empty dict.
    json_file (str): path to the json instance annotation file.
    image_root (str or path-like): directory which contains all the images.
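
For the eggs, the call would look something like this (a minimal sketch; the paths are hypothetical and assume the VIA annotations were exported to COCO JSON):

from detectron2.data.datasets import register_coco_instances

register_coco_instances(
    "eggs_train",                      # dataset name
    {},                                # metadata: an empty dict is fine
    "datasets/eggs/annotations.json",  # COCO-format instance annotations
    "datasets/eggs/images",            # directory containing the images
)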

From here: https://tarangshah.com/blog/2017-12-03/train-validation-and-test-sets/

Training Dataset: The sample of data used to fit the model.

Validation Dataset: The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. The evaluation becomes more biased as skill on the validation dataset is incorporated into the model configuration.

Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset.
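
For the 20 egg images, a split might look like this (a sketch; the 70/15/15 ratio is just a common convention, not from the article):

import random

random.seed(0)
paths = [f"egg_{i:02d}.jpg" for i in range(20)]  # our 20 annotated images
random.shuffle(paths)
train, val, test = paths[:14], paths[14:17], paths[17:]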

Debugging…

To show an image with OpenCV, you need to follow it with cv2.waitKey():

import cv2

cv2.imshow('Eggs', vis.get_image()[:, :, ::-1])  # [:, :, ::-1] flips RGB to BGR
cv2.waitKey()  # block until a keypress

As I don’t have an NVIDIA card, I needed to set cfg.MODEL.DEVICE = 'cpu'.
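
In context, that's roughly (a sketch using detectron2's model_zoo helpers; the config file here is the one from the webcam demo, not necessarily what I trained with):

from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.DEVICE = "cpu"  # no NVIDIA card, so run on CPU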

Got some “incompatible shapes” warnings – fair enough.

Since I'm running on CPU, I needed this environment variable setting to stop it from using too much memory:

LRU_CACHE_CAPACITY=1 python3 eggid.py

Got one “training diverged” with a 0.02 learning rate. Changed it to 0.001. It freezes a lot; Ubuntu freezes if you use too much memory.

Ok, it kept freezing. Going to have to try Google Colab maybe, or maybe limit Python's memory use. But that would presumably just result in “Memory Error” instead, which is only slightly less annoying than the computer freezing.

Some guy did object detection, with bounding boxes: https://colab.research.google.com/drive/1BRiFBC06OmWNkH4VpPl8Sf7IT21w7vXr https://www.mrdbourke.com/airbnb-amenity-detection/

Ok, I tried again with Roboflow, but it seems they only support bounding box training, and not the segmentation training I want.

Let’s try training bounding box object detection on the egg dataset…

[09/18 22:53:15 d2.evaluation.coco_evaluation]: Preparing results for COCO format …
[09/18 22:53:15 d2.evaluation.coco_evaluation]: Saving results to ./output/coco_instances_results.json
[09/18 22:53:15 d2.evaluation.coco_evaluation]: Evaluating predictions with unofficial COCO API…
Loading and preparing results…
DONE (t=0.00s)
creating index…
index created!
Running per image evaluation…
Evaluate annotation type bbox
COCOeval_opt.evaluate() finished in 0.00 seconds.
Accumulating evaluation results…
COCOeval_opt.accumulate() finished in 0.01 seconds.
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.595
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.857
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.528
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.501
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.340
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.559
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.469
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.642
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.642
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.500
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.362
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.633
[09/18 22:53:15 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
|   AP   |  AP50  |  AP75  |  APs   |  APm   |  APl   |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 59.493 | 85.721 | 52.829 | 50.099 | 34.010 | 55.924 |
[09/18 22:53:15 d2.evaluation.coco_evaluation]: Per-category bbox AP:
| category   | AP    | category   | AP     | category   | AP     |
|:-----------|:------|:-----------|:-------|:-----------|:-------|
| Eggs       | nan   | chicken    | 90.000 | egg        | 28.987 |

So I think the training worked, perhaps, on the bounding boxes? Kinda hard to say without seeing it draw some boxes. Not entirely sure what all these APs mean, but they're variants of “Average Precision”: AP is averaged over IoU thresholds 0.50 to 0.95, AP50 and AP75 fix the IoU threshold, and APs/APm/APl split by object size. Details: https://cocodataset.org/#detection-eval

So now, let’s do Google Open Images based training instead. It has a ‘Chicken’ subset, so that’s ideal. So I installed https://pypi.org/project/openimages/ and ran some Python:

from openimages.download import download_dataset
download_dataset("/media/chrx/0FEC49A4317DA4DA/openimages", ["Chicken"], annotation_format="pascal")

Ack this is only bounding boxes too.

Looks like https://pypi.org/project/oidv6/ is another open images downloader script.

Detectron2 needs COCO format, so converting from Pascal VOC to COCO… ?

I looked at this, https://github.com/roboflow-ai/voc2coco – nope, that’s bounding boxes only.

This looks like it might be the biggest format conversion app I’ve found, OpenVINO™ Toolkit

https://docs.openvinotoolkit.org/latest/omz_tools_accuracy_checker_accuracy_checker_annotation_converters_README.html

Ah, it’s got a ™ though because it’s a huge set of software, and I’m running this on a potato. Not an option.

Ok the search continues.

Ok I found https://medium.com/@nicolas.windt/how-to-download-a-subset-of-open-image-dataset-v6-on-ubuntu-using-the-shell-c55336e33b03 and it’s for bounding boxes too, but it is dealing directly with the Google files, so we can probably adjust the commands to parse the segmentation data.

We can find that the 'Chicken' category is represented by /m/09b5t:

wget https://storage.googleapis.com/openimages/v5/class-descriptions-boxable.csv

/m/09b5t,Chicken
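
(Or just grep the CSV; assuming 'Chicken' matches a single class:)

grep Chicken class-descriptions-boxable.csv
# /m/09b5t,Chicken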

I would prefer to get instance segmentation training working than bounding box training. But it looks like it’s gonna be a bit harder than anticipated.

At this point, we can download google open images, with some bounding box annotations in the OIDv6 format, and scale them down to 300×300 or similar. We can also get it in Pascal VOC format.

I’ve just set up a user on a friend’s server, and I followed the @nicolas.windt article.

Do I:

a) try to get Google TensorFlow's object detection working, as described in @nicolas.windt's article?

Traceback (most recent call last):
File "/home/danielb/work/models/research/object_detection/dataset_tools/create_oid_tf_record.py", line 45, in
from object_detection.dataset_tools import oid_tfrecord_creation
ImportError: No module named object_detection.dataset_tools

pip install tensorflow-object-detection-api

File "/home/danielb/work/models/research/object_detection/dataset_tools/create_oid_tf_record.py", line 110, in main
image_annotations, label_map, encoded_image)
File "/root/anaconda3/envs/tfRecords/lib/python2.7/site-packages/object_detection/dataset_tools/oid_tfrecord_creation.py", line 43, in tf_example_from_annotations_data_frame
annotations_data_frame.LabelName.isin(label_map)]
File "/root/anaconda3/envs/tfRecords/lib/python2.7/site-packages/pandas/core/generic.py", line 3614, in getattr
return object.getattribute(self, name)
AttributeError: 'DataFrame' object has no attribute 'LabelName'

This has to do with pandas not finding the column it wants (the DataFrame has no LabelName column).

---

I'm trying with Python 3.8 now, and had to change as_matrix to to_numpy (it was deprecated), and had to change package names to tf.io.xxx.



Now:

File "/root/anaconda3/lib/python3.8/site-packages/object_detection/dataset_tools/oid_tfrecord_creation.py", line 71, in tf_example_from_annotations_data_frame
dataset_util.bytes_feature('{}.jpg'.format(image_id)),
File "/root/anaconda3/lib/python3.8/site-packages/object_detection/utils/dataset_util.py", line 30, in bytes_feature
return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
TypeError: '000411001ff7dd4f.jpg' has type str, but expected one of: bytes



So it needs a to-bytes sort of thing. [b'a', b'b'] is what Stack Overflow came up with. So it needs [b'000411001ff7dd4f.jpg'] instead of ['000411001ff7dd4f.jpg'].

"Convert string to bytes" 
looks like 
b = mystring.encode()

So, the current implementation is:

def bytes_feature(value):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))

"Python string encoding is different in Python 2.7 vs 3.6 and it break Tensorflow."

"Hi, where i use encode() ?"  
 - in the https://github.com/tensorflow/models/issues/1597


ok... it's failing here:

standard_fields.TfExampleFields.filename: dataset_util.bytes_feature('{}.jpg'.format(image_id)),


ok, and if I use value=value.encode():

TypeError: 48 has type int, but expected one of: bytes

(Ah, ASCII 48 is '0' from '000411001ff7dd4f', so not that.)


and value=[value.encode()] gets
AttributeError: 'bytes' object has no attribute 'encode'
...
but without .encode(),
TypeError: '000411001ff7dd4f.jpg' has type str, but expected one of: bytes


and the data is

feature_map = {
    standard_fields.TfExampleFields.object_bbox_ymin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmin:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMin.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_ymax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.YMax.to_numpy()),
    standard_fields.TfExampleFields.object_bbox_xmax:
        dataset_util.float_list_feature(
            filtered_data_frame_boxes.XMax.to_numpy()),
    standard_fields.TfExampleFields.object_class_text:
        dataset_util.bytes_list_feature(
            filtered_data_frame_boxes.LabelName.to_numpy()),
    standard_fields.TfExampleFields.object_class_label:
        dataset_util.int64_list_feature(
            filtered_data_frame_boxes.LabelName.map(
                lambda x: label_map[x]).to_numpy()),
    standard_fields.TfExampleFields.filename:
        dataset_util.bytes_feature('{}.jpg'.format(image_id)),
    standard_fields.TfExampleFields.source_id:
        dataset_util.bytes_feature(image_id),
    standard_fields.TfExampleFields.image_encoded:
        dataset_util.bytes_feature(encoded_image),
}


and the input file looks like... 

ImageID,Source,LabelName,Confidence,XMin,XMax,YMin,YMax,IsOccluded,IsTruncated,IsGroupOf,IsDepiction,IsInside
00e71a70a2f669ff,xclick,/m/09b5t,1,0.18049793,0.95435685,0.056603774,0.9638365,0,1,0,0,0
01463f5494340d3d,xclick,/m/09b5t,1,0,0.59791666,0.2125,0.965625,0,0,0,0,0

ok screw it.  stackoverflow time.
https://stackoverflow.com/questions/64072148/typeerror-has-type-str-but-expected-one-of-bytes

Looks like it's a current bug: https://github.com/tensorflow/models/issues/7997

Ok, turns out I had actually worked it out yesterday with .encode('utf-8'), but it went on to hit the same bug on the next line.
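
Roughly, the fix that worked (a sketch; the isinstance guard is my addition, since some values arrive already as bytes):

def bytes_feature(value):
    # TF's BytesList only accepts bytes; Python 3 strings must be encoded first.
    if isinstance(value, str):
        value = value.encode('utf-8')
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))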

Ok now it generated some TFRecords.

So now we can train it...






As explained here: https://towardsdatascience.com/custom-object-detection-using-tensorflow-from-scratch-e61da2e10087

The models directory came with a notebook file (.ipynb) that we can use to get inference with a few tweaks. It is located at models/research/object_detection/object_detection_tutorial.ipynb. Follow the steps below to tweak the notebook:

  1. MODEL_NAME = 'ssd_mobilenet_v2_coco_2018_03_29'
  2. PATH_TO_CKPT = 'path/to/your/frozen_inference_graph.pb'
  3. PATH_TO_LABELS = 'models/annotations/label_map.pbtxt'
  4. NUM_CLASSES = 1
  5. Comment out cell #5 completely (just below Download Model)
  6. Since we’re only testing on one image, comment out PATH_TO_TEST_IMAGES_DIR and TEST_IMAGE_PATHS in cell #9 (just below Detection)
  7. In cell #11 (the last cell), remove the for-loop, unindent its content, and add path to your test image:

imagepath = 'path/to/image_you_want_to_test.jpg'

After following the steps, run the notebook and you should see the corgi in your test image highlighted by a bounding box!
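
For reference, the unrolled last cell ends up looking something like this (a sketch built on the TF1 tutorial notebook's own helpers, load_image_into_numpy_array and run_inference_for_single_image, which may differ between notebook versions):

image_path = 'path/to/image_you_want_to_test.jpg'
image_np = load_image_into_numpy_array(Image.open(image_path))
output_dict = run_inference_for_single_image(image_np, detection_graph)
vis_util.visualize_boxes_and_labels_on_image_array(
    image_np,
    output_dict['detection_boxes'],
    output_dict['detection_classes'],
    output_dict['detection_scores'],
    category_index,
    use_normalized_coordinates=True,
    line_thickness=8)
plt.imshow(image_np)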


or

b) Install PyTorch and detectron2 (I keep thinking “deceptron2”), convert OIDv6 or Pascal VOC formats to COCO format (or rsync the egg data files over to the new machine), and train Mask R-CNN, like with the eggs dataset? (I am using my friend’s server because my laptop can’t handle the training. Keeps freezing.)

or

c) Get EfficientDet running: Strangely, https://github.com/google/automl only contains EfficientDet. Is that AutoML? EfficientDet? Surely not. Odd.

Ok…

At this point I’m ok with just trying to get anything working. Bounding boxes, ok. After an hour of just looking at options, probably b).

Ended up doing a). Seems Google just got TensorFlow 2’s Object Detection API working recently: https://blog.tensorflow.org/2020/07/tensorflow-2-meets-object-detection-api.html

TF2 ships with Keras as its high-level API. From what I can tell so far, the main difference between TF2 and PyTorch is that PyTorch lets you modify the neural architecture at runtime, while TF2’s Keras gives you an elegant way to describe a network architecture in code.
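
The Keras style looks something like this (a toy classifier sketch, unrelated to the detection models above):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation='relu', input_shape=(300, 300, 3)),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(2, activation='softmax'),  # e.g. chicken vs egg
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')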

So, one thing to note, is that when I decide to attempt object segmentation again, the process will probably follow @nicolas.windt’s tutorial but with this file instead (for train- and test- and validate-). https://storage.googleapis.com/openimages/v5/train-annotations-object-segmentation.csv
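
A sketch of what that filtering might look like (assuming the segmentation CSV shares the LabelName column with the box CSVs):

import pandas as pd

df = pd.read_csv("train-annotations-object-segmentation.csv")
chickens = df[df["LabelName"] == "/m/09b5t"]  # the 'Chicken' class
chickens.to_csv("chicken-segmentation.csv", index=False)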

For now, got the images, and will try train with the TF2 OD-API, starting with one of the models in the zoo: https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md

https://www.kaggle.com/ronyroy/10-mins-or-less-training-and-inference-tf2-object

https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/configuring_jobs.md

Ok, so let’s try editing a config:

model {
  (... Add model config here...)
}

train_config: {
  (... Add train_config here...)
}

train_input_reader: {
  (... Add train_input configuration here...)
}

eval_config: {
}

eval_input_reader: {
  (... Add eval_input configuration here...)
}
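
The train_input_reader is the part that points at the label map and the TFRecords generated earlier (paths here are hypothetical):

train_input_reader: {
  label_map_path: "annotations/label_map.pbtxt"
  tf_record_input_reader {
    input_path: "data/chicken_train.record"
  }
}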


Annotating COCO

https://github.com/jsbroks/coco-annotator
https://github.com/drainingsun/ybat

https://github.com/visipedia/annotation_tools


https://www.simonwenkel.com/2019/07/19/list-of-annotation-tools-for-machine-learning-research.html

Roboflow seems to be a company with LabelImg and CVAT projects for annotating, but it sounds like it saves as VOC, and then you run a script to make a COCO JSON out of the VOC XML.

https://towardsdatascience.com/how-to-train-detectron2-on-custom-object-detection-data-be9d1c233e4

Going to try to work around that using Roboflow, though, and save directly to COCO somehow. They have “pre-processing” (resize) and “augmentation” (flip the pictures around every which way to generate more data).

Good video about COCO format: https://www.immersivelimit.com/tutorials/create-coco-annotations-from-scratch
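
The gist of the format, as a bare-bones Python dict (trimmed to the fields that matter for instance segmentation):

coco = {
    "images": [
        {"id": 1, "file_name": "egg1.jpg", "width": 300, "height": 300},
    ],
    "annotations": [
        {
            "id": 1,
            "image_id": 1,
            "category_id": 1,
            "bbox": [10, 20, 50, 40],  # [x, y, width, height]
            "segmentation": [[10, 20, 60, 20, 60, 60, 10, 60]],  # polygon: x,y pairs
            "area": 2000,
            "iscrowd": 0,
        },
    ],
    "categories": [{"id": 1, "name": "egg"}],
}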

This looks good: https://gitlab.com/vgg/via http://www.robots.ox.ac.uk/~vgg/software/via/

maybe this https://github.com/wkentaro/labelme (as suggested here: https://www.dlology.com/blog/how-to-create-custom-coco-data-set-for-instance-segmentation/)

Ok, VIA VGG.

Ok it turns out version 2 is a lot better than version 3:

via-master/via-2.x.y/src/index.html

Working pretty well. Annotation is a bit confusing.

After watching a YouTube video, the trick is to name an attribute, like ‘type’, then add options for ‘egg’ and ‘chicken’, and set the input type to dropdown. Then you can set the type attribute by clicking on a shape and selecting ‘egg’ or ‘chicken’.

Here’s a paper on building a Photoshop-style magic selector for human annotators: https://arxiv.org/pdf/1903.10830.pdf

Also found “OpenLabeling” https://github.com/Cartucho/OpenLabeling which looks pretty good.


CV Datasets

Google’s Open Images: https://storage.googleapis.com/openimages/web/index.html

OOOH Chicken segmentation dataset: https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&c=%2Fm%2F09b5t&r=false

https://public.roboflow.com/

Raccoon dataset: https://public.roboflow.com/object-detection/raccoon

COCO Stuff: https://github.com/nightrome/cocostuff

Good overview:
https://www.xailient.com/post/collecting-data-for-custom-object-detection

  • ImageNet
  • COCO
  • Open Images
  • MNIST
  • Cityscapes

AdelaiDet

https://github.com/aim-uofa/AdelaiDet/

AdelaiDet is an open source toolbox for multiple instance-level recognition tasks on top of Detectron2. All instance-level recognition works from our group are open-sourced here.

To date, AdelaiDet implements a number of algorithms, FCOS and BlendMask among them (see the repo for the full list).

Looks like the University of Adelaide is a hot spot for AI in Australia.


Egg ID

This is a notably relevant paper from 2019 that appears to keep track of eggs:

“Our custom SSD object detection and classification model classified when chickens and eggs were detected by the video camera. Our models can label video frames with classifications for 8 breeds of chickens and 4 colors of eggs, with 98% accuracy on chickens or eggs alone and 82.5% accuracy while detecting both types of objects.”


“Tuned accuracy is needed for proper thresholding of object detection”

https://scholar.smu.edu/cgi/viewcontent.cgi?article=1073&context=datasciencereview (https://scholar.smu.edu/datasciencereview/vol2/iss1/20/)

Also interesting:

Factors Affecting Egg Production in Backyard Chicken Flocks

https://edis.ifas.ufl.edu/pdffiles/ps/ps02900.PDF


COCO, ShapeNet, Pix3d

These are some example datasets, each useful for different reasons:

https://cocodataset.org/

https://www.shapenet.org/about

http://pix3d.csail.mit.edu/

Torchvision:

torchvision.datasets
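
torchvision can also load COCO-format data directly (a sketch; the paths are hypothetical and it needs pycocotools installed):

from torchvision import datasets, transforms

coco = datasets.CocoDetection(
    root="datasets/eggs/images",               # image directory
    annFile="datasets/eggs/annotations.json",  # COCO JSON
    transform=transforms.ToTensor(),
)
img, targets = coco[0]  # an image tensor and its list of annotation dicts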


Mesh R-CNN

This https://github.com/facebookresearch/meshrcnn is maybe getting closer to the holy grail, in my mind. I like the idea of bridging the gap between simulation and reality in the other direction too, by converting the world into object meshes. Real2Sim.

The OpenAI Rubik’s cube hand policy transfer was done with a camera in simulation and a camera in the real world. This could allow a sort of dreaming, i.e., running simulations on newly captured 3D obj data.

It could acquire data that it could mull over, when chickens are asleep.

PyTorch3d: https://arxiv.org/pdf/2007.08501.pdf

Pixel2Mesh: Generating 3D Mesh Models
from Single RGB Images https://arxiv.org/pdf/1804.01654.pdf

Remember Hinton’s dark knowledge. The trick is having a few models distill into one.

In trying to get Mesh R-CNN working, I had to set the device to CPU in the config (MODEL.DEVICE cpu, same as with detectron2).

python3 demo/demo.py --config-file configs/pix3d/meshrcnn_R50_FPN.yaml --input /home/chrx/Downloads/chickenegg.jpg --output output_demo --onlyhighest MODEL.WEIGHTS meshrcnn://meshrcnn_R50.pth

Success! It’s a chair.

There’s no chicken category in Pix3d. But getting closer. Just need a chicken and egg dataset.

Downloading Blender again, to check out the .obj file that was generated. Ok, Blender doesn’t want to show it, but here’s a handy site to view OBJ files: https://3dviewer.net/. The issue in Blender required selecting the obj, then View > Frame Selected to make it zoom in. Switching from perspective to orthographic view also helps.

Chair is a pretty adaptable class.


Detectron2

Ran through the nice working jupyter notebook https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=OpLg_MAQGPUT and produced this video

It implements the Mask R-CNN algorithm (the one many people know from matterport’s Keras port), but as Facebook AI’s own, better-maintained codebase. The notebook was forked and fixed up for tourists.

We can train it on the robot eye view camera, maybe train it on google images of copyleft chickens and eggs.

I think this looks great, for endowing the robot with a basic “recognition” of the features of classes it’s been exposed to.

https://github.com/facebookresearch/detectron2/tree/master/projects

https://detectron2.readthedocs.io/tutorials/extend.html

Seems I was oblivious to Facebook AI, but of course they hire very smart people. I’d sell my soul for $240k/yr too. It is super nice to get a working Jupyter Notebook. Thank you. https://ai.facebook.com/blog/-detectron2-a-pytorch-based-modular-object-detection-library-/

Here are the other FB projects using detectron2, copy-pasted:

Projects by Facebook

Note that these are research projects, and therefore may not have the same level of support or stability as detectron2.

External Projects

External projects in the community that use detectron2:

Also, more generally, https://ai.facebook.com/research/#recent-projects

Errors encountered while attempting to install https://detectron2.readthedocs.io/tutorials/getting_started.html

File "demo.py", line 8, in
import tqdm
ImportError: No module named tqdm

pip3 uninstall tqdm
pip3 install tqdm

Ok so…

python3 -m pip install -e .

python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

Requires pyyaml>=5.1

ok

pip install pyyaml==5.1
 Successfully built pyyaml
Installing collected packages: pyyaml
Attempting uninstall: pyyaml
Found existing installation: PyYAML 3.12
ERROR: Cannot uninstall 'PyYAML'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.

pip3 install --ignore-installed PyYAML
Successfully installed PyYAML-5.1

Next error...

ModuleNotFoundError: No module named 'torchvision'

pip install torchvision

Next error...

AssertionError:
Found no NVIDIA driver on your system. Please check that you
have an NVIDIA GPU and installed a driver from
http://www.nvidia.com/Download/index.aspx


ok

python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu


[08/17 20:53:11 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=None, opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl', 'MODEL.DEVICE', 'cpu'], output=None, video_input=None, webcam=True)
[08/17 20:53:12 fvcore.common.checkpoint]: Loading checkpoint from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:53:12 fvcore.common.file_io]: Downloading https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
[08/17 20:53:12 fvcore.common.download]: Downloading from https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl …
model_final_f10217.pkl: 178MB [01:26, 2.05MB/s]
[08/17 20:54:39 fvcore.common.download]: Successfully downloaded /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl. 177841981 bytes.
[08/17 20:54:39 fvcore.common.file_io]: URL https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl cached in /root/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[08/17 20:54:39 fvcore.common.checkpoint]: Reading a file from 'Detectron2 Model Zoo'
0it [00:00, ?it/s]/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)
0it [00:06, ?it/s]
Traceback (most recent call last):
File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'


Ok...

pip install opencv-python

Requirement already satisfied: opencv-python in /usr/local/lib/python3.6/dist-packages (4.2.0.34)

Looks like 4.3.0 vs 4.2.0.34 kinda thing


sudo apt-get install libopencv-*


nope...

/opt/detectron2/detectron2/layers/wrappers.py:226: UserWarning: This overload of nonzero is deprecated:
nonzero()
Consider using one of the following signatures instead:
nonzero(*, bool as_tuple) (Triggered internally at /pytorch/torch/csrc/utils/python_arg_parser.cpp:766.)
return x.nonzero().unbind(1)


def nonzero_tuple(x):
    """
    A 'as_tuple=True' version of torch.nonzero to support torchscript.
    because of https://github.com/pytorch/pytorch/issues/38718
    """
    if x.dim() == 0:
        return x.unsqueeze(0).nonzero().unbind(1)
    return x.nonzero(as_tuple=True).unbind(1)

AttributeError: 'tuple' object has no attribute 'unbind'


https://github.com/pytorch/pytorch/issues/38718

FFS. Why does nothing ever fucking work?
pytorch 1.6:
"putting 1.6.0 milestone for now; this isn't the worst, but it's a pretty bad user experience."

Yeah no shit.

let's try...

return x.nonzero(as_tuple=False).unbind(1)

Ok, next error, same thing, at:

/opt/detectron2/detectron2/modeling/roi_heads/fast_rcnn.py:111

Ok... back to this error (after adding as_tuple=False in both places):


 File "demo.py", line 118, in
cv2.namedWindow(WINDOW_NAME, cv2.WINDOW_NORMAL)
cv2.error: OpenCV(4.3.0) /io/opencv/modules/highgui/src/window.cpp:634: error: (-2:Unspecified error) The function is not implemented. Rebuild the library with Windows, GTK+ 2.x or Cocoa support. If you are on Ubuntu or Debian, install libgtk2.0-dev and pkg-config, then re-run cmake or configure script in function 'cvNamedWindow'

Decided to check if maybe this is a conda vs pip thing. Like maybe I just need to install the conda version instead?

But it looks like GTK+ 2.x isn’t installed. Seems I installed OpenCV using pip, i.e. pip install opencv-contrib-python, and that build doesn’t include GTK+ 2.x. I could also use Qt as the graphical interface.

“GTK supposedly uses more memory because GTK provides more functionality. Qt does less and uses less memory. If that is your logic, then you should also look at Aura and the many other user interface libraries providing less functionality.” (link)

https://stackoverflow.com/questions/14655969/opencv-error-the-function-is-not-implemented

https://askubuntu.com/questions/913241/error-in-executing-opencv-in-ubuntu

So let’s make a whole new Chapter, because we’re installing OpenCV again! (Why? Because I want to try run the detectron2 demo.py file.)

pip3 uninstall opencv-python
pip3 uninstall opencv-contrib-python 

(or sudo apt-get remove ___)

and afterwards build the OpenCV package from source, from GitHub:

git clone https://github.com/opencv/opencv.git

cd ~/opencv

mkdir release

cd release

cmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/usr/local -D WITH_TBB=ON -D BUILD_NEW_PYTHON_SUPPORT=ON -D WITH_V4L=ON -D INSTALL_C_EXAMPLES=ON -D INSTALL_PYTHON_EXAMPLES=ON -D BUILD_EXAMPLES=ON -D WITH_QT=ON -D WITH_GTK=ON -D WITH_OPENGL=ON ..

make

sudo make install

ok… pls…

python3 demo.py --config-file ../configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --webcam --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl MODEL.DEVICE cpu

sweet jaysus finally.

Here’s an image of the network from a Medium article on R-CNN: https://medium.com/@hirotoschwert/digging-into-detectron-2-47b2e794fabd
