FB really likes detecting things. I went with their PyTorch version. The matterport version didn’t work out of the box, so I went with FB’s code to try image segmentation.
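As a rough sketch of what that looks like (assuming the FB code in question is Detectron2 with a COCO-pretrained Mask R-CNN; the input filename here is made up), inference is only a few lines:

```python
# Minimal Detectron2 instance-segmentation sketch (assumes Detectron2 is
# installed and "input.jpg" exists; model choice is an assumption).
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # confidence cutoff

predictor = DefaultPredictor(cfg)
im = cv2.imread("input.jpg")
outputs = predictor(im)

# Per-instance class IDs and binary masks
print(outputs["instances"].pred_classes)
print(outputs["instances"].pred_masks.shape)
```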
Segmenting an image by the watershed transformation is therefore a two-step process (a minimal OpenCV sketch follows the list):
* Finding the markers and the segmentation criterion (the criterion or function that will be used to split the regions; most often it is the contrast or gradient, but not necessarily).
* Performing a marker-controlled watershed with these two elements.
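Here is a minimal sketch of those two steps with OpenCV, assuming an input file named "scene.jpg" and illustrative threshold values, not a tuned pipeline:

```python
# Marker-controlled watershed with OpenCV (based on the standard tutorial recipe).
import cv2
import numpy as np

img = cv2.imread("scene.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Step 1: find markers (sure foreground / sure background); the image
# gradient/contrast acts as the flooding criterion.
_, thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
kernel = np.ones((3, 3), np.uint8)
sure_bg = cv2.dilate(thresh, kernel, iterations=3)
dist = cv2.distanceTransform(thresh, cv2.DIST_L2, 5)
_, sure_fg = cv2.threshold(dist, 0.5 * dist.max(), 255, 0)
sure_fg = np.uint8(sure_fg)
unknown = cv2.subtract(sure_bg, sure_fg)

_, markers = cv2.connectedComponents(sure_fg)
markers = markers + 1          # background label becomes 1, not 0
markers[unknown == 255] = 0    # unknown region is 0, to be flooded

# Step 2: marker-controlled watershed; boundaries come back labeled -1.
markers = cv2.watershed(img, markers)
img[markers == -1] = (0, 0, 255)
cv2.imwrite("segmented.jpg", img)
```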
Please have a look at Chapter 4.3 of the DSO paper, in particular Figure 20 (Geometric Noise). Direct approaches suffer a LOT from bad geometric calibrations: geometric distortions of 1.5 pixels already reduce the accuracy by a factor of 10.
Do not use a rolling shutter camera; the geometric distortions from a rolling shutter are huge, even at high frame rates (over 60 fps).
Note that the reprojection RMSE reported by most calibration tools is the reprojection RMSE on the “training data”, i.e., overfitted to the images you used for calibration. If it is low, that does not imply that your calibration is good; you may just have used too few images.
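One way to sanity-check this is to compute the reprojection RMSE on calibration images the optimizer never saw. A hedged OpenCV sketch, assuming you already detected checkerboard corners separately for a “train” and a “test” image set:

```python
# Evaluate a calibration on held-out views instead of the training RMSE.
import cv2
import numpy as np

def reprojection_rmse(obj_pts, img_pts, K, dist):
    """RMSE of reprojecting 3D board points into held-out views."""
    errs = []
    for objp, imgp in zip(obj_pts, img_pts):
        # Estimate the board pose in this held-out view, then reproject.
        ok, rvec, tvec = cv2.solvePnP(objp, imgp, K, dist)
        proj, _ = cv2.projectPoints(objp, rvec, tvec, K, dist)
        errs.append(np.linalg.norm(
            proj.reshape(-1, 2) - imgp.reshape(-1, 2), axis=1))
    return float(np.sqrt(np.mean(np.concatenate(errs) ** 2)))

# Calibrate on the training images only...
# ret, K, dist, _, _ = cv2.calibrateCamera(train_obj, train_img, image_size, None, None)
# ...then report the error on images that were not used for calibration:
# print("held-out RMSE:", reprojection_rmse(test_obj, test_img, K, dist))
```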
Try different camera / distortion models; not all lenses can be modelled by all of them.
DSO cannot do magic: if you rotate the camera too much without translation, it will fail. Since it is pure visual odometry, it cannot recover by re-localizing, or track through strong rotations by using previously triangulated geometry; everything that leaves the field of view is marginalized immediately.
Currently we have LSD-SLAM working, and that’s cool for us humans to see stuff, but having an object mesh to work with makes more sense. I don’t know if there’s really any difference, but at least in terms of simulator integration, this makes sense. I’m thinking there’s object detection, semantic segmentation, etc., and in the end I want the robot to have a relative coordinate system, in a way. But robots will probably get by with just pixels and stochastic magic.
But the big idea for me, here, is to transform monocular camera images into mesh objects. Those .obj files, or whatever format, could be imported into the physics engine for training in simulation.
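For example, importing such a mesh could look like this minimal sketch, assuming PyBullet as the physics engine and a hypothetical reconstructed_object.obj (this is not the project's actual pipeline):

```python
# Load a reconstructed .obj mesh into a PyBullet simulation.
import pybullet as p
import pybullet_data

p.connect(p.DIRECT)  # headless; use p.GUI for a window
p.setAdditionalSearchPath(pybullet_data.getDataPath())
p.setGravity(0, 0, -9.8)
p.loadURDF("plane.urdf")

# "reconstructed_object.obj" is a placeholder for the mesh produced upstream.
col = p.createCollisionShape(p.GEOM_MESH, fileName="reconstructed_object.obj")
vis = p.createVisualShape(p.GEOM_MESH, fileName="reconstructed_object.obj")
body = p.createMultiBody(baseMass=1.0,
                         baseCollisionShapeIndex=col,
                         baseVisualShapeIndex=vis,
                         basePosition=[0, 0, 0.5])

for _ in range(240):  # step one simulated second at 240 Hz
    p.stepSimulation()
```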
What is a ROS topic? http://wiki.ros.org/Topics ROS can publish the webcam stream to a “topic”, and any part of the robot can subscribe to it by name if it is interested in that data. ROS is almost like a program where everything is a global variable.
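A minimal sketch of that publish/subscribe idea, assuming ROS1 with rospy (the node name and /chatter topic are made up for illustration):

```python
# One node that both publishes to and subscribes to the same topic by name.
import rospy
from std_msgs.msg import String

def callback(msg):
    # Any node subscribed to /chatter receives this, like reading a global.
    rospy.loginfo("heard: %s", msg.data)

def main():
    rospy.init_node("demo_node")
    pub = rospy.Publisher("/chatter", String, queue_size=10)
    rospy.Subscriber("/chatter", String, callback)
    rate = rospy.Rate(1)  # 1 Hz
    while not rospy.is_shutdown():
        pub.publish(String(data="hello"))
        rate.sleep()

if __name__ == "__main__":
    main()
```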
On the imagehub side, it finds /root/imagenode.yaml and sets up a folder.
Then on the imagenode side, it looks for a directory structure with imagenode/, imagezmq/, and imagenode.yaml in the parent folder. You replace the contents of imagenode.yaml with one of the YAML examples in the tests folder.
Then when it detects motion in the blue box, it takes pics that arrive in ~/imagehub_data/images/2020-04-12#
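Under the hood, imagenode and imagehub talk over imagezmq. A hedged sketch of that transport (the hub address, camera index, and hostname-as-sender-name are assumptions, not the project’s exact config):

```python
# --- sender (imagenode side): push frames, wait for the hub's ack ---
import socket
import cv2
import imagezmq

sender = imagezmq.ImageSender(connect_to="tcp://hub_ip:5555")  # placeholder host
node_name = socket.gethostname()
cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    reply = sender.send_image(node_name, frame)  # blocks until the hub replies

# --- receiver (imagehub side), run separately ---
# import imagezmq
# hub = imagezmq.ImageHub()            # binds tcp://*:5555 by default
# while True:
#     name, image = hub.recv_image()
#     hub.send_reply(b"OK")            # REQ/REP pattern: ack each frame
```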